Note on EM-training of IBM-model 1


INF5820 Language Technological Applications, Fall

The slides on this subject (inf58 6.pdf) including the example seem insufficient to give a good grasp of what is going on. Hence here are some supplementary notes with more details. Hopefully they make things clearer.

The main idea

There are two main items involved:
- Translation probabilities
- Word alignments

The translation probabilities are assigned to the bilingual lexicon: for a pair of words (e, f) in the lexicon, how probable is it that e gets translated as f, expressed by t(f|e). Beware, this is calculated from the whole corpus; we do not consider these probabilities for a single sentence.

A word alignment is assigned to a pair of sentences (e, f). (We are using bold face to indicate that e is a string (array) of words e_1, e_2, ..., e_k, etc.) When we have a parallel corpus where the sentences are sentence aligned, which may be expressed by (e_1, f_1), (e_2, f_2), ..., (e_m, f_m), we are considering the alignment of each sentence pair individually. Ideally, we are looking for the best alignment of each sentence. But as we do not know it, we will instead consider the probability of the various alignments of the sentence. For each sentence, the probabilities of the various alignments must add to 1.

The EM-training then goes as follows.

1. Initializing
a. We start with initializing t. When we don't have other information, we initialize t uniformly. That is, t(f|e) = 1/s, where s is the number of F-words in the lexicon.
b. For each sentence in the corpus, we estimate the probability distribution for the various alignments of the sentence. This is done on the basis of t, and should reflect t: for example, if t(f_i|e_k) is, say, twice t(f_j|e_k), then alignments which align f_i to e_k should be twice as probable as those which align f_j to e_k. (Well, actually, in round 1 this is trivial, since all alignments are equally likely when we start with a uniform t.)

2. Next round
a. We count how many times a word e is translated as f on the basis of the probability distributions for the sentences. This is a fractional count. Given a sentence pair (e_j, f_j): if e occurs in e_j and f occurs in f_j, we consider the alignments which align f to e. Given such an alignment a, we consider its probability P(a), and from this alignment we count that e is translated as f P(a) many times. For example, if P(a) is, say, 0.2, we will add 0.2 to the count of how many times e is translated as f. After we have done this for all alignments of all the sentences, we can recalculate t.

The notation in Koehn's book for the different counts and measures is not stellar, but as we adopted the same notation in the slides, we will stick to it to make the similarities transparent. Koehn uses the notation c(f|e) for the fractional count of the pair (e, f) in a particular sentence. To make it clear that it is the count in the specific sentence pair (e, f), he also uses the notation c(f|e; f, e). To indicate the fractional count of the word (type) pair (e, f) over the whole corpus, he uses

Σ_{(f,e)} c(f|e; f, e)

(i.e. we add the fractional counts for all the sentences). An alternative notation for the same would have been Σ_{i=1..m} c(f|e; f_i, e_i), given there are m sentences in the corpus. We introduced the notation tc, for total count, for this on the slides:

tc(f|e) = Σ_{(f,e)} c(f|e; f, e)

The re-estimated translation probability can then be calculated from this:

t(f|e) = Σ_{(f,e)} c(f|e; f, e) / Σ_{f'} Σ_{(f,e)} c(f'|e; f, e) = tc(f|e) / Σ_{f'} tc(f'|e)

Here f' varies over all the F-words in the lexicon.
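If you want to see the whole round as a procedure, here is a minimal Python sketch (the function and variable names are my own, not from the slides or Koehn): it enumerates all alignments of each sentence pair, normalizes their scores into probabilities (step b), collects the fractional counts (step a), and re-estimates t.

    from itertools import product
    from collections import defaultdict

    def em_round_explicit(corpus, t):
        # corpus: list of (e, f) sentence pairs, each e starting with "NULL";
        # t: dict from (f-word, e-word) to the current t(f|e), covering all
        # co-occurring pairs
        counts = defaultdict(float)   # tc(f|e), summed over the whole corpus
        totals = defaultdict(float)   # sum over f' of tc(f'|e)
        for e, f in corpus:
            # step b: score every alignment and normalize per sentence
            alignments = list(product(range(len(e)), repeat=len(f)))
            scores = []
            for a in alignments:
                p = 1.0
                for j, i in enumerate(a):      # f-word j aligned to e-word i
                    p *= t[(f[j], e[i])]
                scores.append(p)
            z = sum(scores)                    # so alignment probabilities sum to 1
            # step a: c(f|e; f, e) = sum of the probabilities of the
            # alignments that align f to e
            for a, p in zip(alignments, scores):
                for j, i in enumerate(a):
                    counts[(f[j], e[i])] += p / z
                    totals[e[i]] += p / z
        # re-estimate: t(f|e) = tc(f|e) / sum over f' of tc(f'|e)
        return {pair: c / totals[pair[1]] for pair, c in counts.items()}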

b. With these new translation probabilities, we may return to the alignments, and for each sentence estimate the best probability distribution over the possible alignments. This time there is no simple way as there was in round 1. For each alignment, we calculate a probability on the basis of t, and normalize to make sure that the probabilities for each sentence add up to 1.

3. Next round
a. We go about exactly as in step 2a. On the basis of the alignment probabilities estimated in step 2b, we may now calculate new translation probabilities t,
b. and on the basis of the translation probabilities estimate new alignment probabilities. And so we may repeat the two steps as long as we like.

Properties

What is nice with this algorithm is:
- We can prove that the result gets better (or stays the same) after each round. It never deteriorates.
- The result converges towards a local optimum.
- For IBM model 1 (but not in general) this local optimum is also a global optimum.

The fast way

We have described here the underlying idea of the algorithm. The description above is probably the best for understanding what is going on. But there is a problem when applying it: there are so many (too many) different alignments. We therefore derived a modified algorithm where we do not calculate the probabilities of the actual alignments. Instead we calculate the translation probabilities in step 2a directly from the translation probabilities from step 1a, and the translation probabilities in step 3a directly from the translation probabilities in 2a, without actually calculating the intermediate alignment probabilities (step b). The fractional count of an occurrence of f aligned to an occurrence of e then comes directly from the translation probabilities, essentially as

c(f|e; f, e) = t(f|e) / Σ_{e'} t(f|e')

where e' varies over the words of the E-sentence (including NULL); the full formula, with the δ-sums that handle repeated words, is given in the section The fast lane below.
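The fast variant is even shorter in code; this sketch (again with my own names) loops over word tokens instead of alignments, but returns the same re-estimated t as em_round_explicit above:

    from collections import defaultdict

    def em_round_fast(corpus, t):
        counts = defaultdict(float)   # tc(f|e)
        totals = defaultdict(float)   # sum over f' of tc(f'|e)
        for e, f in corpus:
            for fw in f:
                # normalizer: sum of t(fw|e') over the words e' of the E-sentence
                z = sum(t[(fw, ew)] for ew in e)
                for ew in e:
                    c = t[(fw, ew)] / z    # fractional count, no alignments needed
                    counts[(fw, ew)] += c
                    totals[ew] += c
        return {pair: c / totals[pair[1]] for pair, c in counts.items()}

Because the loop runs over tokens, repeated words (like the two occurrences of dog and hund in the example below) are counted once per occurrence, which is exactly what the δ-sums in the full formula express.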

Examples

There is a very simple example in Jurafsky and Martin which illustrates the calculation with the original algorithm. You should consult this first. In the example in the lecture, we followed the modified algorithm where we sidestep the actual alignments. Let us now see how the example from the lecture would go with the full algorithm first (similarly to the Jurafsky-Martin example), before we compare it to the example from the lecture with some more details filled in. We number the examples

Sentence 1:
- e_1: dog barked
- f_1: hund bjeffet

Sentence 2:
- e_2: dog bit dog
- f_2: hund bet hund

so as to have the simplest example first.

The theoretically sound, but computationally intractable way:

Step 1a: Initialization. Since there are 3 Norwegian words, all t(f|e) are set to 1/3.

t(hund|dog) = 1/3      t(bet|dog) = 1/3      t(bjeffet|dog) = 1/3
t(hund|bit) = 1/3      t(bet|bit) = 1/3      t(bjeffet|bit) = 1/3
t(hund|barked) = 1/3   t(bet|barked) = 1/3   t(bjeffet|barked) = 1/3
t(hund|NULL) = 1/3     t(bet|NULL) = 1/3     t(bjeffet|NULL) = 1/3

Step 1b: Alignments. We must also include NULL in the E-sentence to indicate that a word in the F-sentence may be aligned to nothing. Each of the 2 words in the sentence f_1 may come from one of 3 different words in sentence e_1. Hence there are 9 different alignments: <0,0>, <0,1>, <0,2>, <1,0>, <1,1>, <1,2>, <2,0>, <2,1>, <2,2>. (Here <i,j> means that hund is aligned to position i and bjeffet to position j of <NULL, dog, barked>.) Since all translation probabilities are equal, each alignment will have the same probability. Since there are 9 different alignments, each of them will have the probability 1/9. Writing a_1 for the alignment probability of the first sentence, we have a_1(<0,0>) = a_1(<0,1>) = ... = a_1(<2,2>) = 1/9.

For sentence 2, there are 3 words in f_2. Each of them may be aligned to any of 4 different words in e_2 (including NULL). Hence there are 4*4*4 = 64 different alignments, ranging from <0,0,0> to <3,3,3>. We could take the easy way out and say that each of them is equally likely, hence a_2(<0,0,0>) = a_2(<0,0,1>) = ... = a_2(<3,3,3>) = 1/64. But to prepare our understanding for later rounds, let us see what happens if we follow the recipe. To calculate the probability of one particular alignment, we multiply together the involved translation probabilities, e.g.

P(<1,2,0>) = t(hund|dog) * t(bet|bit) * t(hund|NULL) = 1/27

In this round, we get exactly the same result for all the alignments, 1/27. But that isn't the same as 1/64. Has anything gone wrong here? No. The score 1/27 is not the probability of the alignment. To get at the probability we must normalize. First we sum together the scores for all the alignments, which yields 64/27. Then to get the probability for each alignment, we divide each score by this sum. Hence the probability for each alignment is (1/27)/(64/27), a complicated way to write 1/64.
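This normalization is easy to check with a computer; a sketch (the variable names are mine, with NULL written out explicitly):

    from itertools import product

    e2 = ["NULL", "dog", "bit", "dog"]
    f2 = ["hund", "bet", "hund"]
    t1 = 1/3                                    # uniform initialization
    aligns = list(product(range(len(e2)), repeat=len(f2)))
    scores = [t1 ** len(f2) for _ in aligns]    # every alignment scores (1/3)^3 = 1/27
    z = sum(scores)                             # 64/27
    probs = [s / z for s in scores]
    print(len(aligns), probs[0])                # 64 alignments, each 1/64 = 0.015625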

Step 2a: Maximize the translation probabilities

Then the show may start. We first calculate the fractional counts for the word pairs in the lexicon, and we do this sentence by sentence, starting with sentence 1. To take one example, what is the fractional count of (dog, hund) in sentence 1? We must see which alignments align the two words. There are 3: <1,0>, <1,1>, <1,2>. (Good advice at this point is to draw the alignments while you read.) To get the fractional count we must add the probabilities of these alignments, i.e.,

c(hund|dog; f_1, e_1) = a_1(<1,0>) + a_1(<1,1>) + a_1(<1,2>) = 3*(1/9) = 1/3

We can repeat for the pair (hund, barked) and get

c(hund|barked; f_1, e_1) = a_1(<2,0>) + a_1(<2,1>) + a_1(<2,2>) = 3*(1/9) = 1/3

and so on. We see that we get the same for all word pairs in this sentence:

c(hund|dog) = 1/3       c(bjeffet|dog) = 1/3
c(hund|barked) = 1/3    c(bjeffet|barked) = 1/3
c(hund|NULL) = 1/3      c(bjeffet|NULL) = 1/3

(There is a typo in the lecture slides and in the first version of these notes, writing t instead of c in the right column. The same for sentence 2.)

Sentence 2 is more exciting. Consider first the pair (bet, bit). They get aligned by all alignments of the form <x,2,y> where x and y are any of 0, 1, 2, 3. There are 16 such alignments. (We don't bother to write them out.) Each alignment has probability 1/64. Hence

c(bet|bit; f_2, e_2) = 16/64 = 1/4

Similarly we get c(bet|NULL; f_2, e_2) = 16/64 = 1/4. To count the pair (dog, bet), they are aligned by all alignments of the form <x,1,y> and all alignments of the form <x,3,y>, hence

c(bet|dog; f_2, e_2) = 2*16/64 = 1/2

To count the pair (bit, hund), we must consider both alignments of the form <2,x,y> and of the form <x,y,2>. (Observe that <2,x,2> should be counted twice since two occurrences of hund are aligned to bit.) And to count the pair (hund, dog), we must consider all alignments <1,x,y>, <3,x,y>, <x,y,1> and <x,y,3>. We get the following counts for sentence 2:

c(hund|dog) = 1         c(bet|dog) = 1/2
c(hund|bit) = 1/2       c(bet|bit) = 1/4
c(hund|NULL) = 1/2      c(bet|NULL) = 1/4
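These counts can be verified mechanically by walking over all 64 equally probable alignments (a sketch, names again mine):

    from itertools import product
    from collections import defaultdict

    e2 = ["NULL", "dog", "bit", "dog"]
    f2 = ["hund", "bet", "hund"]
    counts = defaultdict(float)
    aligns = list(product(range(len(e2)), repeat=len(f2)))
    for a in aligns:                       # each alignment has probability 1/64
        for j, i in enumerate(a):
            counts[(f2[j], e2[i])] += 1 / len(aligns)
    print(counts[("hund", "dog")])         # 1.0
    print(counts[("bet", "bit")])          # 0.25
    print(counts[("hund", "NULL")])        # 0.5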

We get the total counts (tc) by adding the fractional counts for all the sentences in the corpus, resulting in:

tc(hund|dog) = 1 + 1/3 = 4/3      tc(bet|dog) = 1/2      tc(bjeffet|dog) = 1/3      t*(dog) = 4/3 + 1/2 + 1/3 = 13/6
tc(hund|bit) = 1/2                tc(bet|bit) = 1/4      tc(bjeffet|bit) = 0        t*(bit) = 3/4
tc(hund|barked) = 1/3             tc(bet|barked) = 0     tc(bjeffet|barked) = 1/3   t*(barked) = 2/3
tc(hund|NULL) = 1/2 + 1/3 = 5/6   tc(bet|NULL) = 1/4     tc(bjeffet|NULL) = 1/3     t*(NULL) = 17/12

In the last column we have added all the total counts for one E word, e.g.

t*(dog) = Σ_f tc(f|dog)

We can then finally calculate the new translation probabilities:

e        f         t(f|e) exact              decimal
NULL     hund      (5/6)/(17/12) = 10/17     0.5882
NULL     bet       (1/4)/(17/12) = 3/17      0.1765
NULL     bjeffet   (1/3)/(17/12) = 4/17      0.2353
dog      hund      (4/3)/(13/6) = 8/13       0.6154
dog      bet       (1/2)/(13/6) = 3/13       0.2308
dog      bjeffet   (1/3)/(13/6) = 2/13       0.1538
bit      hund      (1/2)/(3/4) = 2/3         0.6667
bit      bet       (1/4)/(3/4) = 1/3         0.3333
barked   hund      (1/3)/(2/3) = 1/2         0.5
barked   bjeffet   (1/3)/(2/3) = 1/2         0.5
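The same numbers drop out if we let a computer do the bookkeeping; a sketch starting from the two per-sentence count tables above:

    from collections import defaultdict

    c1 = {("hund","dog"): 1/3, ("bjeffet","dog"): 1/3, ("hund","barked"): 1/3,
          ("bjeffet","barked"): 1/3, ("hund","NULL"): 1/3, ("bjeffet","NULL"): 1/3}
    c2 = {("hund","dog"): 1.0, ("bet","dog"): 1/2, ("hund","bit"): 1/2,
          ("bet","bit"): 1/4, ("hund","NULL"): 1/2, ("bet","NULL"): 1/4}
    tc = defaultdict(float)
    for c in (c1, c2):                       # add the fractional counts per sentence
        for pair, v in c.items():
            tc[pair] += v
    tstar = defaultdict(float)               # t*(e) = sum over f of tc(f|e)
    for (fw, ew), v in tc.items():
        tstar[ew] += v
    t2 = {pair: v / tstar[pair[1]] for pair, v in tc.items()}
    print(round(t2[("hund", "dog")], 4))     # 0.6154  (= 8/13)
    print(round(t2[("hund", "NULL")], 4))    # 0.5882  (= 10/17)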

Step 2b: Estimate alignment probabilities

It is time to estimate the alignment probabilities again. Remember this is done sentence by sentence, starting with sentence 1. There are 9 different alignments to consider. For each of them we may calculate an initial unnormalized probability, call it P, on the basis of the last translation probabilities. (The decimals below are recomputed from the exact fractions; with a calculator that rounds off for each round the last digits may differ.)

                                                                 P        P/ΣP
P(<0,0>) = t(hund|NULL) * t(bjeffet|NULL)   = (10/17)*(4/17)  =  0.1384   0.0914
P(<0,1>) = t(hund|NULL) * t(bjeffet|dog)    = (10/17)*(2/13)  =  0.0905   0.0597
P(<0,2>) = t(hund|NULL) * t(bjeffet|barked) = (10/17)*(1/2)   =  0.2941   0.1942
P(<1,0>) = t(hund|dog) * t(bjeffet|NULL)    = (8/13)*(4/17)   =  0.1448   0.0956
P(<1,1>) = t(hund|dog) * t(bjeffet|dog)     = (8/13)*(2/13)   =  0.0947   0.0625
P(<1,2>) = t(hund|dog) * t(bjeffet|barked)  = (8/13)*(1/2)    =  0.3077   0.2031
P(<2,0>) = t(hund|barked) * t(bjeffet|NULL) = (1/2)*(4/17)    =  0.1176   0.0777
P(<2,1>) = t(hund|barked) * t(bjeffet|dog)  = (1/2)*(2/13)    =  0.0769   0.0508
P(<2,2>) = t(hund|barked) * t(bjeffet|barked) = (1/2)*(1/2)   =  0.2500   0.1650

Sum of P's = 1.5148

We sum the P scores (last line) and normalize them in the last column to get the probability distribution over the alignments. We may do the same for sentence 2. But because there are 64 different alignments, we refrain from carrying out the details.

Step 3a: Maximize the translation probabilities

We proceed exactly as in step 2a. We first collect the fractional counts sentence by sentence, starting with sentence 1. For example, we get

c(hund|barked; f_1, e_1) = a_1(<2,0>) + a_1(<2,1>) + a_1(<2,2>) = 0.0777 + 0.0508 + 0.1650 = 0.2935

and similarly for the other fractional counts in sentence 1. Since we have not calculated the alignments for sentence 2, we stop here. Hopefully the idea is clear by now.

The fast lane

Manually we refrain from calculating 64 alignments, but it wouldn't have been a problem for a machine. However, a short sentence of, say, 10 words already has on the order of 10^10 alignments, and soon also the machines must give in. Let us repeat the calculations from the slides from the lecture. The point is that we skip the alignments and pass directly from step 1a to step 2a and then to step 3a, etc. The key is the formula

c(f|e; e, f) = t(f|e) / (Σ_{i=0..k} t(f|e_i)) * (Σ_{j=1..m} δ(f, f_j)) * (Σ_{i=0..k} δ(e, e_i))

which lets us calculate fractional counts directly from the (last round of) translation probabilities. Here the E-sentence is e_0, e_1, ..., e_k (with e_0 = NULL) and the F-sentence is f_1, ..., f_m.
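In code, the key formula is one line per factor; a hypothetical helper that mirrors it directly (the δ-sums become occurrence counts):

    def c_pair(fw, ew, e, f, t):
        # c(f|e; e, f) = t(f|e) / sum_i t(f|e_i)
        #                * (occurrences of f in the f-sentence)
        #                * (occurrences of e in the e-sentence)
        z = sum(t[(fw, x)] for x in e)
        return t[(fw, ew)] / z * f.count(fw) * e.count(ew)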

Step 2a: Maximize the translation probabilities

To understand the formula: f_j refers to the word at position j in sentence f. Thus in sentence 1, if f is hund, δ(f, f_j) = 1 for j = 1, while δ(f, f_j) = 0 for j = 2. Similarly, e_i refers to the word in position i in the English string. Hence

c(hund|barked; e_1, f_1)
  = t(hund|barked) / (t(hund|NULL) + t(hund|dog) + t(hund|barked))
    * (δ(hund, hund) + δ(hund, bjeffet))
    * (δ(barked, NULL) + δ(barked, dog) + δ(barked, barked))
  = (1/3)/1 * 1 * 1 = 1/3

and similarly for the other word pairs. We get the same fractional counts for sentence 1 as when we used explicit alignments. Then sentence 2. To take two examples:

c(bet|bit; e_2, f_2)
  = t(bet|bit) / (t(bet|NULL) + t(bet|dog) + t(bet|bit) + t(bet|dog))
    * Σ_j δ(bet, f_j) * Σ_i δ(bit, e_i)
  = (1/3)/(4/3) * 1 * 1 = 1/4

c(hund|dog; e_2, f_2)
  = t(hund|dog) / (t(hund|NULL) + t(hund|dog) + t(hund|bit) + t(hund|dog))
    * Σ_j δ(hund, f_j) * Σ_i δ(dog, e_i)
  = (1/3)/(4/3) * 2 * 2 = 1

Hurray, we get the same fractional counts as with the explicit use of alignments. And we may proceed as we did there, calculating first the total fractional counts, tc, and then the translation probabilities, t.

Step 3a: Maximize the translation probabilities

We can harvest the reward when we come to the next round and want to calculate the fractional counts. Take an example from sentence 1:

c(hund|barked; e_1, f_1)
  = t(hund|barked) / (t(hund|NULL) + t(hund|dog) + t(hund|barked)) * 1 * 1
  = 0.5 / (0.5882 + 0.6154 + 0.5) = 0.2935

which is close enough to the result we got by taking the long route (given that we use a calculator and round off for each round). The miracle is that this works equally well on sentence 2, for example:

c(hund|dog; e_2, f_2)
  = t(hund|dog) / (t(hund|NULL) + t(hund|dog) + t(hund|bit) + t(hund|dog)) * 2 * 2
  = 0.6154 / (0.5882 + 0.6154 + 0.6667 + 0.6154) * 4 = 0.9903
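With the c_pair helper sketched above and the round-2 probabilities from the table (as exact fractions), the same numbers appear:

    e1 = ["NULL", "dog", "barked"];      f1 = ["hund", "bjeffet"]
    e2 = ["NULL", "dog", "bit", "dog"];  f2 = ["hund", "bet", "hund"]
    t2 = {("hund","NULL"): 10/17, ("bet","NULL"): 3/17, ("bjeffet","NULL"): 4/17,
          ("hund","dog"): 8/13,   ("bet","dog"): 3/13,  ("bjeffet","dog"): 2/13,
          ("hund","bit"): 2/3,    ("bet","bit"): 1/3,
          ("hund","barked"): 1/2, ("bjeffet","barked"): 1/2}
    print(round(c_pair("hund", "barked", e1, f1, t2), 4))   # 0.2935
    print(round(c_pair("hund", "dog",    e2, f2, t2), 4))   # 0.9903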

Summing up

This concludes the examples. Hopefully it is now possible to better see:
- the motivation behind the original approach where we explicitly calculate alignments;
- that the faster algorithm yields the same results as the original algorithm, at least on the example we explicitly calculated, and that even though it may be hard to see step by step that the two algorithms produce the same results in general, we may open up to the idea;
- that the fast algorithm is computationally tractable.
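As a final illustration of the last point, the em_round_fast sketch from earlier can simply be iterated on the toy corpus; each round touches one table entry per token pair, however many alignments there would have been. (The driver below, with my own names, reproduces the round-2 value and continues one round further.)

    corpus = [(["NULL", "dog", "barked"], ["hund", "bjeffet"]),
              (["NULL", "dog", "bit", "dog"], ["hund", "bet", "hund"])]
    # round 1: uniform initialization, t(f|e) = 1/3 everywhere
    t = {(fw, ew): 1/3 for ew in ["NULL", "dog", "barked", "bit"]
                       for fw in ["hund", "bjeffet", "bet"]}
    for rnd in [2, 3]:
        t = em_round_fast(corpus, t)
        print(rnd, round(t[("hund", "dog")], 4))
    # prints 2 0.6154 and then 3 0.6759: t(hund|dog) grows round by round,
    # in line with the guarantee that the result never deteriorates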
