Note on EM training of IBM model 1
INF5820 Language Technological Applications, fall term

The slides on this subject, including the example, seem insufficient to give a good grasp of what is going on. Hence here are some supplementary notes with more details. Hopefully they make things clearer.

The main idea
There are two main items involved:
- Translation probabilities
- Word alignments

The translation probabilities are assigned to the bilingual lexicon: for a pair of words (e, f) in the lexicon, how probable is it that e gets translated as f, expressed by t(f|e). Beware, this is calculated from the whole corpus; we do not consider these probabilities for a single sentence.

A word alignment is assigned to a pair of sentences (e, f). (We are using boldface to indicate that e is a string (array) of words e_1, e_2, ..., e_k, etc.) When we have a parallel corpus where the sentences are sentence aligned, which may be expressed by (e_1, f_1), (e_2, f_2), ..., (e_m, f_m), we consider the alignment of each sentence pair individually. Ideally, we are looking for the best alignment of each sentence. But as we do not know it, we will instead consider the probability of the various alignments of the sentence. For each sentence, the probabilities of the various alignments must add to 1.

The EM training then goes as follows.

1. Initializing
a. We start with initializing t. When we don't have other information, we initialize t uniformly. That is, t(f|e) = 1/s, where s is the number of F-words in the lexicon.
b. For each sentence in the corpus, we estimate the probability distribution for the various alignments of the sentence. This is done on the basis of t, and should reflect t: for example, if t(f_i|e_k) is n times t(f_j|e_k), then alignments which align f_i to e_k should be n times more probable than those which align f_j to e_k. (Well, actually, in round 1 this is trivial, since all alignments are equally likely when we start with a uniform t.)

2. Next round
a. We count how many times a word e is translated as f on the basis of the probability distributions for the sentences. This is a fractional count. Given a sentence pair (e_j, f_j): if e occurs in e_j and f occurs in f_j, we consider the alignments which align f to e. Given such an alignment, a, we consider its probability P(a), and from this alignment we count that e is translated as f P(a) many times. For example, if P(a) is 0.2, we will add 0.2 to the count of how many times e is translated as f. After we have done this for all alignments of all the sentences, we can recalculate t.

The notation in Koehn's book for the different counts and measures is not stellar, but as we adopted the same notation in the slides, we will stick to it to make the similarities transparent. Koehn uses the notation c(f|e) for the fractional count of the
pair (e, f) in a particular sentence. To make it clear that it is the count in the specific sentence pair (e, f), he also uses the notation c(f|e; f, e). To indicate the fractional count of the word (type) pair (e, f) over the whole corpus, he uses

    Σ_{(f,e)} c(f|e; f, e)

(i.e. we add the fractional counts for all the sentences). An alternative notation for the same would have been Σ_{j=1}^{m} c(f|e; f_j, e_j), given there are m sentences in the corpus. We introduced the notation tc, for total count, for this on the slides:

    tc(f|e) = Σ_{(f,e)} c(f|e; f, e)

The re-estimated translation probability can then be calculated from this:

    t(f|e) = Σ_{(f,e)} c(f|e; f, e) / Σ_{f'} Σ_{(f,e)} c(f'|e; f, e) = tc(f|e) / Σ_{f'} tc(f'|e)

Here f' varies over all the F-words in the lexicon.

b. With these new translation probabilities, we may return to the alignments, and for each sentence estimate the probability distribution over the possible alignments. This time there is no simple shortcut as there was in round 1. For each alignment, we calculate a probability on the basis of t, and normalize to make sure that the probabilities of the alignments for each sentence add up to 1.

3. Next round
a. We go about exactly as in step 2a. On the basis of the alignment probabilities estimated in step 2b, we may now calculate new translation probabilities t.
b. And on the basis of the translation probabilities, we estimate new alignment probabilities.

And so we may repeat the two steps as long as we like.

Properties
What is nice with this algorithm is:
- We can prove that the result gets better (or stays the same) after each round. It never deteriorates.
- The result converges towards a local optimum.
- For IBM model 1 (but not in general) this local optimum is also a global optimum.

The fast way
We have described here the underlying idea of the algorithm. The description above is probably the best for understanding what is going on. But there is a problem when applying it: there are so many (too many) different alignments. We therefore derived a modified algorithm where we do not calculate the probabilities of the actual alignments. Instead we calculate the translation probabilities in step 2a directly from the translation probabilities from step 1a, and the translation probabilities in step 3a directly from the translation probabilities in 2a, without actually calculating the intermediate alignment probabilities (step b). The key is the formula

    c(f|e; f, e) = t(f|e) / Σ_{i=0}^{l} t(f|e_i) × Σ_{j=1}^{m} δ(f, f_j) × Σ_{i=0}^{l} δ(e, e_i)

which is explained and put to work in "The fast lane" below.
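To make the bookkeeping concrete, here is a minimal Python sketch of this fast loop, run on the two-sentence example introduced below. It is only an illustration: the corpus layout, the use of None for NULL, and names such as em_iteration are my own choices, not notation from the note or from Koehn.

    from collections import defaultdict

    # Toy parallel corpus from the example below; None plays the role of NULL.
    corpus = [
        (["dog", "barked"], ["hund", "bjeffet"]),          # (e1, f1)
        (["dog", "bit", "dog"], ["hund", "bet", "hund"]),  # (e2, f2)
    ]
    f_vocab = {f for _, f_sent in corpus for f in f_sent}

    # Step 1a: uniform initialization, t(f|e) = 1/s with s F-words in the lexicon.
    t = defaultdict(lambda: 1.0 / len(f_vocab))

    def em_iteration(t):
        """One round: fractional counts straight from t, then re-estimate t."""
        tc = defaultdict(float)       # tc(f|e), summed over all sentences
        total = defaultdict(float)    # tc(*|e) = sum over f of tc(f|e)
        for e_sent, f_sent in corpus:
            e_null = [None] + e_sent  # add NULL at position 0
            for f in f_sent:
                z = sum(t[(f, e)] for e in e_null)  # sum_i t(f|e_i)
                for e in e_null:
                    tc[(f, e)] += t[(f, e)] / z
                    total[e] += t[(f, e)] / z
        # Re-estimation: t(f|e) = tc(f|e) / tc(*|e).
        return defaultdict(float, {(f, e): v / total[e] for (f, e), v in tc.items()})

    for _ in range(2):
        t = em_iteration(t)
    print(round(t[("hund", "dog")], 4))  # 0.3333 -> 0.6154 -> 0.6759 over the first rounds

Note that the inner loop touches each (f, e) pair of a sentence once, so the cost per round is quadratic in sentence length rather than exponential in the number of alignments.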
Examples
There is a very simple example in Jurafsky and Martin which illustrates the calculation with the original algorithm. You should consult this first. In the example in the lecture, we followed the modified algorithm where we sidestep the actual alignments. Let us now see how the example from the lecture would go with the full algorithm first (similarly to the Jurafsky-Martin example), before we compare it to the example from the lecture with some more details filled in. We number the examples:

Sentence 1:
- e_1: dog barked
- f_1: hund bjeffet

Sentence 2:
- e_2: dog bit dog
- f_2: hund bet hund

to have the simplest example first.

The theoretically sound, but computationally intractable way:

Step 1a: Initialization. Since there are 3 Norwegian words, all t(f|e) are set to 1/3.

    t(hund|dog) = 1/3      t(bet|dog) = 1/3      t(bjeffet|dog) = 1/3
    t(hund|bit) = 1/3      t(bet|bit) = 1/3      t(bjeffet|bit) = 1/3
    t(hund|barked) = 1/3   t(bet|barked) = 1/3   t(bjeffet|barked) = 1/3
    t(hund|NULL) = 1/3     t(bet|NULL) = 1/3     t(bjeffet|NULL) = 1/3

Step 1b: Alignments
We must also include NULL in the E-sentence, at position 0, to indicate that a word in the F-sentence may be aligned to nothing. Each of the 2 words in the sentence f_1 may come from one of 3 different words in sentence e_1. Hence there are 9 different alignments: <0,0>, <0,1>, <0,2>, <1,0>, <1,1>, <1,2>, <2,0>, <2,1>, <2,2>. Since all translation probabilities are equal, each alignment will have the same probability. Since there are 9 different alignments, each of them will have the probability 1/9. Writing a_1 for the alignment probabilities of the first sentence, we have a_1(<0,0>) = a_1(<0,1>) = ... = a_1(<2,2>) = 1/9.

For sentence 2, there are 3 words in f_2. Each of them may be aligned to any of 4 different words in e_2 (including NULL). Hence there are 4*4*4 = 64 different alignments, ranging from <0,0,0> to <3,3,3>. We could take the easy way out and say that each of them is equally likely, hence a_2(<0,0,0>) = ... = a_2(<3,3,3>) = 1/64. But to prepare our understanding for later rounds, let us see what happens if we follow the recipe. To calculate the score of one particular alignment, we multiply together the involved translation probabilities, e.g.

    P̃(<1,2,0>) = t(hund|dog)·t(bet|bit)·t(hund|NULL) = (1/3)^3 = 1/27.

In this round, we get exactly the same result for all the alignments, 1/27. But that isn't the same as 1/64. Has anything gone wrong here? No. The score 1/27 is not the probability of the alignment. To get at the probability we must normalize. First we sum together the scores for all the alignments, which yields 64/27. Then to get the probability for each alignment, we divide each score by this sum. Hence the probability for each alignment is (1/27)/(64/27), a complicated way to write 1/64.
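For the machine, step 1b for sentence 2 is a few lines of brute force. The sketch below (my own variable names; None stands in for NULL) enumerates all 64 alignments, scores each one, and normalizes:

    from itertools import product
    from math import prod

    e_sent = [None, "dog", "bit", "dog"]   # position 0 plays the role of NULL
    f_sent = ["hund", "bet", "hund"]
    t = {(f, e): 1 / 3 for e in set(e_sent) for f in set(f_sent)}  # round 1: uniform

    # An alignment <i1,i2,i3> says: F-word number j comes from E-position i_j.
    alignments = list(product(range(len(e_sent)), repeat=len(f_sent)))
    print(len(alignments))          # 64

    # Unnormalized score: the product of the involved translation probabilities.
    score = {a: prod(t[(f_sent[j], e_sent[i])] for j, i in enumerate(a))
             for a in alignments}
    print(score[(1, 2, 0)])         # 1/27 = 0.0370...

    # Normalize so the alignment probabilities for the sentence sum to 1.
    z = sum(score.values())         # 64/27
    a2 = {a: s / z for a, s in score.items()}
    print(a2[(1, 2, 0)])            # 1/64 = 0.015625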
Step 2a: Maximize the translation probabilities
Then the show may start. We first calculate the fractional counts for the word pairs in the lexicon, and we do this sentence by sentence, starting with sentence 1. To take one example, what is the fractional count of (dog, hund) in sentence 1? We must see which alignments align the two words. There are 3: <1,0>, <1,1>, <1,2>. (Good advice at this point is to draw the alignments while you read.) To get the fractional count we add the probabilities of these alignments, i.e.

    c(hund|dog; f_1, e_1) = a_1(<1,0>) + a_1(<1,1>) + a_1(<1,2>) = 3·(1/9) = 1/3.

We can repeat for the pair (hund, barked) and get

    c(hund|barked; f_1, e_1) = a_1(<2,0>) + a_1(<2,1>) + a_1(<2,2>) = 3·(1/9) = 1/3,

and so on. We see we get the same for all word pairs in this sentence:

    c(hund|dog) = 1/3      c(bjeffet|dog) = 1/3
    c(hund|barked) = 1/3   c(bjeffet|barked) = 1/3
    c(hund|NULL) = 1/3     c(bjeffet|NULL) = 1/3

(There is a typo in the lecture slides and in the first version of these notes, writing t instead of c in the right column. The same for sentence 2.)

Sentence 2 is more exciting. Consider first the pair (bet, bit). They get aligned by all alignments of the form <x,2,y>, where x and y are any of 0, 1, 2, 3. There are 16 such alignments. (We don't bother to write them out.) Each alignment has probability 1/64. Hence

    c(bet|bit; f_2, e_2) = 16/64 = 1/4.

Similarly we get c(bet|NULL; f_2, e_2) = 16/64 = 1/4. To count the pair (dog, bet): these are aligned by all alignments of the form <x,1,y> and all alignments of the form <x,3,y>, hence

    c(bet|dog; f_2, e_2) = 2·16/64 = 1/2.

To count the pair (bit, hund), we must consider both alignments of the form <2,x,y> and of the form <x,y,2>. (Observe that <2,x,2> should be counted twice, since two occurrences of hund are aligned to bit.) And to count the pair (dog, hund), we must consider all alignments <1,x,y>, <3,x,y>, <x,y,1> and <x,y,3>. We get the following counts for sentence 2:

    c(hund|dog) = 1      c(bet|dog) = 1/2
    c(hund|bit) = 1/2    c(bet|bit) = 1/4
    c(hund|NULL) = 1/2   c(bet|NULL) = 1/4
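The same counting can be left to the machine: every alignment pays its probability to each word pair it links, so an alignment like <2,x,2> automatically pays (bit, hund) twice. A small sketch, continuing the conventions of the previous snippet (None for NULL):

    from collections import defaultdict
    from itertools import product

    e_sent = [None, "dog", "bit", "dog"]
    f_sent = ["hund", "bet", "hund"]
    a2 = {a: 1 / 64 for a in product(range(4), repeat=3)}  # the round-1 distribution

    c = defaultdict(float)          # c(f|e; f2, e2)
    for a, p in a2.items():
        for j, i in enumerate(a):   # the pair (e_i, f_j) is linked by this alignment
            c[(f_sent[j], e_sent[i])] += p

    print(c[("bet", "bit")])        # 0.25  = 1/4
    print(c[("bet", "dog")])        # 0.5   = 1/2
    print(c[("hund", "bit")])       # 0.5   = 1/2
    print(c[("hund", "dog")])       # 1.0
    print(c[("hund", None)])        # 0.5   = 1/2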
We get the total counts (tc) by adding the fractional counts for all the sentences in the corpus, resulting in:

    tc(hund|dog) = 1/3 + 1 = 4/3      tc(bet|dog) = 1/2      tc(bjeffet|dog) = 1/3      tc(*|dog) = 4/3 + 1/2 + 1/3 = 13/6
    tc(hund|bit) = 1/2                tc(bet|bit) = 1/4      tc(bjeffet|bit) = 0        tc(*|bit) = 3/4
    tc(hund|barked) = 1/3             tc(bet|barked) = 0     tc(bjeffet|barked) = 1/3   tc(*|barked) = 2/3
    tc(hund|NULL) = 1/2 + 1/3 = 5/6   tc(bet|NULL) = 1/4     tc(bjeffet|NULL) = 1/3     tc(*|NULL) = 17/12

In the last column we have added all the total counts for one E-word, e.g.

    tc(*|dog) = Σ_f tc(f|dog)

We can then finally calculate the new translation probabilities:

    e       f         t(f|e), exact                 decimal
    NULL    hund      (5/6)/(17/12)  = 10/17        0.5882
    NULL    bet       (1/4)/(17/12)  = 3/17         0.1765
    NULL    bjeffet   (1/3)/(17/12)  = 4/17         0.2353
    dog     hund      (4/3)/(13/6)   = 8/13         0.6154
    dog     bet       (1/2)/(13/6)   = 3/13         0.2308
    dog     bjeffet   (1/3)/(13/6)   = 2/13         0.1538
    bit     hund      (1/2)/(3/4)    = 2/3          0.6667
    bit     bet       (1/4)/(3/4)    = 1/3          0.3333
    barked  hund      (1/3)/(2/3)    = 1/2          0.5
    barked  bjeffet   (1/3)/(2/3)    = 1/2          0.5
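These fractions are easy to double-check with exact arithmetic. A small sketch using Python's Fraction type; it simply re-adds the two per-sentence tables above (the dictionary layout is my own):

    from fractions import Fraction as F

    c1 = {("hund", "dog"): F(1, 3), ("bjeffet", "dog"): F(1, 3),
          ("hund", "barked"): F(1, 3), ("bjeffet", "barked"): F(1, 3),
          ("hund", None): F(1, 3), ("bjeffet", None): F(1, 3)}
    c2 = {("hund", "dog"): F(1), ("bet", "dog"): F(1, 2),
          ("hund", "bit"): F(1, 2), ("bet", "bit"): F(1, 4),
          ("hund", None): F(1, 2), ("bet", None): F(1, 4)}

    tc, total = {}, {}
    for counts in (c1, c2):                      # add the fractional counts per sentence
        for (f, e), v in counts.items():
            tc[(f, e)] = tc.get((f, e), F(0)) + v
    for (f, e), v in tc.items():                 # tc(*|e) = sum over f of tc(f|e)
        total[e] = total.get(e, F(0)) + v

    print(tc[("hund", "dog")], total["dog"])     # 4/3 13/6
    t_new = {(f, e): v / total[e] for (f, e), v in tc.items()}
    print(t_new[("hund", "dog")])                # 8/13
    print(t_new[("hund", None)])                 # 10/17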
Step 2b: Estimate alignment probabilities
It is time to estimate the alignment probabilities again. Remember this is done sentence by sentence, starting with sentence 1. There are 9 different alignments to consider. For each of them we may calculate an initial unnormalized score, call it P̃, on the basis of the last translation probabilities.

    P̃(<0,0>) = t(hund|NULL)·t(bjeffet|NULL)     = (10/17)·(4/17) = 0.1384    normalized: 0.0914
    P̃(<0,1>) = t(hund|NULL)·t(bjeffet|dog)      = (10/17)·(2/13) = 0.0905    normalized: 0.0597
    P̃(<0,2>) = t(hund|NULL)·t(bjeffet|barked)   = (10/17)·(1/2)  = 0.2941    normalized: 0.1942
    P̃(<1,0>) = t(hund|dog)·t(bjeffet|NULL)      = (8/13)·(4/17)  = 0.1448    normalized: 0.0956
    P̃(<1,1>) = t(hund|dog)·t(bjeffet|dog)       = (8/13)·(2/13)  = 0.0947    normalized: 0.0625
    P̃(<1,2>) = t(hund|dog)·t(bjeffet|barked)    = (8/13)·(1/2)   = 0.3077    normalized: 0.2031
    P̃(<2,0>) = t(hund|barked)·t(bjeffet|NULL)   = (1/2)·(4/17)   = 0.1176    normalized: 0.0777
    P̃(<2,1>) = t(hund|barked)·t(bjeffet|dog)    = (1/2)·(2/13)   = 0.0769    normalized: 0.0508
    P̃(<2,2>) = t(hund|barked)·t(bjeffet|barked) = (1/2)·(1/2)    = 0.25      normalized: 0.1650
    Sum of the P̃'s: 1.5148

We sum the P̃ scores (last line) and normalize them (last column) to get the new probability distribution a_1 over the alignments. We may do the same for sentence 2, but because there are 64 different alignments we refrain from carrying out the details.

Step 3a: Maximize the translation probabilities
We proceed exactly as in step 2a. We first collect the fractional counts sentence by sentence, starting with sentence 1. For example, with a_1 now the distribution just estimated, we get

    c(hund|barked; f_1, e_1) = a_1(<2,0>) + a_1(<2,1>) + a_1(<2,2>) = 0.0777 + 0.0508 + 0.1650 = 0.2935

and similarly for the other fractional counts in sentence 1. Since we have not calculated the alignment probabilities for sentence 2, we stop here. Hopefully the idea is clear by now.

The fast lane
Manually we refrain from calculating 64 alignments, but it wouldn't have been a problem for a machine. However, a short sentence of, say, 10 words already has billions of alignments, and soon the machines too must give in. Let us repeat the calculations from the slides from the lecture. The point is that we skip the alignments and pass directly from step 1a to step 2a and then to step 3a, etc. The key is the formula

    c(f|e; e, f) = t(f|e) / Σ_{i=0}^{l} t(f|e_i) × Σ_{j=1}^{m} δ(f, f_j) × Σ_{i=0}^{l} δ(e, e_i)

which lets us calculate fractional counts directly from the last round of translation probabilities.
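Written out as code, the key formula is a one-liner plus a normalizing sum: the two delta sums reduce to occurrence counts. A sketch (the name fast_count is mine; None again stands for NULL):

    from collections import defaultdict

    def fast_count(f, e, e_sent, f_sent, t):
        """c(f|e; e, f): the fractional count straight from t, no alignments."""
        e_null = [None] + e_sent                  # NULL sits at position 0
        z = sum(t[(f, e_i)] for e_i in e_null)    # sum_i t(f|e_i)
        # The delta sums just count the occurrences of f in f and of e in e.
        return t[(f, e)] / z * f_sent.count(f) * e_null.count(e)

    t0 = defaultdict(lambda: 1 / 3)               # the uniform round-1 t
    print(fast_count("hund", "barked", ["dog", "barked"], ["hund", "bjeffet"], t0))
    # 0.3333... = 1/3, the same as with explicit alignments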
Step 2a: Maximize the translation probabilities
To understand the formula: f_j refers to the word at position j in sentence f. Thus in sentence 1, if f is hund, then δ(f, f_j) = 1 for j = 1, while δ(f, f_j) = 0 for j = 2. Similarly, e_i refers to the word in position i in the English string. Hence

    c(hund|barked; e_1, f_1)
    = t(hund|barked) / (t(hund|NULL) + t(hund|dog) + t(hund|barked)) × (δ(hund, hund) + δ(hund, bjeffet)) × (δ(barked, NULL) + δ(barked, dog) + δ(barked, barked))
    = (1/3)/(1/3 + 1/3 + 1/3) × 1 × 1 = 1/3

and similarly for the other word pairs. We get the same fractional counts for sentence 1 as when we used explicit alignments.

Then sentence 2. To take two examples:

    c(bet|bit; e_2, f_2) = t(bet|bit) / (t(bet|NULL) + t(bet|dog) + t(bet|bit) + t(bet|dog)) × 1 × 1 = (1/3)/(4/3) = 1/4

    c(hund|dog; e_2, f_2) = t(hund|dog) / (t(hund|NULL) + t(hund|dog) + t(hund|bit) + t(hund|dog)) × 2 × 2 = (1/3)/(4/3) × 4 = 1

Hurray, we get the same fractional counts as with the explicit use of alignments. And we may proceed as we did there, calculating first the total fractional counts, tc, and then the translation probabilities, t.

Step 3a: Maximize the translation probabilities
We can harvest the reward when we come to the next round and want to calculate the fractional counts. Take an example from sentence 1:

    c(hund|barked; e_1, f_1) = t(hund|barked) / (t(hund|NULL) + t(hund|dog) + t(hund|barked)) × 1 × 1 = 0.5/(0.5882 + 0.6154 + 0.5) = 0.2935

which is close enough to the result we got by taking the long route (given that we use a calculator and round off in each round). The miracle is that this works equally well on sentence 2, where we never listed the alignments, for example:

    c(hund|dog; e_2, f_2) = t(hund|dog) / (t(hund|NULL) + t(hund|dog) + t(hund|bit) + t(hund|dog)) × 2 × 2 = (0.6154/2.4857) × 4 = 0.9903
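The round-2 numbers can also be checked by machine: run one fast round from the uniform start and recompute the two counts above. A sketch under the same conventions as the earlier snippets, with fast_count as defined there:

    from collections import defaultdict

    corpus = [(["dog", "barked"], ["hund", "bjeffet"]),
              (["dog", "bit", "dog"], ["hund", "bet", "hund"])]

    def fast_count(f, e, e_sent, f_sent, t):
        e_null = [None] + e_sent
        z = sum(t[(f, e_i)] for e_i in e_null)
        return t[(f, e)] / z * f_sent.count(f) * e_null.count(e)

    def em_round(t):
        """One fast round: fractional counts from t, then renormalize per E-word."""
        tc, total = defaultdict(float), defaultdict(float)
        for e_sent, f_sent in corpus:
            for e in set([None] + e_sent):       # word types; counts handle repeats
                for f in set(f_sent):
                    v = fast_count(f, e, e_sent, f_sent, t)
                    tc[(f, e)] += v
                    total[e] += v
        return defaultdict(float, {(f, e): v / total[e] for (f, e), v in tc.items()})

    t1 = em_round(defaultdict(lambda: 1 / 3))    # the table from step 2a
    print(round(fast_count("hund", "barked", *corpus[0], t1), 4))  # 0.2935
    print(round(fast_count("hund", "dog", *corpus[1], t1), 4))     # 0.9903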
Summing up
This concludes the examples. Hopefully it is now possible to better see:
- The motivation behind the original approach, where we explicitly calculate alignments.
- That the faster algorithm yields the same results as the original algorithm, at least on the example we explicitly calculated. And that even though it may be hard to see step by step that the two algorithms produce the same results in general, we may open up to the idea.
- That the fast algorithm is computationally tractable.