Statistical Machine Translation
1 Statistical Machine Translation LECTURE 5 HIGHER IBM MODELS APRIL 6 200
2 Brief Outline
- IBM Model 2
- IBM Model 3
- IBM Model 4
- IBM Model 5
Ref: The Mathematics of Statistical Machine Translation: Parameter Estimation. Peter F. Brown et al., Computational Linguistics, Vol. 19, No. 2
3 IBM Model 2
4 IBM Model 2
Model 1 takes no notice of where the words appear in the translation. E.g., for "questa casa è bella naturalmente", the outputs "Of course this house is beautiful" and "This house beautiful is of course" are equally probable under Model 1. Model 2 takes care of this.
5 Alignment Model: IBM Model 2
The assumption is that the translation of $f_j$ to $e_i$ depends upon the alignment probability $P_a(j \mid i, m, n)$, where $m$ and $n$ are the lengths of e and f.
Q1. What does it mean?
Q2. How to compute it?
6 IBM Model 2
Thus translation is a two-step process:
- Lexical translation step: modeled by $t(e_p \mid f_q)$
- Alignment step: modeled by $P_a(j \mid i, m, n)$
E.g.:
Questa casa è bella naturalmente
(translation step) this house is beautiful naturally
(alignment step) Naturally this house is beautiful
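7 IBM Model 2
Under this model we have:
\[ p(e, a \mid f) = c \prod_{i=1}^{m} t(e_i \mid f_{a(i)})\, p_a(a(i) \mid i, m, n) \]
Hence:
\[ p(e \mid f) = \sum_{a} p(e, a \mid f) = c \sum_{a(1)=0}^{n} \cdots \sum_{a(m)=0}^{n} \prod_{i=1}^{m} t(e_i \mid f_{a(i)})\, p_a(a(i) \mid i, m, n) = c \prod_{i=1}^{m} \sum_{j=0}^{n} t(e_i \mid f_j)\, p_a(j \mid i, m, n) \]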
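The step from the sum over all alignments to a product of per-position sums can be checked numerically. The following is a minimal sketch, assuming toy values for $t$ and $p_a$ that are invented purely for illustration (they are not from the lecture); the function names are hypothetical.

```python
import itertools
import math

def model2_brute_force(e, f, t, p_a, c=1.0):
    """Sum p(e, a | f) over every alignment a: {1..m} -> {0..n}."""
    m, n = len(e), len(f) - 1            # f[0] is the NULL word
    total = 0.0
    for a in itertools.product(range(n + 1), repeat=m):
        prob = c
        for i in range(1, m + 1):
            j = a[i - 1]
            prob *= t[(e[i - 1], f[j])] * p_a[(j, i, m, n)]
        total += prob
    return total

def model2_factored(e, f, t, p_a, c=1.0):
    """Product over target positions of a sum over source positions."""
    m, n = len(e), len(f) - 1
    result = c
    for i in range(1, m + 1):
        result *= sum(t[(e[i - 1], f[j])] * p_a[(j, i, m, n)]
                      for j in range(n + 1))
    return result

# Toy parameters (made up for illustration only).
f = ["NULL", "casa", "bella"]
e = ["beautiful", "house"]
t = {
    ("beautiful", "NULL"): 0.1, ("beautiful", "casa"): 0.2, ("beautiful", "bella"): 0.7,
    ("house", "NULL"): 0.1, ("house", "casa"): 0.8, ("house", "bella"): 0.1,
}
p_a = {}
for i, row in {1: [0.1, 0.3, 0.6], 2: [0.1, 0.6, 0.3]}.items():
    for j, prob in enumerate(row):
        p_a[(j, i, 2, 2)] = prob          # key is (j, i, m, n)

brute = model2_brute_force(e, f, t, p_a)
factored = model2_factored(e, f, t, p_a)
print(math.isclose(brute, factored))
```

The brute-force version enumerates $(n+1)^m$ alignments, while the factored version needs only $m(n+1)$ terms, which is why Model 2 can still be trained exactly.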
8 IBM Model 2
We need to maximize $p(e \mid f)$ subject to the following constraints:
1. $\sum_{i} t(e_i \mid f_j) = 1$ for every source word $f_j$
2. $\sum_{j=0}^{n} p_a(j \mid i, m, n) = 1$ for $i = 1, \ldots, m$
Thus we have a larger set of Lagrange multipliers.
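9 IBM Model 2
The auxiliary function becomes:
\[ h(t, p_a, \lambda, \mu) = c \sum_{a(1)=0}^{n} \cdots \sum_{a(m)=0}^{n} \prod_{i=1}^{m} t(e_i \mid f_{a(i)})\, p_a(a(i) \mid i, m, n) - \sum_{j} \lambda_j \Big( \sum_{i} t(e_i \mid f_j) - 1 \Big) - \sum_{i} \mu_{imn} \Big( \sum_{j} p_a(j \mid i, m, n) - 1 \Big) \]
To find the extremum we need to differentiate.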
10 IBM Model 2
We now need a new count: $\mathrm{count}(j \mid i, m, n; f, e)$, the expected number of times the word in position $i$ of the TL string e is connected to the word in position $j$ of the SL string f, given that their lengths are $m$ and $n$ respectively.
\[ \mathrm{count}(j \mid i, m, n; f, e) = \sum_{a} p(a \mid e, f)\, \delta(j, a(i)) \]
11 IBM Model 2
So here we shall look at groups of sentence pairs which satisfy the $m$ and $n$ criterion. Then we look at the alignment probabilities.
Example: m = 4, n = 3

SL sentence (f)        TL sentence (e)              Alignment (i > a(i))
aami bari jachchhi     I am going home              1>1 2>0 3>3 4>2
tumi ki khachchho      What are you eating          1>2 2>0 3>1 4>3
tomar naam ki          What is your name            1>3 2>0 3>1 4>2
kaal kothay chhile     Where were you yesterday     1>2 2>3 3>0 4>1
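If the alignments of the four example pairs (m = 4, n = 3) were fully known, the alignment probabilities could be estimated by simple relative frequency. The sketch below does exactly that; it is only an illustration with hard counts, whereas in EM training the alignments are hidden and each one is weighted by its posterior probability. The function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction

# One dict per sentence pair: TL position i -> SL position a(i); 0 is NULL.
alignments = [
    {1: 1, 2: 0, 3: 3, 4: 2},   # aami bari jachchhi / I am going home
    {1: 2, 2: 0, 3: 1, 4: 3},   # tumi ki khachchho / What are you eating
    {1: 3, 2: 0, 3: 1, 4: 2},   # tomar naam ki / What is your name
    {1: 2, 2: 3, 3: 0, 4: 1},   # kaal kothay chhile / Where were you yesterday
]

def estimate_alignment_probs(alignments):
    """Relative-frequency estimate of p_a(j | i) for one fixed (m, n) group."""
    counts = Counter()
    totals = Counter()
    for a in alignments:
        for i, j in a.items():
            counts[(i, j)] += 1
            totals[i] += 1
    # Key of the result is (j, i), mirroring the notation p_a(j | i, m, n).
    return {(j, i): Fraction(c, totals[i]) for (i, j), c in counts.items()}

p_a = estimate_alignment_probs(alignments)
for (j, i), prob in sorted(p_a.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"p_a({j} | {i}, 4, 3) = {prob}")
```

For instance, TL position 2 is connected to NULL in three of the four pairs, so the estimate for $p_a(0 \mid 2, 4, 3)$ is 3/4.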
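12 IBM Model 2
Then by analogy with Model 1 we get:
\[ p_a(j \mid i, m, n) = \mu_{imn}^{-1}\, \mathrm{count}(j \mid i, m, n; f, e) \]
for a single translation, and
\[ p_a(j \mid i, m, n) = \mu_{imn}^{-1} \sum_{s=1}^{S} \mathrm{count}(j \mid i, m, n; f^{(s)}, e^{(s)}) \]
for a set of $S$ translations.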
13 IBM Model 2
Although the expression for count apparently is complicated, we can make it simple as in the case of Model 1:
\[ \mathrm{count}(j \mid i, m, n; f, e) = \frac{t(e_i \mid f_j)\, p_a(j \mid i, m, n)}{t(e_i \mid f_0)\, p_a(0 \mid i, m, n) + \cdots + t(e_i \mid f_n)\, p_a(n \mid i, m, n)} \]
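The simplified count can be compared against its definition as a posterior-weighted sum over all alignments. This is a small numerical check under assumed toy parameters (invented for illustration); the function names are hypothetical.

```python
import itertools
import math

def count_closed_form(j, i, e, f, t, p_a):
    """Simplified count: one term divided by the sum over all source positions."""
    m, n = len(e), len(f) - 1
    num = t[(e[i - 1], f[j])] * p_a[(j, i, m, n)]
    den = sum(t[(e[i - 1], f[k])] * p_a[(k, i, m, n)] for k in range(n + 1))
    return num / den

def count_brute_force(j, i, e, f, t, p_a):
    """Definition: sum of p(a | e, f) over alignments with a(i) = j."""
    m, n = len(e), len(f) - 1
    num = den = 0.0
    for a in itertools.product(range(n + 1), repeat=m):
        w = 1.0
        for pos in range(1, m + 1):
            w *= t[(e[pos - 1], f[a[pos - 1]])] * p_a[(a[pos - 1], pos, m, n)]
        den += w
        if a[i - 1] == j:
            num += w
    return num / den

# Toy parameters (made up for illustration only); f[0] is NULL.
f = ["NULL", "casa", "bella"]
e = ["beautiful", "house"]
t = {
    ("beautiful", "NULL"): 0.1, ("beautiful", "casa"): 0.2, ("beautiful", "bella"): 0.7,
    ("house", "NULL"): 0.1, ("house", "casa"): 0.8, ("house", "bella"): 0.1,
}
p_a = {(j, i, 2, 2): p
       for i, row in {1: [0.1, 0.3, 0.6], 2: [0.1, 0.6, 0.3]}.items()
       for j, p in enumerate(row)}

agree = all(
    math.isclose(count_closed_form(j, i, e, f, t, p_a),
                 count_brute_force(j, i, e, f, t, p_a))
    for i in (1, 2) for j in (0, 1, 2)
)
print(agree)
```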
14 IBM Model 2
One can now design an algorithm for Expectation Maximization as in the case of Model 1.
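One possible shape of such an EM algorithm is sketched below. It is a minimal illustration, not the lecture's own implementation: the uniform initialisation, the fixed number of iterations, the toy corpus, and the name `train_model2` are all assumptions made for the example.

```python
from collections import defaultdict

def train_model2(corpus, iterations=10):
    """corpus: list of (f_words, e_words) pairs, f = source, e = target.
    Returns t[(e_word, f_word)] and p_a[(j, i, m, n)]."""
    e_vocab = {w for _, e in corpus for w in e}
    t = defaultdict(lambda: 1.0 / len(e_vocab))       # uniform start
    p_a = {}
    for f, e in corpus:
        m, n = len(e), len(f)
        for i in range(1, m + 1):
            for j in range(n + 1):                    # j = 0 is NULL
                p_a[(j, i, m, n)] = 1.0 / (n + 1)

    for _ in range(iterations):
        count_t, total_t = defaultdict(float), defaultdict(float)
        count_a, total_a = defaultdict(float), defaultdict(float)
        # E-step: collect expected counts using the simplified count formula.
        for f, e in corpus:
            src = ["NULL"] + list(f)
            m, n = len(e), len(f)
            for i in range(1, m + 1):
                ew = e[i - 1]
                weights = [t[(ew, src[j])] * p_a[(j, i, m, n)]
                           for j in range(n + 1)]
                z = sum(weights)
                for j in range(n + 1):
                    post = weights[j] / z
                    count_t[(ew, src[j])] += post
                    total_t[src[j]] += post
                    count_a[(j, i, m, n)] += post
                    total_a[(i, m, n)] += post
        # M-step: renormalise the expected counts.
        t = defaultdict(float, {(ew, fw): c / total_t[fw]
                                for (ew, fw), c in count_t.items()})
        p_a = {(j, i, m, n): c / total_a[(i, m, n)]
               for (j, i, m, n), c in count_a.items()}
    return t, p_a

# Tiny invented corpus, for illustration only.
corpus = [
    ("la casa".split(), "the house".split()),
    ("la".split(), "the".split()),
    ("casa bella".split(), "beautiful house".split()),
]
t, p_a = train_model2(corpus, iterations=10)
print(round(t[("house", "casa")], 3), round(t[("the", "casa")], 3))
```

In practice Model 2 is usually initialised from a trained Model 1 rather than from uniform values; the uniform start here just keeps the sketch short.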
15 IBM Model 3-5
16 Intermodel Interlude
Models 1 and 2 have been created on the basis of the following generalized principle:
\[ p(e, a \mid f) = p(m \mid f) \prod_{i=1}^{m} p\big(a(i) \mid a(1) \ldots a(i-1),\, e_1 \ldots e_{i-1},\, m,\, f\big) \cdot p\big(e_i \mid a(1) \ldots a(i),\, e_1 \ldots e_{i-1},\, m,\, f\big) \]
Note that each $a(i)$ takes a value between 0 and $n$.
It works on the following:
17 Intermodel Interlude
It represents the joint likelihood of e and a as a product of conditional probabilities. Each product corresponds to a generative process for developing e and a from f:
- Choose the length of the translation e.
- Decide which position in f corresponds to $e_1$ and what the identity of $e_1$ is.
- Do the same for positions 2 to m.
18 IBM Model 3-5
Here we consider the fertility of a word in conjunction with the word model. How many e-words a single f-word will produce is NOT deterministic.
E.g. Nonostante (It) >> despite, even though, in spite of (En)
Consequently, corresponding to each word of f we get a random variable $\phi_f$ which gives the fertility of f, including 0. In all of Models 3-5 fertility is explicitly modeled.
19 IBM Model 3-5
Example: "Dovete andare il giorno dopo" (It) may have many possible English translations:
- You must go next day
- You have to go the following day
- You ought to be there on the following day
- You will have to go there on the following day
- I ask you to go there on the following day
Can we now get the alignment? Look at the fertility of the source words.
20 IBM Model 3-5
Fertility:
1) We can assume that the fertility of each word is governed by a probability distribution $p(\phi \mid f)$.
2) It deals explicitly with the dropping of input words by putting $\phi = 0$.
3) Similarly, we can control words in the TL sentence that have NO equivalent in f, calling them NULL words.
Thus Models 3-5 are generative processes: given an f-string, we first decide the fertility of each word and a list of e-words to connect to it. This list is called a Tablet.
21 IBM Model 3-5
Definitions:
Tablet: given an f word, the list of words that may connect to it.
Tableau: a collection of tablets.
Notation:
- $T$: tableau for f.
- $T_j$: tablet for the $j$-th f-word.
- $T_{jk}$: $k$-th e-word in the $j$-th tablet $T_j$.
22 IBM Model 3-5
Example: Come ti chiami (It)

Tableau:
Come                 ti                    chiami
Tablet 1 (T_1)       Tablet 2 (T_2)        Tablet 3 (T_3)
T_11 = Like          T_21 = you            T_31 = call
T_12 = What          T_22 = yourself       T_32 = Address
T_13 = As            T_23 = thyself
T_14 = How
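23 IBM Model 3-5
The generation is as per the following formula:
\[ p(\tau, \pi \mid f) = \prod_{j=1}^{n} p(\phi_j \mid \phi_1^{j-1}, f) \cdot p(\phi_0 \mid \phi_1^{n}, f) \cdot \prod_{j=0}^{n} \prod_{k=1}^{\phi_j} p(\tau_{jk} \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{n}, f) \cdot \prod_{j=1}^{n} \prod_{k=1}^{\phi_j} p(\pi_{jk} \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{n}, \phi_0^{n}, f) \cdot \prod_{k=1}^{\phi_0} p(\pi_{0k} \mid \pi_{01}^{k-1}, \pi_1^{n}, \tau_0^{n}, \phi_0^{n}, f) \]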
24 IBM Model 3-5
After choosing the tableau, the words are permuted to generate e. This permutation is a random variable $\pi$. The position in e of the $k$-th word of the $j$-th tablet is called $\pi_{jk}$.
In these models the generative process is expressed as a joint likelihood for a tableau $\tau$ and a permutation $\pi$ in the following way:
25 IBM Model 3-5
First Step: Compute
\[ \prod_{j=1}^{n} p(\phi_j \mid \phi_1^{j-1}, f) \cdot p(\phi_0 \mid \phi_1^{n}, f) \]
- Determine $\phi_j$, the number of tokens that $f_j$ will produce.
- This will depend upon the number of words produced by $f_1 \ldots f_{j-1}$.
- Determine $\phi_0$, the number of words generated out of NULL.
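26 IBM Model 3-5
Second Step: Compute
\[ \prod_{j=0}^{n} \prod_{k=1}^{\phi_j} p(\tau_{jk} \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{n}, f) \]
- Determine $\tau_{jk}$, the $k$-th word produced by $f_j$.
- This depends on all the words produced by $f_1 \ldots f_{j-1}$ and all the words produced so far by $f_j$.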
27 IBM Model 3-5
Third Step: Compute
\[ \prod_{j=1}^{n} \prod_{k=1}^{\phi_j} p(\pi_{jk} \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{n}, \phi_0^{n}, f) \]
- Determine $\pi_{jk}$, the position in e of the $k$-th word produced by $f_j$.
- This depends on the positions of all the words produced so far.
28 IBM Model 3-5
Fourth Step: Compute
\[ \prod_{k=1}^{\phi_0} p(\pi_{0k} \mid \pi_{01}^{k-1}, \pi_1^{n}, \tau_0^{n}, \phi_0^{n}, f) \]
- Determine $\pi_{0k}$, the position in e of the $k$-th word produced by NULL.
- This depends on the positions of all the words produced so far.
Thus the final expression is a product of 4 expressions:
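29 IBM Model 3-5
\[ p(\tau, \pi \mid f) = \prod_{j=1}^{n} p(\phi_j \mid \phi_1^{j-1}, f) \cdot p(\phi_0 \mid \phi_1^{n}, f) \cdot \prod_{j=0}^{n} \prod_{k=1}^{\phi_j} p(\tau_{jk} \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{n}, f) \cdot \prod_{j=1}^{n} \prod_{k=1}^{\phi_j} p(\pi_{jk} \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{n}, \phi_0^{n}, f) \cdot \prod_{k=1}^{\phi_0} p(\pi_{0k} \mid \pi_{01}^{k-1}, \pi_1^{n}, \tau_0^{n}, \phi_0^{n}, f) \]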
30 IBM Model 3-5
This obviously is very difficult to manipulate. Hence concessions are made for the different models. The concessions come in the form of assumptions. Let us first look at IBM Model 3.
31 IBM Model 3
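32 IBM Model 3
Assumptions:
1. For $j$ between 1 and $n$, $p(\phi_j \mid \phi_1^{j-1}, f)$ depends only on $\phi_j$ and $f_j$.
2. For $j$ between 1 and $n$, $p(\tau_{jk} \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{n}, f)$ depends only on $\tau_{jk}$ and $f_j$.
3. For $j$ between 1 and $n$, $p(\pi_{jk} \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{n}, \phi_0^{n}, f)$ depends only on $\pi_{jk}$, $j$, $m$ and $n$.
This reduces the number of variables.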
33 IBM Model 3
Thus the parameters for Model 3 are:
1. A set of fertility probabilities $\eta(\phi \mid f_j)$, which is equal to $p(\phi_j \mid \phi_1^{j-1}, f)$.
2. A set of translation probabilities $t(e \mid f_j)$, which is equal to $p(\tau_{jk} = e \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{n}, f)$.
3. A set of distortion probabilities $d(i \mid j, m, n)$, which is equal to $p(\pi_{jk} = i \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{n}, \phi_0^{n}, f)$.
34 IBM Model 3
The distortion and fertility probabilities for $f_0$ (NULL) are treated in a different way. These are meant for handling the words in the TL sentence which cannot be accounted for. Obviously they are plugged in once all the words are placed for $j = 1 \ldots n$. So
\[ \phi_0 = m - \phi_1 - \phi_2 - \cdots - \phi_n \]
We have to estimate these probabilities.
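35 IBM Model 3
It is assumed that each of the tableau words can produce at most one NULL word. Assume each tableau word produces a NULL word with probability $p_1$ and does not produce one with probability $p_0$. Hence
\[ p(\phi_0 \mid \phi_1, \ldots, \phi_n) = \binom{\phi_1 + \phi_2 + \cdots + \phi_n}{\phi_0} p_0^{\phi_1 + \cdots + \phi_n - \phi_0}\, p_1^{\phi_0} = \binom{m - \phi_0}{\phi_0} p_0^{m - 2\phi_0}\, p_1^{\phi_0} \]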
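The NULL fertility term is a binomial probability: each of the $m - \phi_0$ words generated by real source words independently triggers one NULL word with probability $p_1$. A minimal sketch of that formula follows; the function name and the value of `p1` are illustrative assumptions.

```python
from math import comb, isclose

def null_fertility_prob(phi0, m, p1):
    """Binomial(m - phi0, phi0) * p0^(m - 2*phi0) * p1^phi0, with p0 = 1 - p1."""
    p0 = 1.0 - p1
    real = m - phi0                # words generated by non-NULL source words
    if phi0 < 0 or phi0 > real:
        return 0.0                 # more NULL words than real words is impossible
    return comb(real, phi0) * p0 ** (real - phi0) * p1 ** phi0

# With 3 real words, the probabilities of 0..3 NULL words sum to one.
total = sum(null_fertility_prob(phi0, 3 + phi0, p1=0.1) for phi0 in range(4))
print(isclose(total, 1.0))
```

Note that `real - phi0` equals $m - 2\phi_0$, which is the exponent of $p_0$ in the second form of the formula.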
36 IBM Model 3
As with Models 1 and 2, an alignment of (e | f) is determined by specifying $a(i)$ for each position of the TL string. The fertilities $\phi_j$, $j = 0 \ldots n$, are functions of the $a(i)$'s: $\phi_j$ is equal to the number of $i$'s such that $a(i) = j$. Hence $P(e \mid f)$ can be obtained by summing over all the alignments:
\[ P(e \mid f) = \sum_{a(1)=0}^{n} \cdots \sum_{a(m)=0}^{n} p(e, a \mid f) \]
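37 IBM Model 3
\[ P(e \mid f) = \sum_{a(1)=0}^{n} \cdots \sum_{a(m)=0}^{n} \binom{m - \phi_0}{\phi_0} p_0^{m - 2\phi_0}\, p_1^{\phi_0} \prod_{j=1}^{n} \phi_j!\, \eta(\phi_j \mid f_j) \prod_{i=1}^{m} t(e_i \mid f_{a(i)})\, d(i \mid a(i), m, n) \]
With the following constraints:
1. $\sum_{e} t(e \mid f) = 1$
2. $\sum_{i} d(i \mid j, m, n) = 1$
3. $\sum_{\phi} \eta(\phi \mid f) = 1$
4. $p_0 + p_1 = 1$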
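For a single fixed alignment, the summand of this formula can be evaluated directly. The sketch below is only illustrative: the parameter tables are invented toy values, the function name is hypothetical, and it assumes that no distortion factor is applied to words aligned to NULL (in line with the separate treatment of $f_0$).

```python
from collections import Counter
from math import comb, factorial, isclose

def model3_joint(e, f, a, t, d, eta, p1):
    """p(e, a | f) for one alignment; f[0] is NULL, a[i-1] = j links e_i to f_j."""
    m, n = len(e), len(f) - 1
    phi = Counter(a)                         # fertilities implied by the alignment
    phi0 = phi[0]
    if 2 * phi0 > m:
        return 0.0
    p0 = 1.0 - p1
    prob = comb(m - phi0, phi0) * p0 ** (m - 2 * phi0) * p1 ** phi0
    for j in range(1, n + 1):                # fertility terms
        prob *= factorial(phi[j]) * eta[(phi[j], f[j])]
    for i in range(1, m + 1):                # translation and distortion terms
        j = a[i - 1]
        prob *= t[(e[i - 1], f[j])]
        if j != 0:                           # assumption: no distortion for NULL
            prob *= d[(i, j, m, n)]
    return prob

# Toy parameters (made up for illustration only).
f = ["NULL", "casa", "bella"]
e = ["beautiful", "house"]
a = [2, 1]                                   # beautiful <- bella, house <- casa
t = {("beautiful", "bella"): 0.7, ("house", "casa"): 0.8}
d = {(1, 2, 2, 2): 0.6, (2, 1, 2, 2): 0.6}
eta = {(1, "casa"): 0.8, (1, "bella"): 0.8}

prob = model3_joint(e, f, a, t, d, eta, p1=0.1)
print(prob)
```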
38 IBM Model 3
Remarks:
1. Here also we have an exponential number of alignments.
2. Exhaustive count collection is too expensive even for moderate-length sentences.
3. Sampling is used from the space of possible alignments.
4. Sampling should be such that the most probable ones are included.
39 IBM Model 3
Remarks:
5. Still, it is much harder than for Model 1.
6. Hence hill-climbing type heuristics are used.
7. Typically they start from the Model 1 solution.
8. From there they go to neighboring alignments, where the distance between two alignments is measured by the number of points at which they differ.
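A hill-climbing search over alignments can be sketched as follows. This is a simplified illustration, not the exact procedure used for Model 3: neighbors are taken to be the alignments that differ from the current one in exactly one point, the scoring function is a stand-in toy table, and all names are hypothetical.

```python
from math import isclose

def distance(a, b):
    """Number of points at which two alignments differ."""
    return sum(x != y for x, y in zip(a, b))

def neighbours(a, n):
    """All alignments at distance 1 from a (one link moved to another j in 0..n)."""
    for i in range(len(a)):
        for j in range(n + 1):
            if j != a[i]:
                yield a[:i] + (j,) + a[i + 1:]

def hill_climb(start, n, score):
    """Move to the best neighbour until no neighbour improves the score."""
    current = tuple(start)
    best = score(current)
    while True:
        candidate = max(neighbours(current, n), key=score)
        if score(candidate) <= best:
            return current, best
        current, best = candidate, score(candidate)

# Stand-in score: product of per-position weights w[i][j] (toy values).
w = [[0.1, 0.2, 0.7],
     [0.1, 0.8, 0.1]]

def toy_score(a):
    result = 1.0
    for i, j in enumerate(a):
        result *= w[i][j]
    return result

best_alignment, best_score = hill_climb((0, 0), n=2, score=toy_score)
print(best_alignment, best_score)
```

In a real system the score would be the Model 3 probability of the alignment, and the alignments visited during the climb would supply the sample over which counts are collected.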
40 IBM Model 4
41 IBM Model 4
Model 3 has been found to be a very powerful one. It takes care of all the major aspects:
- word translation
- reordering
- insertion of words
- dropping of words
- one-to-many translation
But it has one major shortcoming: the formulation of the distortion probabilities $d(i \mid j, m, n)$.
42 IBM Model 4
Model 3 does not take into account the following fact: often a group of words is translated together, and therefore when they move, they move together.
E.g. Riding a bicycle >> in sella a una bicicletta
Coming here riding a bicycle is dangerous >> venire qui in sella a una bicicletta è pericoloso
Trying many sentences with the phrase "riding a bicycle", one can notice that the phrase "in sella a una bicicletta" will remain together. But Model 3 considers the distortion probabilities in isolation.
43 IBM Model 4
Model 4 introduces the concept of Relative Distortion. It assumes that the placement of the translation of an input word is based on the placement of the preceding input word. It is, however, difficult to conceptualize, as words are being added, dropped, and converted from one to many. Model 4 is based around the concept of a cept.
44 IBM Model 4
Definition: each input word that is aligned to at least one output word is called a cept. Typically represented by $[j]$ or $\pi_j$.
Definition: the ceiling of the average of the output positions of the words of a cept is called the center of the cept. We shall denote it as $C_j$.
For each output word the Relative Distortion is defined with the help of cepts. Let us first see an example:
45 IBM Model 4
Consider the following Bengali-English pair:
φ lambaa moto chhele cycle-e chore aaschhe
A tall boy is coming riding a bicycle
Note: the leading article "A" does not align with anything! "moto", an ornamentation, is not a cept.
46 IBM Model 4

cept                     π1       π2         π3          π4       π5
Foreign word position    1        3          4           5        6
Foreign word             lambaa   chheleta   cycle-e     chore    aaschhe
English word             tall     boy        a bicycle   riding   is coming
English word position    2        3          7, 8        6        4, 5
Center of cept           2        3          8           6        5
47 IBM Model 4
Relative Distortion:
- Words generated by φ (NULL) are uniformly distributed.
- The position of the first word of a cept is defined w.r.t. the centre of the previous cept: $d_1(i - C_{j-1})$, where $i$ is the English position of the word and $C_{j-1}$ is the centre of the preceding cept.
Consider for example the word "riding": it is generated by cept 4 (π4), and its English position is 6. The centre of the preceding cept is 8. Thus there is a distortion of -2. This shows a forward movement of the word. Normally the distortion will be +1.
48 IBM Model 4
Relative Distortion:
- For subsequent words of a cept, the position is defined w.r.t. the position of the previous word of the same cept: $d_{>1}(i - \pi_{j,k-1})$, where $\pi_{j,k-1}$ is the position of the $(k-1)$-th word of the $j$-th cept.
For example, in "a bicycle" and "is coming" the distortion probability of the second word is calculated in relation to the previous word.
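The centers and relative distortions for the Bengali-English example can be computed mechanically. In the sketch below, the English positions are read off the sentence "A tall boy is coming riding a bicycle"; treating the center before the first cept as 0 is an assumption made for the illustration, and the function names are hypothetical.

```python
from math import ceil

# cept index -> English positions of its words, in the order they are generated
cepts = {
    1: [2],       # lambaa   -> tall
    2: [3],       # chheleta -> boy
    3: [7, 8],    # cycle-e  -> a bicycle
    4: [6],       # chore    -> riding
    5: [4, 5],    # aaschhe  -> is coming
}

def center(positions):
    """Ceiling of the average output position of a cept's words."""
    return ceil(sum(positions) / len(positions))

def relative_distortions(cepts):
    """Map (cept, word index k) -> relative distortion value."""
    result = {}
    previous_center = 0                       # assumed center before the first cept
    for j in sorted(cepts):
        positions = cepts[j]
        # First word: relative to the center of the preceding cept.
        result[(j, 1)] = positions[0] - previous_center
        # Subsequent words: relative to the previous word of the same cept.
        for k in range(1, len(positions)):
            result[(j, k + 1)] = positions[k] - positions[k - 1]
        previous_center = center(positions)
    return result

centers = {j: center(p) for j, p in cepts.items()}
dist = relative_distortions(cepts)
print(centers)
print(dist[(4, 1)])    # first word of cept 4, "riding"
```

The value for "riding" is 6 - 8 = -2, matching the worked example, and the second words of "a bicycle" and "is coming" both get +1.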
49 IBM Model 5
50 IBM Model 5
The key term for Model 5 is Deficiency. Models 3 & 4 do not take care of whether two words are being put in the same place. Thus they put positive probabilities on some impossible translations.
In Model 5 the distortion probabilities are calculated by considering cepts (as before) plus vacancies. It also takes care of the problem of multiple tableaus. This makes it a better word-based model.
51 IBM Model 5
Model 5 keeps track of the vacancies in the m-word long e sentence. Let
- $v_{\max}$ be the maximum number of vacancies possible,
- $v_i$ be the number of vacancies available in the sentence e in the positions $[1, i]$.
Hence the distortion probabilities are functions of 3 quantities: $d_1(v_i \mid v_{C_{j-1}}, v_{\max})$
Similarly, the relative distortion of the subsequent words in the cept is: $d_{>1}(v_i - v_{\pi_{j,k-1}} \mid v_{\max})$
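Counting vacancies is straightforward once the already-filled output positions are known. The sketch below is a minimal illustration; the function name, the example state, and the reading of the maximum as "all positions still vacant in the sentence" are assumptions made for the example.

```python
def vacancies_up_to(i, m, filled):
    """Number of still-vacant output positions in [1, i] of an m-word sentence."""
    return sum(1 for pos in range(1, min(i, m) + 1) if pos not in filled)

# Example state: an 8-word output in which positions 2 and 3 are already taken.
m = 8
filled = {2, 3}

v_candidate = vacancies_up_to(7, m, filled)    # vacancies up to position 7
v_prev_center = vacancies_up_to(3, m, filled)  # vacancies up to a previous center at 3
v_max = vacancies_up_to(m, m, filled)          # assumed: all vacancies in the sentence
print(v_candidate, v_prev_center, v_max)
```

Because only vacant positions are counted, a word can never be assigned to a slot that is already occupied, which is how the deficiency of Models 3 and 4 is avoided.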
52 Conclusion
Still we go by word-based translation. Can we do better? Because looking at translations word by word is not the best thing. E.g.:
- The train is in.
- The train is in motion.
- The train is in station.
- The train is in danger.
Proper translation demands that we see the word along with its context. This gives us the concept of Phrase-based Translation.
53 Thank You
More informationA sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More information11 Hidden Markov Models
Hidde Markov Models Hidde Markov Models are a popular machie learig approach i bioiformatics. Machie learig algorithms are preseted with traiig data, which are used to derive importat isights about the
More informationIntroduction to Computational Biology Homework 2 Solution
Itroductio to Computatioal Biology Homework 2 Solutio Problem 1: Cocave gap pealty fuctio Let γ be a gap pealty fuctio defied over o-egative itegers. The fuctio γ is called sub-additive iff it satisfies
More informationLecture 10: Mathematical Preliminaries
Lecture : Mathematical Prelimiaries Obective: Reviewig mathematical cocepts ad tools that are frequetly used i the aalysis of algorithms. Lecture # Slide # I this
More informationIntro to Learning Theory
Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationSome examples of vector spaces
Roberto s Notes o Liear Algebra Chapter 11: Vector spaces Sectio 2 Some examples of vector spaces What you eed to kow already: The te axioms eeded to idetify a vector space. What you ca lear here: Some
More informationMassachusetts Institute of Technology
6.0/6.3: Probabilistic Systems Aalysis (Fall 00) Problem Set 8: Solutios. (a) We cosider a Markov chai with states 0,,, 3,, 5, where state i idicates that there are i shoes available at the frot door i
More informationChapter 9: Numerical Differentiation
178 Chapter 9: Numerical Differetiatio Numerical Differetiatio Formulatio of equatios for physical problems ofte ivolve derivatives (rate-of-chage quatities, such as velocity ad acceleratio). Numerical
More informationNumber of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day
LECTURE # 8 Mea Deviatio, Stadard Deviatio ad Variace & Coefficiet of variatio Mea Deviatio Stadard Deviatio ad Variace Coefficiet of variatio First, we will discuss it for the case of raw data, ad the
More information7.1 Convergence of sequences of random variables
Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationOn forward improvement iteration for stopping problems
O forward improvemet iteratio for stoppig problems Mathematical Istitute, Uiversity of Kiel, Ludewig-Mey-Str. 4, D-24098 Kiel, Germay irle@math.ui-iel.de Albrecht Irle Abstract. We cosider the optimal
More informationTrue Nature of Potential Energy of a Hydrogen Atom
True Nature of Potetial Eergy of a Hydroge Atom Koshu Suto Key words: Bohr Radius, Potetial Eergy, Rest Mass Eergy, Classical Electro Radius PACS codes: 365Sq, 365-w, 33+p Abstract I cosiderig the potetial
More informationAlgorithms for Clustering
CR2: Statistical Learig & Applicatios Algorithms for Clusterig Lecturer: J. Salmo Scribe: A. Alcolei Settig: give a data set X R p where is the umber of observatio ad p is the umber of features, we wat
More informationMath 113, Calculus II Winter 2007 Final Exam Solutions
Math, Calculus II Witer 7 Fial Exam Solutios (5 poits) Use the limit defiitio of the defiite itegral ad the sum formulas to compute x x + dx The check your aswer usig the Evaluatio Theorem Solutio: I this
More informationANALYSIS OF EXPERIMENTAL ERRORS
ANALYSIS OF EXPERIMENTAL ERRORS All physical measuremets ecoutered i the verificatio of physics theories ad cocepts are subject to ucertaities that deped o the measurig istrumets used ad the coditios uder
More informationOPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES
OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES Peter M. Maurer Why Hashig is θ(). As i biary search, hashig assumes that keys are stored i a array which is idexed by a iteger. However, hashig attempts to bypass
More informationLecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting
Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would
More informationRoberto s Notes on Series Chapter 2: Convergence tests Section 7. Alternating series
Roberto s Notes o Series Chapter 2: Covergece tests Sectio 7 Alteratig series What you eed to kow already: All basic covergece tests for evetually positive series. What you ca lear here: A test for series
More information1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable
More informationRecurrence Relations
Recurrece Relatios Aalysis of recursive algorithms, such as: it factorial (it ) { if (==0) retur ; else retur ( * factorial(-)); } Let t be the umber of multiplicatios eeded to calculate factorial(). The
More information