Statistical Machine Translation

1 Statistical Machine Translation LECTURE 5 HIGHER IBM MODELS APRIL 6 200

2 Brief Outline
- IBM Model 2
- IBM Model 3
- IBM Model 4
- IBM Model 5
Ref: The Mathematics of Statistical Machine Translation: Parameter Estimation. Peter F. Brown et al. Computational Linguistics, Vol. 19, No. 2.

3 IBM Model 2

4 IBM Model 2
Model 1 takes no notice of where the words appear in the translation. E.g., for
questa casa è bella naturalmente
the two outputs
Of course this house is beautiful
This house beautiful is of course
are equally probable under Model 1. Model 2 takes care of this.

5 Alignment Model: IBM Model 2
The assumption is that the translation of $f_j$ to $e_i$ depends upon the alignment probability $p_a(j \mid i, m, l)$.
Q1. What does it mean?
Q2. How to compute it?

6 IBM Model 2
Thus translation is a two-step process:
- Lexical Translation step: modeled by $t(e_p \mid f_q)$
- Alignment step: modeled by $p_a(j \mid i, m, l)$
E.g.
Questa casa è bella naturalmente
(Translation step) this house is beautiful naturally
(Alignment step) Naturally this house is beautiful

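7 IBM Model 2
Under this model we have:
\[ p(e, a \mid f) = c \prod_{i=1}^{m} t(e_i \mid f_{a(i)})\, p_a(a(i) \mid i, m, l) \]
Hence:
\[ p(e \mid f) = \sum_{a} p(e, a \mid f) = c \sum_{a(1)=0}^{l} \cdots \sum_{a(m)=0}^{l} \prod_{i=1}^{m} t(e_i \mid f_{a(i)})\, p_a(a(i) \mid i, m, l) = c \prod_{i=1}^{m} \sum_{j=0}^{l} t(e_i \mid f_j)\, p_a(j \mid i, m, l) \]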
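The step from the sum over all alignments to the product of per-position sums can be checked numerically. The following is a minimal sketch, not part of the lecture: the function names, the toy sentence pair and all parameter values are made up for illustration, the constant c is taken as 1, and the tables are not normalized because the identity does not require it.

```python
from itertools import product
from math import isclose, prod

def joint(e, f, a, t, pa):
    """p(e, a | f) / c for one alignment; a[i-1] is the f-position (0 = NULL) of e_i."""
    m, l = len(e), len(f) - 1          # f[0] is the NULL word
    return prod(t[(e[i], f[a[i]])] * pa[(a[i], i + 1, m, l)] for i in range(m))

def likelihood_brute(e, f, t, pa):
    """Sum the joint over all (l+1)^m alignments."""
    m, l = len(e), len(f) - 1
    return sum(joint(e, f, a, t, pa) for a in product(range(l + 1), repeat=m))

def likelihood_fast(e, f, t, pa):
    """Product over target positions of a sum over source positions."""
    m, l = len(e), len(f) - 1
    return prod(
        sum(t[(e[i], f[j])] * pa[(j, i + 1, m, l)] for j in range(l + 1))
        for i in range(m)
    )

# Hypothetical toy parameters, chosen only to exercise the code.
f = ["NULL", "casa", "bella"]
e = ["beautiful", "house"]
t = {(ew, fw): 0.5 for ew in e for fw in f}
t[("house", "casa")] = 0.8
t[("beautiful", "bella")] = 0.7
pa = {(j, i, 2, 2): w for i in (1, 2) for j, w in zip(range(3), (0.2, 0.3, 0.5))}

brute = likelihood_brute(e, f, t, pa)
fast = likelihood_fast(e, f, t, pa)
```

The brute-force version enumerates (l+1)^m alignments, while the factorized version needs only m(l+1) terms, which is what makes exact EM feasible for Model 2.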

8 IBM Model 2
We need to maximize this $p(e \mid f)$ subject to the following constraints:
1. $\sum_{e} t(e \mid f) = 1$ for every source word $f$
2. $\sum_{j=0}^{l} p_a(j \mid i, m, l) = 1$ for $i = 1, \ldots, m$
Thus we have a larger set of Lagrange multipliers.

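9 IBM Model 2
The auxiliary function becomes:
\[ h(t, p_a, \lambda, \mu) = c \sum_{a(1)=0}^{l} \cdots \sum_{a(m)=0}^{l} \prod_{i=1}^{m} t(e_i \mid f_{a(i)})\, p_a(a(i) \mid i, m, l) - \sum_{f} \lambda_f \Big( \sum_{e} t(e \mid f) - 1 \Big) - \sum_{i} \mu_{iml} \Big( \sum_{j} p_a(j \mid i, m, l) - 1 \Big) \]
To find the extremum we need to differentiate.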

10 IBM Model 2
We now need a new count, $\mathrm{count}(j \mid i, m, l; f, e)$: the expected number of times the word in position i of the TL string e is connected to the word in position j of the SL string f, given that their lengths are m and l respectively.
\[ \mathrm{count}(j \mid i, m, l; f, e) = \sum_{a} p(a \mid e, f)\, \delta(j, a(i)) \]

11 IBM Model 2
So here we look at groups of sentence pairs that satisfy the m and l criterion. Then we look at the alignment probabilities.
Example: m = 4, l = 3 (an entry i>j means TL position i is aligned to SL position j; 0 is NULL)
aami bari jachchhi | I am going home | 1>1 2>0 3>3 4>2
tumi ki khachchho | What are you eating | 1>2 2>0 3>1 4>3
tomar naam ki | What is your name | 1>3 2>0 3>1 4>2
kaal kothay chhile | Where were you yesterday | 1>2 2>3 3>0 4>1
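If the alignments of such a group were fully observed, the alignment probabilities could be estimated by relative frequency. The sketch below does this for the four sentence pairs; the alignment vectors are an assumed reading of the example (0 = NULL) and the helper name is hypothetical. In actual training the alignments are hidden, so EM replaces these hard counts with expected counts.

```python
from collections import Counter

# a(i) for i = 1..4, one list per sentence pair (assumed reading of the example).
alignments = [
    [1, 0, 3, 2],  # I am going home          / aami bari jachchhi
    [2, 0, 1, 3],  # What are you eating      / tumi ki khachchho
    [3, 0, 1, 2],  # What is your name        / tomar naam ki
    [2, 3, 0, 1],  # Where were you yesterday / kaal kothay chhile
]

def position_frequencies(alignments):
    """Relative frequency of source position j for each target position i, keyed (j, i)."""
    n = len(alignments)
    counts = Counter(
        (a_i, i) for a in alignments for i, a_i in enumerate(a, start=1)
    )
    return {key: c / n for key, c in counts.items()}

p_a = position_frequencies(alignments)
```

For instance, `p_a[(0, 2)]` is the fraction of pairs in which the second English word is aligned to NULL.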

12 IBM Model 2
Then by analogy with Model 1 we get:
\[ p_a(j \mid i, m, l) = \mu_{iml}^{-1}\, \mathrm{count}(j \mid i, m, l; f, e) \]
for a single translation, and
\[ p_a(j \mid i, m, l) = \mu_{iml}^{-1} \sum_{s=1}^{S} \mathrm{count}(j \mid i, m, l; f^{(s)}, e^{(s)}) \]
for a set of translations.

13 IBM Model 2
Although the expression for count looks complicated, we can make it simple, as in the case of Model 1:
\[ \mathrm{count}(j \mid i, m, l; f, e) = \frac{t(e_i \mid f_j)\, p_a(j \mid i, m, l)}{t(e_i \mid f_0)\, p_a(0 \mid i, m, l) + \cdots + t(e_i \mid f_l)\, p_a(l \mid i, m, l)} \]

14 IBM Model 2
One can now design an algorithm for Expectation Maximization, as in the case of Model 1.
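One possible shape of such an EM loop is sketched below. It is an illustration under stated assumptions rather than the lecture's algorithm: the function name `em_model2`, the uniform initialization, the fixed number of iterations and the toy corpus are all choices made here. The E-step uses the simplified count formula (a per-position posterior), and the M-step renormalizes the collected counts.

```python
from collections import defaultdict

def em_model2(pairs, iterations=5):
    """pairs: list of (e_words, f_words); a NULL word is prepended to each f."""
    pairs = [(e, ["NULL"] + f) for e, f in pairs]
    e_vocab = {w for e, _ in pairs for w in e}
    t = defaultdict(lambda: 1.0 / len(e_vocab))   # t[(e, f)], uniform start
    pa = {}                                        # pa[(j, i, m, l)], uniform start
    for e, f in pairs:
        m, l = len(e), len(f) - 1
        for i in range(1, m + 1):
            for j in range(l + 1):
                pa[(j, i, m, l)] = 1.0 / (l + 1)
    for _ in range(iterations):
        count_t, total_t = defaultdict(float), defaultdict(float)
        count_a, total_a = defaultdict(float), defaultdict(float)
        # E-step: posterior that e_i is connected to f_j, accumulated as counts.
        for e, f in pairs:
            m, l = len(e), len(f) - 1
            for i in range(1, m + 1):
                ew = e[i - 1]
                scores = [t[(ew, f[j])] * pa[(j, i, m, l)] for j in range(l + 1)]
                z = sum(scores)
                for j in range(l + 1):
                    post = scores[j] / z
                    count_t[(ew, f[j])] += post
                    total_t[f[j]] += post
                    count_a[(j, i, m, l)] += post
                    total_a[(i, m, l)] += post
        # M-step: renormalize so that both constraints hold.
        for (ew, fw), c in count_t.items():
            t[(ew, fw)] = c / total_t[fw]
        for (j, i, m, l), c in count_a.items():
            pa[(j, i, m, l)] = c / total_a[(i, m, l)]
    return t, pa

# Hypothetical two-sentence corpus, for illustration only.
corpus = [
    (["this", "house"], ["questa", "casa"]),
    (["beautiful", "house"], ["casa", "bella"]),
]
t, pa = em_model2(corpus)
```

In practice the translation table is usually initialized from a trained Model 1 rather than uniformly; that refinement is left out to keep the sketch short.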

15 IBM Model 3-5

16 Intermodel Interlude
Models 1 and 2 have been created on the basis of the following generalized principle:
\[ p(e, a \mid f) = p(m \mid f) \prod_{i=1}^{m} p\big(a(i) \mid a(1) \ldots a(i-1),\, e_1 \ldots e_{i-1},\, m,\, f\big)\; p\big(e_i \mid a(1) \ldots a(i),\, e_1 \ldots e_{i-1},\, m,\, f\big) \]
Note that each a(i) takes a value between 0 and l.

17 Intermodel Interlude
It represents the joint likelihood of e and a as a product of conditional probabilities. Each product corresponds to a generative process for developing e and a from f.
- Choose the length of the translation e.
- Decide which position in f corresponds to $e_1$ and what the identity of $e_1$ is.
- Do the same for positions 2 to m.

18 IBM Model 3-5
Here we consider the fertility of a word in conjunction with the word model. How many e-words a single f-word will produce is NOT deterministic.
E.g. Nonostante (It) >> despite, even though, in spite of (En)
Consequently, corresponding to each word f of the source string we get a random variable $\phi_f$ which gives the fertility of f, including 0. In all of Models 3-5 fertility is explicitly modeled.

19 IBM Model 3-5
Example: Dovete andare il giorno dopo (It) may have many possible English translations:
- You must go next day
- You have to go the following day
- You ought to be there on the following day
- You will have to go there on the following day
- I ask you to go there on the following day
Can we now get the alignment? Look at the fertility of the source words.

20 IBM Model 3-5
Fertility:
1) We can assume that the fertility of each word is governed by a probability distribution $p(n \mid f)$.
2) It deals explicitly with dropping of input words by putting n = 0.
3) Similarly, we can control words in the TL sentence that have NO equivalent in f, calling them NULL words.
Thus Models 3-5 are a generative process: given an f-string, we first decide the fertility of each word and a list of e-words to connect to it. This list is called a Tablet.

21 IBM Model 3-5
Definitions:
Tablet: Given an f-word, the list of words that may connect to it.
Tableau: A collection of Tablets.
Notation:
$T$: the tableau for f.
$T_j$: the tablet for the j-th f-word.
$T_{jk}$: the k-th e-word in the j-th tablet $T_j$.

22 IBM Model 3-5
Example: Come ti chiami (It)
Tableau:
Come, Tablet 1 ($T_1$): $T_{11}$ = Like, $T_{12}$ = What, $T_{13}$ = As, $T_{14}$ = How
ti, Tablet 2 ($T_2$): $T_{21}$ = you, $T_{22}$ = yourself, $T_{23}$ = thyself
chiami, Tablet 3 ($T_3$): $T_{31}$ = call, $T_{32}$ = Address

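23 IBM Model 3-5
The generation is as per the following formula:
\[ p(\tau, \pi \mid f) = \prod_{j=1}^{l} p(\phi_j \mid \phi_1^{j-1}, f) \cdot p(\phi_0 \mid \phi_1^{l}, f) \cdot \prod_{j=0}^{l} \prod_{k=1}^{\phi_j} p(\tau_{jk} \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{l}, f) \cdot \prod_{j=1}^{l} \prod_{k=1}^{\phi_j} p(\pi_{jk} \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{l}, \phi_0^{l}, f) \cdot \prod_{k=1}^{\phi_0} p(\pi_{0k} \mid \pi_{01}^{k-1}, \pi_1^{l}, \tau_0^{l}, \phi_0^{l}, f) \]
Here $x_a^b$ abbreviates the sequence $x_a, \ldots, x_b$.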

24 IBM Model 3-5
After choosing the Tableau, the words are permuted to generate e. This permutation is a random variable $\pi$. The position in e of the k-th word of the j-th Tablet is called $\pi_{jk}$.
In these models the generative process is expressed as a joint likelihood for a tableau $\tau$ and a permutation $\pi$ in the following way:

25 IBM Model 3-5
First Step: Compute
\[ \prod_{j=1}^{l} p(\phi_j \mid \phi_1^{j-1}, f) \cdot p(\phi_0 \mid \phi_1^{l}, f) \]
- Determine $\phi_j$, the number of tokens that $f_j$ will produce.
- This will depend upon the no. of words produced by $f_1 \ldots f_{j-1}$.
- Determine $\phi_0$, the number of words generated out of NULL.

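26 IBM Model 3-5
Second Step: Compute
\[ \prod_{j=0}^{l} \prod_{k=1}^{\phi_j} p(\tau_{jk} \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{l}, f) \]
- Determine $\tau_{jk}$, the k-th word produced by $f_j$.
- This depends on all the words produced by $f_1 \ldots f_{j-1}$ and all the words produced so far by $f_j$.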

27 IBM Model 3-5
Third Step: Compute
\[ \prod_{j=1}^{l} \prod_{k=1}^{\phi_j} p(\pi_{jk} \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{l}, \phi_0^{l}, f) \]
- Determine $\pi_{jk}$, the position in e of the k-th word produced by $f_j$.
- This depends on the positions of all the words produced so far.

28 IBM Model 3-5
Fourth Step: Compute
\[ \prod_{k=1}^{\phi_0} p(\pi_{0k} \mid \pi_{01}^{k-1}, \pi_1^{l}, \tau_0^{l}, \phi_0^{l}, f) \]
- Determine $\pi_{0k}$, the position in e of the k-th word produced by NULL.
- This depends on the positions of all the words produced so far.
Thus the final expression is a product of 4 expressions:

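29 IBM Model 3-5
\[ p(\tau, \pi \mid f) = \prod_{j=1}^{l} p(\phi_j \mid \phi_1^{j-1}, f) \cdot p(\phi_0 \mid \phi_1^{l}, f) \cdot \prod_{j=0}^{l} \prod_{k=1}^{\phi_j} p(\tau_{jk} \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{l}, f) \cdot \prod_{j=1}^{l} \prod_{k=1}^{\phi_j} p(\pi_{jk} \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{l}, \phi_0^{l}, f) \cdot \prod_{k=1}^{\phi_0} p(\pi_{0k} \mid \pi_{01}^{k-1}, \pi_1^{l}, \tau_0^{l}, \phi_0^{l}, f) \]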

30 IBM Model 3-5
This obviously is very difficult to manipulate. Hence concessions are made for the different models. The concessions come in the form of assumptions. Let us first look at IBM Model 3.

31 IBM Model 3

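32 IBM Model 3
Assumptions:
1. For j between 1 and l, $p(\phi_j \mid \phi_1^{j-1}, f)$ depends only on $\phi_j$ and $f_j$.
2. For j between 0 and l, $p(\tau_{jk} \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{l}, f)$ depends only on $\tau_{jk}$ and $f_j$.
3. For j between 1 and l, $p(\pi_{jk} \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{l}, \phi_0^{l}, f)$ depends only on $\pi_{jk}$, j, m and l.
This reduces the number of variables.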

33 IBM Model 3
Thus the parameters for Model 3 are:
1. A set of fertility probabilities $\eta(\phi \mid f_j)$, equal to $p(\phi \mid \phi_1^{j-1}, f)$.
2. A set of translation probabilities $t(e \mid f_j)$, equal to $p(\tau_{jk} = e \mid \tau_{j1}^{k-1}, \tau_0^{j-1}, \phi_0^{l}, f)$.
3. A set of distortion probabilities $d(i \mid j, m, l)$, equal to $p(\pi_{jk} = i \mid \pi_{j1}^{k-1}, \pi_1^{j-1}, \tau_0^{l}, \phi_0^{l}, f)$.

34 IBM Model 3
The distortion and fertility probabilities for $f_0$ (NULL) are treated in a different way. These are meant for handling the words in the TL sentence which cannot be accounted for. Obviously they are plugged in once all the words are placed for $j = 1, \ldots, l$. So
\[ \phi_0 = m - \phi_1 - \phi_2 - \cdots - \phi_l \]
We have to estimate these probabilities.

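35 IBM Model 3
It is assumed that each of the Tableau words can produce at most one NULL word. Assume each Tableau word produces a NULL word with prob. $p_1$ and does not produce one with prob. $p_0$. Hence
\[ p(\phi_0 \mid \phi_1^{l}, f) = \binom{\phi_1 + \phi_2 + \cdots + \phi_l}{\phi_0}\, p_0^{\phi_1 + \phi_2 + \cdots + \phi_l - \phi_0}\, p_1^{\phi_0} = \binom{m - \phi_0}{\phi_0}\, p_0^{m - 2\phi_0}\, p_1^{\phi_0} \]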
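The NULL-fertility term is a binomial probability and is easy to evaluate directly. A minimal sketch follows, assuming $p_0 = 1 - p_1$; the function name and the example values are illustrative only.

```python
from math import comb, isclose

def null_fertility_prob(phi0, m, p1):
    """C(m - phi0, phi0) * p0^(m - 2*phi0) * p1^phi0, with p0 = 1 - p1."""
    if phi0 < 0 or 2 * phi0 > m:
        return 0.0          # more NULL words than generating words is impossible
    p0 = 1.0 - p1
    return comb(m - phi0, phi0) * p0 ** (m - 2 * phi0) * p1 ** phi0

# A 4-word target sentence in which exactly one word comes from NULL.
example = null_fertility_prob(phi0=1, m=4, p1=0.1)   # 3 * 0.9**2 * 0.1
```

The guard reflects the assumption on the slide: since each of the $m - \phi_0$ real words spawns at most one NULL word, $\phi_0$ cannot exceed $m - \phi_0$.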

36 IBM Model 3
As with Models 1 and 2, an alignment of (e, f) is determined by specifying a(i) for each position of the TL string. The fertilities $\phi_j$, $j = 0, \ldots, l$, are functions of the a(i)'s: $\phi_j$ is equal to the number of i's such that a(i) = j.
Hence P(e | f) can be obtained by summing over all the alignments:
\[ P(e \mid f) = \sum_{a(1)=0}^{l} \cdots \sum_{a(m)=0}^{l} p(e, a \mid f) \]

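37 IBM Model 3
\[ P(e \mid f) = \sum_{a(1)=0}^{l} \cdots \sum_{a(m)=0}^{l} \binom{m - \phi_0}{\phi_0}\, p_0^{m - 2\phi_0}\, p_1^{\phi_0} \prod_{j=1}^{l} \phi_j!\, \eta(\phi_j \mid f_j) \prod_{i=1}^{m} t(e_i \mid f_{a(i)})\, d(i \mid a(i), m, l) \]
With the following constraints:
1. $\sum_{e} t(e \mid f) = 1$
2. $\sum_{i} d(i \mid j, m, l) = 1$
3. $\sum_{\phi} \eta(\phi \mid f) = 1$
4. $p_0 + p_1 = 1$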
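For a single alignment, the summand of this expression can be computed as in the sketch below. The function name, the dictionary layouts and the toy values are assumptions made for illustration. One further assumption is flagged in the code: words aligned to NULL are given a distortion factor of 1, since the slides do not spell out d for j = 0.

```python
from collections import Counter
from math import comb, factorial, isclose, prod

def model3_joint(e, f, a, t, d, eta, p1):
    """p(e, a | f) under Model 3; f[0] is NULL, a[i-1] is the f-position of e_i."""
    m, l = len(e), len(f) - 1
    fert = Counter(a)                       # phi_j = number of i with a(i) = j
    phi0 = fert[0]
    if 2 * phi0 > m:
        return 0.0
    p0 = 1.0 - p1
    null_term = comb(m - phi0, phi0) * p0 ** (m - 2 * phi0) * p1 ** phi0
    fert_term = prod(
        factorial(fert[j]) * eta[(fert[j], f[j])] for j in range(1, l + 1)
    )
    lex_dist = prod(
        # Assumption: distortion factor 1 for NULL-aligned words.
        t[(e[i], f[a[i]])] * (d[(i + 1, a[i], m, l)] if a[i] != 0 else 1.0)
        for i in range(m)
    )
    return null_term * fert_term * lex_dist

# Hypothetical one-word example.
value = model3_joint(
    e=["house"], f=["NULL", "casa"], a=[1],
    t={("house", "casa"): 0.8},
    d={(1, 1, 1, 1): 1.0},
    eta={(1, "casa"): 0.9},
    p1=0.1,
)
```

Here the fertilities are derived from the alignment by counting, which is exactly the statement that each $\phi_j$ is a function of the a(i)'s.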

38 IBM Model 3
Remarks:
1. Here also we have an exponential number of alignments.
2. Count collection is too expensive even for moderate-length sentences.
3. Sampling is used from the space of possible alignments.
4. Sampling should be such that the most probable ones are included.

39 IBM Model 3
Remarks:
5. Still, it is much harder for Model 3.
6. Hence hill-climbing type heuristics are used.
7. Typically they start from the Model 1 solution.
8. From there they go to neighboring alignments, where the distance between two alignments is measured by the no. of points at which they differ.
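A hill-climbing search of this kind can be sketched as follows. The neighborhood used here (change one a(i), or swap two of them) follows the move/swap idea of Brown et al.; the function names, the greedy strategy and the toy scoring function are assumptions for illustration, and a real system would score alignments with the Model 3 probability.

```python
def distance(a, b):
    """Number of target positions at which two alignments differ."""
    return sum(x != y for x, y in zip(a, b))

def neighbors(a, l):
    """Alignments reachable by one move (change one a(i)) or one swap."""
    result = []
    for i in range(len(a)):
        for j in range(l + 1):
            if j != a[i]:
                result.append(a[:i] + [j] + a[i + 1:])
    for i in range(len(a)):
        for k in range(i + 1, len(a)):
            if a[i] != a[k]:
                b = list(a)
                b[i], b[k] = b[k], b[i]
                result.append(b)
    return result

def hill_climb(a, l, score):
    """Greedy ascent: move to the best neighbor until none improves the score."""
    best = score(a)
    while True:
        cand = max(neighbors(a, l), key=score, default=None)
        if cand is None or score(cand) <= best:
            return a
        a, best = cand, score(cand)

# Toy score: closeness to a fixed target alignment (stands in for p(e, a | f)).
target = [2, 0, 1, 3]
found = hill_climb([1, 0, 3, 2], l=3, score=lambda a: -distance(a, target))
```

Every neighbor differs from the current alignment in one position (move) or two positions (swap), so the search takes small steps in the distance measure described above.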

40 IBM Model 4

41 IBM Model 4
Model 3 has been found to be a very powerful one. It takes care of all the major aspects:
- word translation
- reordering
- insertion of words
- dropping of words
- one-to-many translation
But it has one major shortcoming: the formulation of the distortion probabilities $d(i \mid j, m, l)$.

42 IBM Model 4
Model 3 does not take into account the following fact: often a group of words is translated together, and therefore when they move they move together. E.g.
Riding a bicycle >> in sella a una bicicletta
Coming here riding a bicycle is dangerous >> venire qui in sella a una bicicletta è pericoloso
Trying many sentences with the phrase "riding a bicycle", one can notice that the phrase "in sella a una bicicletta" will remain together. But Model 3 considers the distortion probabilities in isolation.

43 IBM Model 4
Model 4 introduces the concept of Relative Distortion. It assumes that the placement of the translation of an input word is based on the placement of the preceding input word. It is however difficult to conceptualize, as words are being added, dropped, and converted from one to many. Model 4 is based around the concept of a cept.

44 IBM Model 4
Definition: each input word that is aligned to at least one output word is called a cept. Typically represented by [ ] or π.
Definition: the ceiling of the average of the output positions of a cept is called the center of the cept. We shall denote it as C.
For each output word the Relative Distortion is defined with the help of cepts. Let us first see an example:

45 IBM Model 4
Consider the following Bengali-English pair:
φ lambaa moto chhele cycle-e chore aaschhe
A tall boy is coming riding a bicycle
Note: "The" does not align with anything! "moto", an ornamentation, is not a cept.

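46 IBM Model 4
cept | π1 | π2 | π3 | π4 | π5
Foreign word position | 1 | 3 | 4 | 5 | 6
Foreign word | lambaa | chheleta | cycle-e | chore | aaschhe
English word | tall | boy | a bicycle | riding | is coming
English word position | 2 | 3 | 7, 8 | 6 | 4, 5
Center of cept | 2 | 3 | 8 | 6 | 5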

47 IBM Model 4
Relative Distortion:
- Words generated by φ (NULL) are uniformly distributed.
- The position of the first word of a cept is defined w.r.t. the center of the previous cept: $d_1(i - C_{j-1})$, where i is the English position of the first word of cept j and $C_{j-1}$ is the center of the preceding cept.
Consider for example the word "riding": it is generated by cept 4 (π4) and its English position is 6. The center of the preceding cept is 8. Thus there is a distortion of -2. This shows a forward movement of the word. Normally the distortion will be +1.

48 IBM Model 4
Relative Distortion:
- For subsequent words of a cept, the position is defined w.r.t. the position of the previous word of the same cept: $d_{>1}(i - \pi_{j,k-1})$, where $\pi_{j,k}$ refers to the position of the k-th word of the j-th cept.
For example, in "a bicycle" and "is coming" the distortion probability of the second word is calculated in relation to the previous word.
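The centers and relative displacements for the Bengali-English example can be computed mechanically. The sketch below is illustrative: the cept-to-position mapping is read off the sentence pair "A tall boy is coming riding a bicycle", the helper names are hypothetical, and the center preceding the first cept is assumed to be 0.

```python
from math import ceil

# English positions (1-based) produced by each cept, in foreign-word order.
cepts = {
    "lambaa":  [2],      # tall
    "chhele":  [3],      # boy
    "cycle-e": [7, 8],   # a bicycle
    "chore":   [6],      # riding
    "aaschhe": [4, 5],   # is coming
}

def center(positions):
    """Ceiling of the average output position of a cept."""
    return ceil(sum(positions) / len(positions))

def relative_distortions(cepts):
    """Return (foreign word, k, displacement) for every generated English word."""
    out, prev_center = [], 0          # assumption: center before the first cept is 0
    for word, positions in cepts.items():
        for k, pos in enumerate(positions, start=1):
            # First word: relative to previous cept's center; later words:
            # relative to the previous word of the same cept.
            ref = prev_center if k == 1 else positions[k - 2]
            out.append((word, k, pos - ref))
        prev_center = center(positions)
    return out

displacements = relative_distortions(cepts)
```

The entry for "chore" (riding) comes out as 6 - 8 = -2, matching the worked example, and the second words of "a bicycle" and "is coming" each get the typical displacement of +1.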

49 IBM Model 5

50 IBM Model 5
The key term for Model 5 is Deficiency. Models 3 and 4 do not take care of whether two words are being put in the same place. Thus they put positive probabilities on some impossible translations.
In Model 5 the distortion probabilities are calculated by considering cepts (as before) plus vacancies. It also takes care of the problem of multiple tableaus. This makes it a better word-based model.

51 IBM Model 5
Model 5 keeps track of the vacancies in the m-word-long e sentence. Let
- $v_{max}$ be the maximum no. of vacancies possible.
- $v_i$ be the no. of vacancies available in the sentence e in the positions [1, i].
Hence the distortion probabilities are functions of 3 quantities: $d_1(v_i \mid v_{C_{j-1}}, v_{max})$.
Similarly, the relative distortions of the subsequent words in the cept are: $d_{>1}(v_i - v_{\pi_{j,k-1}} \mid v_{max})$.
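The vacancy counts are simple running totals over the target positions. The sketch below is illustrative: the function name is hypothetical, the set of filled positions is taken from the earlier example after "tall", "boy" and "a bicycle" have been placed, and $v_{max}$ is simplified to the total number of open positions.

```python
def vacancies_up_to(filled, m):
    """v[i] = number of still-empty positions among 1..i (v[0] = 0)."""
    v = [0] * (m + 1)
    for i in range(1, m + 1):
        v[i] = v[i - 1] + (0 if i in filled else 1)
    return v

# Positions 2, 3, 7, 8 of the 8-word sentence are already occupied.
v = vacancies_up_to(filled={2, 3, 7, 8}, m=8)
v_max = v[8]    # simplification: all remaining vacancies are counted as available
```

Placing "riding" at position 6 would then be scored with $v_6$, the vacancy count at the previous center $v_8$, and $v_{max}$, so already filled positions no longer receive probability mass.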

52 Conclusion
Still we go by word-based translation. Can we do better? Looking at translation word by word is not the best thing. E.g.
The train is in.
The train is in motion.
The train is in station.
The train is danger.
Proper translation demands that we see the word along with its context. This gives us the concept of Phrase-based Translation.

53 Thank You


More information

ROLL CUTTING PROBLEMS UNDER STOCHASTIC DEMAND

ROLL CUTTING PROBLEMS UNDER STOCHASTIC DEMAND Pacific-Asia Joural of Mathematics, Volume 5, No., Jauary-Jue 20 ROLL CUTTING PROBLEMS UNDER STOCHASTIC DEMAND SHAKEEL JAVAID, Z. H. BAKHSHI & M. M. KHALID ABSTRACT: I this paper, the roll cuttig problem

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5 CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio

More information

Discrete Mathematics and Probability Theory Summer 2014 James Cook Note 15

Discrete Mathematics and Probability Theory Summer 2014 James Cook Note 15 CS 70 Discrete Mathematics ad Probability Theory Summer 2014 James Cook Note 15 Some Importat Distributios I this ote we will itroduce three importat probability distributios that are widely used to model

More information

Math 4400/6400 Homework #7 solutions

Math 4400/6400 Homework #7 solutions MATH 4400 problems. Math 4400/6400 Homewor #7 solutios 1. Let p be a prime umber. Show that the order of 1 + p modulo p 2 is exactly p. Hit: Expad (1 + p) p by the biomial theorem, ad recall from MATH

More information

Mathematical Induction

Mathematical Induction Mathematical Iductio Itroductio Mathematical iductio, or just iductio, is a proof techique. Suppose that for every atural umber, P() is a statemet. We wish to show that all statemets P() are true. I a

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Appendix: The Laplace Transform

Appendix: The Laplace Transform Appedix: The Laplace Trasform The Laplace trasform is a powerful method that ca be used to solve differetial equatio, ad other mathematical problems. Its stregth lies i the fact that it allows the trasformatio

More information

The Pendulum. Purpose

The Pendulum. Purpose The Pedulum Purpose To carry out a example illustratig how physics approaches ad solves problems. The example used here is to explore the differet factors that determie the period of motio of a pedulum.

More information

Statistical Pattern Recognition

Statistical Pattern Recognition Statistical Patter Recogitio Classificatio: No-Parametric Modelig Hamid R. Rabiee Jafar Muhammadi Sprig 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Ageda Parametric Modelig No-Parametric Modelig

More information

CHAPTER 10 INFINITE SEQUENCES AND SERIES

CHAPTER 10 INFINITE SEQUENCES AND SERIES CHAPTER 10 INFINITE SEQUENCES AND SERIES 10.1 Sequeces 10.2 Ifiite Series 10.3 The Itegral Tests 10.4 Compariso Tests 10.5 The Ratio ad Root Tests 10.6 Alteratig Series: Absolute ad Coditioal Covergece

More information

Randomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018)

Randomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018) Radomized Algorithms I, Sprig 08, Departmet of Computer Sciece, Uiversity of Helsiki Homework : Solutios Discussed Jauary 5, 08). Exercise.: Cosider the followig balls-ad-bi game. We start with oe black

More information

Sets. Sets. Operations on Sets Laws of Algebra of Sets Cardinal Number of a Finite and Infinite Set. Representation of Sets Power Set Venn Diagram

Sets. Sets. Operations on Sets Laws of Algebra of Sets Cardinal Number of a Finite and Infinite Set. Representation of Sets Power Set Venn Diagram Sets MILESTONE Sets Represetatio of Sets Power Set Ve Diagram Operatios o Sets Laws of lgebra of Sets ardial Number of a Fiite ad Ifiite Set I Mathematical laguage all livig ad o-livig thigs i uiverse

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

11 Hidden Markov Models

11 Hidden Markov Models Hidde Markov Models Hidde Markov Models are a popular machie learig approach i bioiformatics. Machie learig algorithms are preseted with traiig data, which are used to derive importat isights about the

More information

Introduction to Computational Biology Homework 2 Solution

Introduction to Computational Biology Homework 2 Solution Itroductio to Computatioal Biology Homework 2 Solutio Problem 1: Cocave gap pealty fuctio Let γ be a gap pealty fuctio defied over o-egative itegers. The fuctio γ is called sub-additive iff it satisfies

More information

Lecture 10: Mathematical Preliminaries

Lecture 10: Mathematical Preliminaries Lecture : Mathematical Prelimiaries Obective: Reviewig mathematical cocepts ad tools that are frequetly used i the aalysis of algorithms. Lecture # Slide # I this

More information

Intro to Learning Theory

Intro to Learning Theory Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Some examples of vector spaces

Some examples of vector spaces Roberto s Notes o Liear Algebra Chapter 11: Vector spaces Sectio 2 Some examples of vector spaces What you eed to kow already: The te axioms eeded to idetify a vector space. What you ca lear here: Some

More information

Massachusetts Institute of Technology

Massachusetts Institute of Technology 6.0/6.3: Probabilistic Systems Aalysis (Fall 00) Problem Set 8: Solutios. (a) We cosider a Markov chai with states 0,,, 3,, 5, where state i idicates that there are i shoes available at the frot door i

More information

Chapter 9: Numerical Differentiation

Chapter 9: Numerical Differentiation 178 Chapter 9: Numerical Differetiatio Numerical Differetiatio Formulatio of equatios for physical problems ofte ivolve derivatives (rate-of-chage quatities, such as velocity ad acceleratio). Numerical

More information

Number of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day

Number of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day LECTURE # 8 Mea Deviatio, Stadard Deviatio ad Variace & Coefficiet of variatio Mea Deviatio Stadard Deviatio ad Variace Coefficiet of variatio First, we will discuss it for the case of raw data, ad the

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

On forward improvement iteration for stopping problems

On forward improvement iteration for stopping problems O forward improvemet iteratio for stoppig problems Mathematical Istitute, Uiversity of Kiel, Ludewig-Mey-Str. 4, D-24098 Kiel, Germay irle@math.ui-iel.de Albrecht Irle Abstract. We cosider the optimal

More information

True Nature of Potential Energy of a Hydrogen Atom

True Nature of Potential Energy of a Hydrogen Atom True Nature of Potetial Eergy of a Hydroge Atom Koshu Suto Key words: Bohr Radius, Potetial Eergy, Rest Mass Eergy, Classical Electro Radius PACS codes: 365Sq, 365-w, 33+p Abstract I cosiderig the potetial

More information

Algorithms for Clustering

Algorithms for Clustering CR2: Statistical Learig & Applicatios Algorithms for Clusterig Lecturer: J. Salmo Scribe: A. Alcolei Settig: give a data set X R p where is the umber of observatio ad p is the umber of features, we wat

More information

Math 113, Calculus II Winter 2007 Final Exam Solutions

Math 113, Calculus II Winter 2007 Final Exam Solutions Math, Calculus II Witer 7 Fial Exam Solutios (5 poits) Use the limit defiitio of the defiite itegral ad the sum formulas to compute x x + dx The check your aswer usig the Evaluatio Theorem Solutio: I this

More information

ANALYSIS OF EXPERIMENTAL ERRORS

ANALYSIS OF EXPERIMENTAL ERRORS ANALYSIS OF EXPERIMENTAL ERRORS All physical measuremets ecoutered i the verificatio of physics theories ad cocepts are subject to ucertaities that deped o the measurig istrumets used ad the coditios uder

More information

OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES

OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES OPTIMAL ALGORITHMS -- SUPPLEMENTAL NOTES Peter M. Maurer Why Hashig is θ(). As i biary search, hashig assumes that keys are stored i a array which is idexed by a iteger. However, hashig attempts to bypass

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

Roberto s Notes on Series Chapter 2: Convergence tests Section 7. Alternating series

Roberto s Notes on Series Chapter 2: Convergence tests Section 7. Alternating series Roberto s Notes o Series Chapter 2: Covergece tests Sectio 7 Alteratig series What you eed to kow already: All basic covergece tests for evetually positive series. What you ca lear here: A test for series

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

Recurrence Relations

Recurrence Relations Recurrece Relatios Aalysis of recursive algorithms, such as: it factorial (it ) { if (==0) retur ; else retur ( * factorial(-)); } Let t be the umber of multiplicatios eeded to calculate factorial(). The

More information