N-Grams and Corpus Linguistics

Lecture #5
September 9

Transition
Up to this point we've mostly been discussing words in isolation.
Now we're switching to sequences of words,
and we're going to worry about assigning probabilities to sequences of words.

Who Cares?
Why would you want to assign a probability to a sentence, or
why would you want to predict the next word?
Lots of applications.

Real-Word Spelling Errors
Mental confusions:
  their/they're/there
  to/too/two
  weather/whether
  peace/piece
  you're/your
Typos that result in real words:
  "Lave" for "Have"

Real-Word Spelling Errors
Collect a set of common pairs of confusions.
Whenever a member of this set is encountered, compute the probability of the sentence in which it appears.
Substitute the other possibilities and compute the probability of the resulting sentence.
Choose the higher one.

Next Word Prediction
From a NY Times story...
  Stocks ...
  Stocks plunged this ...
  Stocks plunged this morning, despite a cut in interest rates ...
  Stocks plunged this morning, despite a cut in interest rates by the Federal Reserve, as Wall ...
  Stocks plunged this morning, despite a cut in interest rates by the Federal Reserve, as Wall Street began ...

  Stocks plunged this morning, despite a cut in interest rates by the Federal Reserve, as Wall Street began trading for the first time since last ...
  Stocks plunged this morning, despite a cut in interest rates by the Federal Reserve, as Wall Street began trading for the first time since last Tuesday's terrorist attacks.

Human Word Prediction
Clearly, at least some of us have the ability to predict future words in an utterance.
How?
  Domain knowledge
  Syntactic knowledge
  Lexical knowledge

Claim
A useful part of the knowledge needed to allow word prediction can be captured using simple statistical techniques.
In particular, we'll rely on the notion of the probability of a sequence (a phrase, a sentence).

Applications
Why do we want to predict a word, given some preceding words?
Rank the likelihood of sequences containing various alternative hypotheses, e.g. for ASR:
  Theatre owners say popcorn/unicorn sales have doubled...
Assess the likelihood/goodness of a sentence, e.g. for text generation or machine translation:
  The doctor recommended a cat scan.
  El doctor recomendó una exploración del gato.

N-Gram Models of Language
Use the previous N-1 words in a sequence to predict the next word.
Language Model (LM): unigrams, bigrams, trigrams, ...
How do we train these models? Very large corpora.

Counting Words in Corpora
What is a word?
  e.g., are "cat" and "cats" the same word?
  "September" and "Sept"?
  "zero" and "oh"?
  Is _ a word? *??
  How many words are there in "don't"? "Gonna"?
  In Japanese and Chinese text, how do we identify a word?
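The confusion-set procedure for real-word spelling errors described earlier can be sketched roughly as follows. This is only an illustration: the confusion sets, the toy bigram log-probabilities, the floor value for unseen bigrams, and the names `sentence_logprob` and `correct` are all assumptions made for the example, not part of the lecture.

```python
import math

# Hypothetical confusion sets; a real system would collect many more.
CONFUSION_SETS = [
    {"their", "there", "they're"},
    {"to", "too", "two"},
    {"piece", "peace"},
]

# Hypothetical bigram log-probabilities standing in for a trained model.
TOY_LOGPROB = {
    ("a", "piece"): math.log(0.01),
    ("piece", "of"): math.log(0.5),
    ("a", "peace"): math.log(0.001),
    ("peace", "of"): math.log(0.01),
}
UNSEEN_LOGPROB = math.log(1e-6)  # crude floor for bigrams missing from the table


def sentence_logprob(words):
    """Score a sentence as the sum of its bigram log-probabilities."""
    return sum(TOY_LOGPROB.get(pair, UNSEEN_LOGPROB)
               for pair in zip(words, words[1:]))


def correct(words):
    """Greedy left-to-right pass: at each confusable word, try every member
    of its confusion set and keep whichever sentence scores highest."""
    best = list(words)
    for i, word in enumerate(words):
        for confusion_set in CONFUSION_SETS:
            if word in confusion_set:
                for alternative in confusion_set:
                    candidate = best[:i] + [alternative] + best[i + 1:]
                    if sentence_logprob(candidate) > sentence_logprob(best):
                        best = candidate
    return best


print(correct("a peace of cake".split()))
```

With these made-up scores, "a peace of cake" is rewritten to "a piece of cake" because the substituted sentence has the higher probability.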

Terminology
Sentence: unit of written language
Utterance: unit of spoken language
Word Form: the inflected form that appears in the corpus
Lemma: an abstract form, shared by word forms having the same stem, part of speech, and word sense
Types: number of distinct words in a corpus (vocabulary size)
Tokens: total number of words

Corpora
Corpora are online collections of text and speech:
  Brown Corpus
  Wall Street Journal
  AP news
  Hansards
  DARPA/NIST text/speech corpora (Call Home, ATIS, Switchboard, Broadcast News, TDT, Communicator)
  TRAINS, Radio News

Chain Rule
Recall the definition of conditional probabilities:
  $P(A \mid B) = \frac{P(A \wedge B)}{P(B)}$
Rewriting:
  $P(A \wedge B) = P(A \mid B)\,P(B)$

Example
"The big red dog":
  P(The) * P(big | the) * P(red | the big) * P(dog | the big red)
Better: P(The | <Beginning of sentence>), written as P(The | <S>)

General Case
The word sequence from position 1 to n is $w_1^n$.
So the probability of a sequence is
  $P(w_1^n) = P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_1^2) \cdots P(w_n \mid w_1^{n-1}) = \prod_{k=1}^{n} P(w_k \mid w_1^{k-1})$

Unfortunately
That doesn't help, since it's unlikely we'll ever gather the right statistics for the prefixes.
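As a small illustration of the chain rule, the sketch below lists the conditional factors whose product gives the probability of a whole word sequence. The function name `chain_rule_factors` and the use of `<S>` as the first history item are choices made for this example only.

```python
def chain_rule_factors(tokens):
    """Return the (word, history) pairs whose conditional probabilities
    multiply to give the probability of the whole sequence."""
    factors = []
    history = ["<S>"]  # sentence-start marker, as in P(The | <S>)
    for word in tokens:
        factors.append((word, tuple(history)))
        history.append(word)
    return factors


tokens = "the big red dog".split()
for word, history in chain_rule_factors(tokens):
    print(f"P({word} | {' '.join(history)})")
```

Each history grows by one word per step, which is exactly why the full chain rule needs statistics for arbitrarily long prefixes.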

Markov Assumption
Assume that the entire prefix history isn't necessary.
In other words, an event doesn't depend on all of its history, just a fixed-length near history.

Markov Assumption
So for each component in the product, replace it with the approximation (assuming a prefix of N):
  $P(w_n \mid w_1^{n-1}) \approx P(w_n \mid w_{n-N+1}^{n-1})$

N-Grams
"The big red dog"
  Unigrams: P(dog)
  Bigrams: P(dog | red)
  Trigrams: P(dog | big red)
  Four-grams: P(dog | the big red)
In general, we'll be dealing with P(Word | Some fixed prefix).

Caveat
The formulation P(Word | Some fixed prefix) is not really appropriate in many applications.
It is if we're dealing with real-time speech, where we only have access to prefixes.
But if we're dealing with text, we already have the right and left contexts. There's no a priori reason to stick to left contexts.

Training and Testing
N-gram probabilities come from a training corpus:
  overly narrow corpus: probabilities don't generalize
  overly general corpus: probabilities don't reflect task or domain
A separate test corpus is used to evaluate the model, typically using standard metrics:
  held-out test set; development test set
  cross validation
  results tested for statistical significance

A Simple Example
P(I want to eat Chinese food) = P(I | <start>) P(want | I) P(to | want) P(eat | to) P(Chinese | eat) P(food | Chinese)
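The Markov assumption amounts to truncating each history to the last N-1 words. A minimal sketch, with hypothetical helper names `ngrams` and `markov_history`:

```python
def ngrams(tokens, n):
    """Return every n-gram (as a tuple) in a token list, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def markov_history(tokens, position, n):
    """History used to predict tokens[position] under an n-gram model:
    only the n - 1 words immediately before it."""
    start = max(0, position - (n - 1))
    return tuple(tokens[start:position])


tokens = ["the", "big", "red", "dog"]
print(ngrams(tokens, 2))
for n in (1, 2, 3, 4):
    # History for predicting "dog" under unigram ... four-gram models.
    print(n, markov_history(tokens, 3, n))
```

For "dog" this reproduces the unigram, bigram, trigram, and four-gram conditioning contexts listed above: no history, "red", "big red", and "the big red".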

A Bigram Grammar Fragment from BERP

  eat on         .16      eat Thai         .03
  eat some       .06      eat breakfast    .03
  eat lunch      .06      eat in           .02
  eat dinner     .05      eat Chinese      .02
  eat at         .04      eat Mexican      .02
  eat a          .04      eat tomorrow     .01
  eat Indian     .04      eat dessert      .007
  eat today      .03      eat British      .001

  <start> I      .25      want some            .04
  <start> I'd    .06      want Thai            .01
  <start> Tell   .04      to eat               .26
  <start> I'm    .02      to have              .14
  I want         .32      to spend             .09
  I would        .29      to be                .02
  I don't        .08      British food         .60
  I have         .04      British restaurant   .15
  want to        .65      British cuisine      .01
  want a         .05      British lunch        .01

P(I want to eat British food) = P(I | <start>) P(want | I) P(to | want) P(eat | to) P(British | eat) P(food | British)
  = .25 * .32 * .65 * .26 * .001 * .60 ≈ .0000081
vs. P(I want to eat Chinese food) = .00015

Probabilities seem to capture "syntactic" facts and "world knowledge":
  "eat" is often followed by an NP
  British food is not too popular
N-gram models can be trained by counting and normalization.

An Aside on Logs
You don't really do all those multiplies. The numbers are too small and lead to underflows.
Convert the probabilities to logs and then do additions.
To get the real probability (if you need it), go back to the antilog.

How do we get the N-gram probabilities?
N-gram models can be trained by counting and normalization.

BERP Bigram Counts
[Table: bigram counts with rows and columns I, want, to, eat, Chinese, food, lunch; most cell values are not recoverable from the transcription]
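The aside on logs can be made concrete with a short sketch. The bigram values below are assumed readings of the BERP fragment (they are not guaranteed to match the original slide digit for digit), and `sentence_logprob` is a name introduced for this example.

```python
import math

# Assumed bigram probabilities, read from the BERP fragment.
bigram_prob = {
    ("<start>", "I"): 0.25,
    ("I", "want"): 0.32,
    ("want", "to"): 0.65,
    ("to", "eat"): 0.26,
    ("eat", "British"): 0.001,
    ("British", "food"): 0.60,
    ("eat", "Chinese"): 0.02,
    ("Chinese", "food"): 0.56,
}


def sentence_logprob(words, probs):
    """Sum of log bigram probabilities, starting from the <start> marker."""
    total = 0.0
    previous = "<start>"
    for word in words:
        total += math.log(probs[(previous, word)])
        previous = word
    return total


british = sentence_logprob("I want to eat British food".split(), bigram_prob)
chinese = sentence_logprob("I want to eat Chinese food".split(), bigram_prob)

# Comparisons can be done directly on the log values; exp() recovers the
# ordinary probability only when it is actually needed.
print(british, chinese)
print(math.exp(british), math.exp(chinese))
```

Adding logs avoids the underflow that long products of small numbers can cause, and the ordering of the two sentences is the same in log space as in probability space.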

BERP Bigram Probabilities
Normalization: divide each row's counts by the appropriate unigram count for $w_{n-1}$.
Computing the bigram probability of "I I":
  C(I, I) / C(all I)
  p(I | I) = 8 / 3437 = .0023
Maximum Likelihood Estimation (MLE): relative frequency, e.g.
  $\frac{\mathrm{freq}(w_1, w_2)}{\mathrm{freq}(w_1)}$

BERP Table: Bigram Probabilities
[Table: bigram probabilities with rows and columns I, want, to, eat, Chinese, food, lunch; cell values are not recoverable from the transcription]

What do we learn about the language?
What's being captured with...
  P(want | I) = .32
  P(to | want) = .65
  P(eat | to) = .26
  P(food | Chinese) = .56
  P(lunch | eat) = .055
What about...
  P(I | I) = .0023      "I I I I want"
  P(I | want) = .0025   "I want I want"
  P(I | food) = .013    "the kind of food I want is ..."

Generation: just a test
Choose N-grams according to their probabilities and string them together.
For bigrams, start by generating a word that has a high probability of starting a sentence, then choose a bigram that is high given the first word selected, and so on.
See how we get better with higher-order n-grams.

Approximating Shakespeare
As we increase the value of N, the accuracy of the n-gram model increases, since the choice of next word becomes increasingly constrained.
Generating sentences with random unigrams...
  Every enter now severally so, let
  Hill he late speaks; or! a more to leg less first you enter
With bigrams...
  What means, sir. I confess she? then all sorts, he is trim, captain.
  Why dost stand forth thy canopy, forsooth; he is this palpable hit the King Henry.
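Counting, normalizing, and then generating can be sketched on a toy corpus as below. The three sentences are invented for illustration (they are not BERP data), and `train_bigram_mle` and `generate` are hypothetical names.

```python
import random
from collections import Counter


def train_bigram_mle(sentences):
    """Relative-frequency bigram estimates: P(w2 | w1) = C(w1, w2) / C(w1).

    C(w1) is counted over positions where w1 has a following word, so the
    estimates for each history sum to one.
    """
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<start>"] + sentence.split()
        for w1, w2 in zip(tokens, tokens[1:]):
            history_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    return {
        (w1, w2): count / history_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }


def generate(probs, rng, max_words=10):
    """Sample words one at a time, each conditioned on the previous word."""
    word, output = "<start>", []
    for _ in range(max_words):
        candidates = [(w2, p) for (w1, w2), p in probs.items() if w1 == word]
        if not candidates:
            break  # no bigram starts with this word, so stop
        words, weights = zip(*candidates)
        word = rng.choices(words, weights=weights)[0]
        output.append(word)
    return output


toy_corpus = [
    "I want to eat Chinese food",
    "I want to eat lunch",
    "I want Chinese food",
]
probs = train_bigram_mle(toy_corpus)
print(probs[("want", "to")])  # 2 of the 3 occurrences of "want" precede "to"
print(" ".join(generate(probs, random.Random(0))))
```

Any bigram that never occurred in the toy corpus is simply absent from `probs`, which is the zero-count problem that smoothing addresses.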

Trigrams
  Sweet prince, Falstaff shall die.
  This shall forbid it should be branded, if renown made it empty.
Quadrigrams
  What! I will go seek the traitor Gloucester.
  Will you not tell me who I am?

There are 884,647 tokens, with 29,066 word form types, in about a one-million-word Shakespeare corpus.
Shakespeare produced 300,000 bigram types out of 844 million possible bigrams: so 99.96% of the possible bigrams were never seen (they have zero entries in the table).
Quadrigrams are worse: what's coming out looks like Shakespeare because it is Shakespeare.

N-Gram Training Sensitivity
If we repeated the Shakespeare experiment but trained our n-grams on a Wall Street Journal corpus, what would we get?
This has major implications for corpus selection or design.

Some Useful Observations
A small number of events occur with high frequency.
  You can collect reliable statistics on these events with relatively small samples.
A large number of events occur with small frequency.
  You might have to wait a long time to gather statistics on the low-frequency events.

Some Useful Observations
Some zeroes are really zeroes,
  meaning that they represent events that can't or shouldn't occur.
On the other hand, some zeroes aren't really zeroes.
  They represent low-frequency events that simply didn't occur in the corpus.

Smoothing Techniques
Every n-gram training matrix is sparse, even for very large corpora (Zipf's law).
Solution: estimate the likelihood of unseen n-grams.
Problem: how do you adjust the rest of the corpus to accommodate these phantom n-grams?
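The sparsity figure for Shakespeare is simple arithmetic and can be checked directly. The sketch assumes a vocabulary of 29,066 word form types and roughly 300,000 observed bigram types; the variable names are illustrative.

```python
vocabulary_size = 29_066        # assumed number of word form types
seen_bigram_types = 300_000     # assumed number of distinct bigrams observed

# Every ordered pair of vocabulary words is a possible bigram.
possible_bigrams = vocabulary_size ** 2
unseen_fraction = 1 - seen_bigram_types / possible_bigrams

print(f"{possible_bigrams:,} possible bigrams")
print(f"{unseen_fraction:.2%} never seen")
```

Under these assumptions the table of bigram counts has about 844 million cells, nearly all of them zero.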

Problem
Let's assume we're using N-grams.
How can we assign a probability to a sequence where one of the component n-grams has a value of zero?
Assume all the words are known and have been seen.
  Go to a lower-order n-gram: back off from bigrams to unigrams.
  Replace the zero with something else.

Add-One (Laplace)
Make the zero counts 1.
Rationale: they're just events you haven't seen yet. If you had seen them, chances are you would only have seen them once, so make the count equal to 1.

Add-One Smoothing
For unigrams:
  Add 1 to every word (type) count.
  Normalize by N (tokens) / (N (tokens) + V (types)).
  Smoothed count (adjusted for additions to N) is
    $c_i^* = (c_i + 1)\,\frac{N}{N + V}$
  Normalize by N to get the new unigram probability:
    $p_i^* = \frac{c_i + 1}{N + V}$
For bigrams:
  Add 1 to every bigram count: $c(w_{n-1} w_n) + 1$
  Increase the unigram count by the vocabulary size: $c(w_{n-1}) + V$

Original BERP Counts
[Table: not recoverable from the transcription]

BERP Table: Bigram Probabilities
[Table: not recoverable from the transcription]

BERP After Add-One
[Table: not recoverable from the transcription]
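A minimal sketch of add-one smoothing for bigrams, using an invented three-sentence corpus rather than the BERP counts. The function name `add_one_bigram_model` is hypothetical, and the history count is taken over positions where the word has a successor, so that each smoothed distribution sums to one.

```python
from collections import Counter


def add_one_bigram_model(sentences):
    """Return (prob, vocabulary), where prob(w2, w1) is the add-one estimate
    (C(w1, w2) + 1) / (C(w1) + V)."""
    history_counts = Counter()
    bigram_counts = Counter()
    vocabulary = set()
    for sentence in sentences:
        tokens = sentence.split()
        vocabulary.update(tokens)
        for w1, w2 in zip(tokens, tokens[1:]):
            history_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    V = len(vocabulary)

    def prob(w2, w1):
        # Counter returns 0 for bigrams and histories that were never seen.
        return (bigram_counts[(w1, w2)] + 1) / (history_counts[w1] + V)

    return prob, vocabulary


toy_corpus = [
    "I want to eat Chinese food",
    "I want to eat lunch",
    "I want Chinese food",
]
prob, vocabulary = add_one_bigram_model(toy_corpus)
print(prob("to", "want"))     # seen bigram: (2 + 1) / (3 + 7)
print(prob("lunch", "want"))  # unseen bigram: (0 + 1) / (3 + 7)
```

Even in this tiny example the estimate for the seen bigram "want to" falls from 2/3 to 0.3, which shows how much mass add-one moves to unseen events.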

Add-One Smoothed BERP Reconstituted
[Table: not recoverable from the transcription]

Discount: ratio of new counts to old
  e.g. add-one smoothing changes the BERP count C(to | want) from 786 to 331 ($d_c$ = .42) and p(to | want) from .65 to .28
Problem: add-one smoothing changes counts drastically:
  too much weight is given to unseen n-grams
  in practice, unsmoothed bigrams often work better!

Witten-Bell Discounting
A zero n-gram is just an n-gram you haven't seen yet... but every n-gram in the corpus was unseen once... so:
  How many times did we see an n-gram for the first time? Once for each n-gram type (T).
  Estimate the total probability mass of unseen bigrams as
    $\frac{T}{N + T}$
  View the training corpus as a series of events, one for each token (N) and one for each new type (T).
We can divide the probability mass equally among unseen bigrams... or we can condition the probability of an unseen bigram on the first word of the bigram.
Discount values for Witten-Bell are much more reasonable than for Add-One.

Witten-Bell
Think about the occurrence of an unseen item (word, bigram, etc.) as an event.
The probability of such an event can be measured in a corpus by just looking at how often it happens.
Just take the single-word case first.
Assume a corpus of N tokens and T types.
How many times was an as-yet-unseen type encountered?

Witten-Bell
First compute the probability of an unseen event occurring.
Then distribute that probability mass among the as-yet-unseen types (the ones with zero counts).

Probability of an Unseen Event
Simple case of unigrams:
T is the number of events that are seen for the first time in the corpus.
  This is just the number of types, since each type had to occur for a first time once.
N is just the number of observations.
  $\frac{T}{N + T}$

Distributing Evenly
The amount to be distributed is
  $\frac{T}{N + T}$
The number of events with count zero is Z.
So distributing evenly gets us
  $\frac{1}{Z} \cdot \frac{T}{N + T}$

Caveat
The unigram case is weird...
  Z is the number of things with count zero.
  OK, so that's the number of things we didn't see at all. Huh?
Fortunately it makes more sense in the N-gram case.
  Take Shakespeare... Recall that he produced only 29,000 types. So there are potentially 29,000^2 bigrams, of which only 300k occur, so Z is 29,000^2 - 300k.

Witten-Bell
In the case of bigrams, not all conditioning events are equally promiscuous:
  P(x | the) vs. P(x | going)
So distribute the mass assigned to the zero-count bigrams according to their promiscuity.
  This means conditioning the redistribution on how many different types occurred with a given prefix.

Distributing Among the Zeros
If a bigram $w_x w_i$ has a zero count:
  $P(w_i \mid w_x) = \frac{T(w_x)}{Z(w_x)\,\bigl(N(w_x) + T(w_x)\bigr)}$
where
  $T(w_x)$: number of bigram types starting with $w_x$
  $Z(w_x)$: number of bigrams starting with $w_x$ that were not seen
  $N(w_x)$: actual frequency of bigrams beginning with $w_x$

Original BERP Counts
[Table: not recoverable from the transcription]
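The conditioned Witten-Bell estimate can be sketched on a toy corpus as follows. Two points are assumptions beyond the text above: seen bigrams are given the usual companion estimate c / (N + T), and a history that never occurs falls back to a uniform distribution. The corpus and the name `witten_bell_bigram_model` are illustrative only.

```python
from collections import Counter


def witten_bell_bigram_model(sentences):
    """Return (prob, vocabulary) for a Witten-Bell smoothed bigram model."""
    bigram_counts = Counter()
    history_counts = Counter()   # N(w_x): bigram tokens starting with w_x
    followers = {}               # distinct word types seen after each w_x
    vocabulary = set()
    for sentence in sentences:
        tokens = sentence.split()
        vocabulary.update(tokens)
        for w1, w2 in zip(tokens, tokens[1:]):
            bigram_counts[(w1, w2)] += 1
            history_counts[w1] += 1
            followers.setdefault(w1, set()).add(w2)

    def prob(w2, w1):
        N = history_counts[w1]
        T = len(followers.get(w1, ()))   # T(w_x): bigram types starting with w_x
        Z = len(vocabulary) - T          # Z(w_x): unseen continuations of w_x
        if N == 0:
            # Assumption: no evidence at all for this history, so go uniform.
            return 1 / len(vocabulary)
        count = bigram_counts[(w1, w2)]
        if count > 0:
            return count / (N + T)       # discounted estimate for seen bigrams
        return T / (Z * (N + T))         # equal share of the reserved mass

    return prob, vocabulary


toy_corpus = [
    "I want to eat Chinese food",
    "I want to eat lunch",
    "I want Chinese food",
]
prob, vocabulary = witten_bell_bigram_model(toy_corpus)
print(prob("to", "want"))     # seen: 2 / (3 + 2)
print(prob("lunch", "want"))  # unseen: 2 / (5 * (3 + 2))
```

Here "want" is followed by two distinct types in three tokens, so 2/5 of its probability mass is reserved and split evenly among the five continuations that were never observed.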

Witten-Bell Smoothed and Reconstituted Counts
[Table: not recoverable from the transcription]

Good-Turing Discounting
Re-estimate the amount of probability mass for zero (or low-count) n-grams by looking at n-grams with higher counts.
Estimate
  $c^* = (c + 1)\,\frac{N_{c+1}}{N_c}$
E.g. $N_0$'s adjusted count is a function of the count of n-grams that occur once, $N_1$.
Assumes:
  word bigrams follow a binomial distribution
  we know the number of unseen bigrams (V x V - seen)

Backoff Methods (e.g. Katz '87)
For e.g. a trigram model:
  Compute unigram, bigram, and trigram probabilities.
  In use: where the trigram is unavailable, back off to the bigram if available, otherwise to the unigram probability.
  E.g. "An omnivorous unicorn"

Summary
N-gram probabilities can be used to estimate the likelihood
  of a word occurring in a context (N-1)
  of a sentence occurring at all
Smoothing techniques deal with problems of unseen words in a corpus.
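Both ideas can be sketched in a few lines. The first function applies the Good-Turing re-estimate c* = (c + 1) N_{c+1} / N_c to raw counts; the second shows only the lookup order of backoff, without the discounting and normalization weights that a full Katz model needs, so its outputs are not a proper probability distribution. All names, counts, and probabilities here are invented for illustration.

```python
from collections import Counter


def good_turing_adjusted_counts(ngram_counts, num_unseen):
    """Map each raw count c to c* = (c + 1) * N_{c+1} / N_c.

    Counts whose N_{c+1} is zero are left out, since the raw formula gives
    no usable estimate there.
    """
    freq_of_freq = Counter(ngram_counts.values())  # N_c for c >= 1
    freq_of_freq[0] = num_unseen                   # N_0: n-grams never seen
    adjusted = {}
    for c, n_c in freq_of_freq.items():
        n_next = freq_of_freq.get(c + 1, 0)
        if n_c > 0 and n_next > 0:
            adjusted[c] = (c + 1) * n_next / n_c
    return adjusted


def backoff_prob(w1, w2, w3, trigram_p, bigram_p, unigram_p):
    """Use the trigram estimate if available, else the bigram, else the
    unigram (simplified: no discounting or backoff weights)."""
    if (w1, w2, w3) in trigram_p:
        return trigram_p[(w1, w2, w3)]
    if (w2, w3) in bigram_p:
        return bigram_p[(w2, w3)]
    return unigram_p.get(w3, 0.0)


# Hypothetical counts: three bigrams seen once, two seen twice, one seen 3 times.
toy_counts = {"a b": 1, "b c": 1, "c d": 1, "d e": 2, "e f": 2, "f g": 3}
print(good_turing_adjusted_counts(toy_counts, num_unseen=10))

# Hypothetical probabilities: neither the trigram nor the bigram ending in
# "unicorn" was seen, so the lookup falls through to the unigram estimate.
print(backoff_prob("an", "omnivorous", "unicorn",
                   trigram_p={},
                   bigram_p={("an", "omnivorous"): 0.001},
                   unigram_p={"unicorn": 1e-5}))
```

In practice the N_c values are usually smoothed before applying the Good-Turing formula, which is why the sketch skips counts with an empty N_{c+1} rather than guessing.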

Lecture 3 Language Modeling with N-Grams

Lecture 3 Language Modeling with N-Grams atural Laguage Processig CS 6320 Lecture 3 Laguage Modelig ith -Grams Istructor: Sada Harabagiu The problem Usig the otio of ord predictio for processig laguage Example: What ord is most likely to follo:

More information

( ) = is larger than. the variance of X V

( ) = is larger than. the variance of X V Stat 400, sectio 6. Methods of Poit Estimatio otes by Tim Pilachoski A oit estimate of a arameter is a sigle umber that ca be regarded as a sesible value for The selected statistic is called the oit estimator

More information

Discrete Mathematics for CS Spring 2005 Clancy/Wagner Notes 21. Some Important Distributions

Discrete Mathematics for CS Spring 2005 Clancy/Wagner Notes 21. Some Important Distributions CS 70 Discrete Mathematics for CS Sprig 2005 Clacy/Wager Notes 21 Some Importat Distributios Questio: A biased coi with Heads probability p is tossed repeatedly util the first Head appears. What is the

More information

Probability and Information Theory for Language Modeling. Statistical Linguistics. Statistical Linguistics: Adult Monolingual Speaker

Probability and Information Theory for Language Modeling. Statistical Linguistics. Statistical Linguistics: Adult Monolingual Speaker Probability ad Iformatio Theory for Laguage Modelig Statistical vs. Symbolic NLP Elemetary Probability Theory Laguage Modelig Iformatio Theory Statistical Liguistics Statistical approaches are clearly

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5 CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet

More information

CS276A Practice Problem Set 1 Solutions

CS276A Practice Problem Set 1 Solutions CS76A Practice Problem Set Solutios Problem. (i) (ii) 8 (iii) 6 Compute the gamma-codes for the followig itegers: (i) (ii) 8 (iii) 6 Problem. For this problem, we will be dealig with a collectio of millio

More information

CS284A: Representations and Algorithms in Molecular Biology

CS284A: Representations and Algorithms in Molecular Biology CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman:

Problems from 9th edition of Probability and Statistical Inference by Hogg, Tanis and Zimmerman: Math 224 Fall 2017 Homework 4 Drew Armstrog Problems from 9th editio of Probability ad Statistical Iferece by Hogg, Tais ad Zimmerma: Sectio 2.3, Exercises 16(a,d),18. Sectio 2.4, Exercises 13, 14. Sectio

More information

Lecture 4 February 16, 2016

Lecture 4 February 16, 2016 MIT 6.854/18.415: Advaced Algorithms Sprig 16 Prof. Akur Moitra Lecture 4 February 16, 16 Scribe: Be Eysebach, Devi Neal 1 Last Time Cosistet Hashig - hash fuctios that evolve well Radom Trees - routig

More information

Chapter 18 Summary Sampling Distribution Models

Chapter 18 Summary Sampling Distribution Models Uit 5 Itroductio to Iferece Chapter 18 Summary Samplig Distributio Models What have we leared? Sample proportios ad meas will vary from sample to sample that s samplig error (samplig variability). Samplig

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22 CS 70 Discrete Mathematics for CS Sprig 2007 Luca Trevisa Lecture 22 Aother Importat Distributio The Geometric Distributio Questio: A biased coi with Heads probability p is tossed repeatedly util the first

More information

4.3 Growth Rates of Solutions to Recurrences

4.3 Growth Rates of Solutions to Recurrences 4.3. GROWTH RATES OF SOLUTIONS TO RECURRENCES 81 4.3 Growth Rates of Solutios to Recurreces 4.3.1 Divide ad Coquer Algorithms Oe of the most basic ad powerful algorithmic techiques is divide ad coquer.

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

Lecture 12: November 13, 2018

Lecture 12: November 13, 2018 Mathematical Toolkit Autum 2018 Lecturer: Madhur Tulsiai Lecture 12: November 13, 2018 1 Radomized polyomial idetity testig We will use our kowledge of coditioal probability to prove the followig lemma,

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics 8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These

More information

Discrete Mathematics and Probability Theory Spring 2012 Alistair Sinclair Note 15

Discrete Mathematics and Probability Theory Spring 2012 Alistair Sinclair Note 15 CS 70 Discrete Mathematics ad Probability Theory Sprig 2012 Alistair Siclair Note 15 Some Importat Distributios The first importat distributio we leared about i the last Lecture Note is the biomial distributio

More information

Lecture 24 Floods and flood frequency

Lecture 24 Floods and flood frequency Lecture 4 Floods ad flood frequecy Oe of the thigs we wat to kow most about rivers is what s the probability that a flood of size will happe this year? I 100 years? There are two ways to do this empirically,

More information

11 Hidden Markov Models

11 Hidden Markov Models Hidde Markov Models Hidde Markov Models are a popular machie learig approach i bioiformatics. Machie learig algorithms are preseted with traiig data, which are used to derive importat isights about the

More information

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n,

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n, CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 9 Variace Questio: At each time step, I flip a fair coi. If it comes up Heads, I walk oe step to the right; if it comes up Tails, I walk oe

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

Math 155 (Lecture 3)

Math 155 (Lecture 3) Math 55 (Lecture 3) September 8, I this lecture, we ll cosider the aswer to oe of the most basic coutig problems i combiatorics Questio How may ways are there to choose a -elemet subset of the set {,,,

More information

Math 475, Problem Set #12: Answers

Math 475, Problem Set #12: Answers Math 475, Problem Set #12: Aswers A. Chapter 8, problem 12, parts (b) ad (d). (b) S # (, 2) = 2 2, sice, from amog the 2 ways of puttig elemets ito 2 distiguishable boxes, exactly 2 of them result i oe

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information

Math 10A final exam, December 16, 2016

Math 10A final exam, December 16, 2016 Please put away all books, calculators, cell phoes ad other devices. You may cosult a sigle two-sided sheet of otes. Please write carefully ad clearly, USING WORDS (ot just symbols). Remember that the

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

Statistical Pattern Recognition

Statistical Pattern Recognition Statistical Patter Recogitio Classificatio: No-Parametric Modelig Hamid R. Rabiee Jafar Muhammadi Sprig 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Ageda Parametric Modelig No-Parametric Modelig

More information

Massachusetts Institute of Technology

Massachusetts Institute of Technology 6.0/6.3: Probabilistic Systems Aalysis (Fall 00) Problem Set 8: Solutios. (a) We cosider a Markov chai with states 0,,, 3,, 5, where state i idicates that there are i shoes available at the frot door i

More information

Power and Type II Error

Power and Type II Error Statistical Methods I (EXST 7005) Page 57 Power ad Type II Error Sice we do't actually kow the value of the true mea (or we would't be hypothesizig somethig else), we caot kow i practice the type II error

More information

5. Likelihood Ratio Tests

5. Likelihood Ratio Tests 1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,

More information

CHAPTER 10 INFINITE SEQUENCES AND SERIES

CHAPTER 10 INFINITE SEQUENCES AND SERIES CHAPTER 10 INFINITE SEQUENCES AND SERIES 10.1 Sequeces 10.2 Ifiite Series 10.3 The Itegral Tests 10.4 Compariso Tests 10.5 The Ratio ad Root Tests 10.6 Alteratig Series: Absolute ad Coditioal Covergece

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Lecture 10: Universal coding and prediction

Lecture 10: Universal coding and prediction 0-704: Iformatio Processig ad Learig Sprig 0 Lecture 0: Uiversal codig ad predictio Lecturer: Aarti Sigh Scribes: Georg M. Goerg Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved

More information

Shannon s noiseless coding theorem

Shannon s noiseless coding theorem 18.310 lecture otes May 4, 2015 Shao s oiseless codig theorem Lecturer: Michel Goemas I these otes we discuss Shao s oiseless codig theorem, which is oe of the foudig results of the field of iformatio

More information

Recitation 4: Lagrange Multipliers and Integration

Recitation 4: Lagrange Multipliers and Integration Math 1c TA: Padraic Bartlett Recitatio 4: Lagrage Multipliers ad Itegratio Week 4 Caltech 211 1 Radom Questio Hey! So, this radom questio is pretty tightly tied to today s lecture ad the cocept of cotet

More information

Support vector machine revisited

Support vector machine revisited 6.867 Machie learig, lecture 8 (Jaakkola) 1 Lecture topics: Support vector machie ad kerels Kerel optimizatio, selectio Support vector machie revisited Our task here is to first tur the support vector

More information

Discrete Mathematics and Probability Theory Summer 2014 James Cook Note 15

Discrete Mathematics and Probability Theory Summer 2014 James Cook Note 15 CS 70 Discrete Mathematics ad Probability Theory Summer 2014 James Cook Note 15 Some Importat Distributios I this ote we will itroduce three importat probability distributios that are widely used to model

More information

Lecture Chapter 6: Convergence of Random Sequences

Lecture Chapter 6: Convergence of Random Sequences ECE5: Aalysis of Radom Sigals Fall 6 Lecture Chapter 6: Covergece of Radom Sequeces Dr Salim El Rouayheb Scribe: Abhay Ashutosh Doel, Qibo Zhag, Peiwe Tia, Pegzhe Wag, Lu Liu Radom sequece Defiitio A ifiite

More information

Lecture 4 The Simple Random Walk

Lecture 4 The Simple Random Walk Lecture 4: The Simple Radom Walk 1 of 9 Course: M36K Itro to Stochastic Processes Term: Fall 014 Istructor: Gorda Zitkovic Lecture 4 The Simple Radom Walk We have defied ad costructed a radom walk {X }

More information

15-780: Graduate Artificial Intelligence. Density estimation

15-780: Graduate Artificial Intelligence. Density estimation 5-780: Graduate Artificial Itelligece Desity estimatio Coditioal Probability Tables (CPT) But where do we get them? P(B)=.05 B P(E)=. E P(A B,E) )=.95 P(A B, E) =.85 P(A B,E) )=.5 P(A B, E) =.05 A P(J

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Lecture 1 Probability and Statistics

Lecture 1 Probability and Statistics Wikipedia: Lecture 1 Probability ad Statistics Bejami Disraeli, British statesma ad literary figure (1804 1881): There are three kids of lies: lies, damed lies, ad statistics. popularized i US by Mark

More information

Intro to Learning Theory

Intro to Learning Theory Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified

More information

Limit Theorems. Convergence in Probability. Let X be the number of heads observed in n tosses. Then, E[X] = np and Var[X] = np(1-p).

Limit Theorems. Convergence in Probability. Let X be the number of heads observed in n tosses. Then, E[X] = np and Var[X] = np(1-p). Limit Theorems Covergece i Probability Let X be the umber of heads observed i tosses. The, E[X] = p ad Var[X] = p(-p). L O This P x p NM QP P x p should be close to uity for large if our ituitio is correct.

More information

Axioms of Measure Theory

Axioms of Measure Theory MATH 532 Axioms of Measure Theory Dr. Neal, WKU I. The Space Throughout the course, we shall let X deote a geeric o-empty set. I geeral, we shall ot assume that ay algebraic structure exists o X so that

More information

IP Reference guide for integer programming formulations.

IP Reference guide for integer programming formulations. IP Referece guide for iteger programmig formulatios. by James B. Orli for 15.053 ad 15.058 This documet is iteded as a compact (or relatively compact) guide to the formulatio of iteger programs. For more

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

Lecture 5: April 17, 2013

Lecture 5: April 17, 2013 TTIC/CMSC 350 Mathematical Toolkit Sprig 203 Madhur Tulsiai Lecture 5: April 7, 203 Scribe: Somaye Hashemifar Cheroff bouds recap We recall the Cheroff/Hoeffdig bouds we derived i the last lecture idepedet

More information

1 Review of Probability & Statistics
