Probability and Information Theory for Language Modeling. Statistical Linguistics. Statistical Linguistics: Adult Monolingual Speaker

Size: px
Start display at page:

Download "Probability and Information Theory for Language Modeling. Statistical Linguistics. Statistical Linguistics: Adult Monolingual Speaker"

Transcription

1 Probability ad Iformatio Theory for Laguage Modelig Statistical vs. Symbolic NLP Elemetary Probability Theory Laguage Modelig Iformatio Theory Statistical Liguistics Statistical approaches are clearly useful for egieerig tasks. Are statistical approaches appropriate for scietific study? Laguage Acquisitio Laguage Chage Laguage Variatio 1 2 Statistical Liguistics: Adult Mooligual Speaker Elemetary Probability Theory Error tolerace Laguage comprehesio: A average setece has a huge umber of "possible" sytactic structures. Termiology: Experimet Sample a repeatable process a possible outcome Sample space all samples for a experimet Defiig which aalysis is correct is ot a computatioal problem. Possible Couter: These are "performace issues" But ote: "performace issues" is ot the same as "computatioal issues" Evet a set of samples Probability assigs a probability distributio to each sample. P(A) P x x A Uiform All samples are Distributio equi-probable. 3 4

2 Elemetary Probability Theory (2) A B Prior probability of evet A is P(A) We are told that B is true Now the probability of A is P(A ad B)/P(B) This is the coditioal probability of A give B: P(A B) = P(A B)/P(B) Elemetary Probability Theory (3) Multiplicatio Rule: P(A B) = P(A B)P(B) = P(B A)P(A) Bayes rule: P(A B) = P(B A)P(A)/P(B) Idepedace: P(A B) = P(A) P(B A) = P(B) P(A B) = P(A)P(B) 5 6 Bayes Theorem: Example Laguage Modelig Evet A = a setece cotais a certai liguistic costructio P(A) = Evet B = our program reports that it foud the liguistic costructio P(B A) = 0.9 P(B A) = 0.1 (true positive) (false positive) Suppose the program fids the costructio. What is the chace that it is correct? Goal: Fid the probability of a "text" "text" ca be a word, a utterace, a documet, etc. Texts are geerated by a ukow probability distributio Use laguage model to capture a priori iformatio about the likelihood of a text. We are more likely to predict a text with a higher a priori probability. P(A B) = P(B A)P(A)/P(B) 7 8

3 Statistical Iferece Why Do Laguage Modelig? Use traiig data to make ifereces about the ukow distributio. Defie a class of cadidate probability distributios. Divide texts ito equivalace classes. Examples: P w = P w.type P w i = P w i.type w i 1.type Select the "best" cadidate Use traiig data to evaluate each cadidate probability distributio. Speech recogitio Predict word sequeces that are more likely. Spellig correctio Suggest words that are more likely. Machie traslatio Suggest traslatios that are more likely. Geeratio Laguage models help us geerate "likely" seteces Output Forms Fidig the Source Text Each text geerates a output form, which we ca directly observe. Speech recogitio: a sequece of souds Spellig correctio: a sequece of characters Bayes Rule: Laguage Model P text P output text P text output = P output Machie traslatio: a source laguage text Texts ad their output forms are geerated by a ukow distributio: P text,output Fid the most likely text for a output form: P text output 11 Recoverig the uderlyig form: P text output = P text P output text P output P text P output text 12 =

4 Divide & Coquor Equivalace Classes P(output) has a very large sample space. P(w w 1,..., w -1 ) has a large sample space. We ca ot directly model P(output). Reduce the problem of modelig P(output) to simpler estimatio problems, whose solutios ca be combied. Estimate the probability of a text oe word at a time: P w 1, w 2,...,w = P w i w 1,...,w i1 i=1 Divide P(w w 1,..., w -1 ) ito equivalace classes Example: P(w w 1,..., w -1 ) P(w w -1 ) Estimate the probability of each equivalace class. Cout the umber of traiig istaces i each equivalace class. Use these couts to estimate the probability for each equivalace class Maximum Likelihood Estimatio Problems with MLE Predict the probability of a equivalace class usig its relative frequecy i the traiig data: P x = C x N C(x) = frequecy of x i the traiig data N = umber of traiig istaces Uderestimates the probability for usee data C(x)=0 Maybe we just did't have eough traiig data. Overestimates the probability for rare data C(x)=1 Estimates based o oe traiig sample are ureliable

5 Statistical Estimators Statistical Estimators: ltk Use the traiig data to form a more advaced estimate of P(x) ltk.probability.probdisti defies a iterface for probability distributios. Laplace, Lidstoe, ad ELE: Reserve a small portio of the probability distributio for usee data. See Maig pp Held out estimatio: Use a small amout of held-out data to decide the probability of usee data. Probability distributios are typically costructed from frequecy distributios: >>> pdist = ELEProbDist(fdist) >>> prit pdist.prob('the') 0.02 >>> prit pdist.cod_prob('the', SetEvet('the', 'a')) 0.6 See the ltk referece documetatio for more iformatio Noisy Chael Model Noisy Chael Model (2) ecoder chael I O decoder P(o i) Î ecoder chael I O decoder P(o i) Î Chael itroduces "oise." Task: optimize throughput ad accuracy More redudacy -> More accuracy Less redudacy -> More throughput May statistical NLP problems ca be thought of as decodig problems. No cotrol over ecodig Example: optical character recogitio i o P(i) = actual text = text with mistakes = laguage model P(o i) = model of OCR errors See Maig Table 2.2 (p. 71) 19 20

6 Etropy Example: Simplified Polyesia The iformatio cotet of a probability distributio. Letter p t k a i u P(Letter) 1/8 1/4 1/8 1/4 1/8 1/8 The average legth of the message eeded to trasmit the outcome. H X = P x log P x x Per-letter etropy: H X = P x logp x = log1 8 plus2 1 4 log1 4 Example: fair 8-sided die H X = x= 1 8 P x logp x 8 = x=1 1 8 log1 8 = 2.5bits = log1 8 = log 1 8 = Optimal Biary Ecodigs Example: Simplified Polyesia (2) O average, how may yes/o questios do you eed to ask to fid the outcome? Example questios: Is the letter t or a? Is the letter a cosoat? Biary ecodig: each bit is a yes/o questio. Etropy is the average message legth, usig this ecodig. t p, k, u, or i? y a? u or i? y y a p k? Optimal ecodig for Simplified Polyesia: Letter p t k a i u Ecodig y k i u? y u 23 24

7 Relative Etropy Readigs Relative etropy measures the differece betwee two probability distributios. D p q = p x log p x x X q x If we ecode the actual data usig the laguage model, how may extra bits do we use, o average? Use relative etropy to measure performace: p = actual distributio (from test data) q = laguage model performace = D(p q) Maig 1, 2 Abey #

Entropies & Information Theory

Entropies & Information Theory Etropies & Iformatio Theory LECTURE I Nilajaa Datta Uiversity of Cambridge,U.K. For more details: see lecture otes (Lecture 1- Lecture 5) o http://www.qi.damtp.cam.ac.uk/ode/223 Quatum Iformatio Theory

More information

Statistical Pattern Recognition

Statistical Pattern Recognition Statistical Patter Recogitio Classificatio: No-Parametric Modelig Hamid R. Rabiee Jafar Muhammadi Sprig 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Ageda Parametric Modelig No-Parametric Modelig

More information

Information Theory and Coding

Information Theory and Coding Sol. Iformatio Theory ad Codig. The capacity of a bad-limited additive white Gaussia (AWGN) chael is give by C = Wlog 2 ( + σ 2 W ) bits per secod(bps), where W is the chael badwidth, is the average power

More information

Lecture 10: Universal coding and prediction

Lecture 10: Universal coding and prediction 0-704: Iformatio Processig ad Learig Sprig 0 Lecture 0: Uiversal codig ad predictio Lecturer: Aarti Sigh Scribes: Georg M. Goerg Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved

More information

The Maximum-Likelihood Decoding Performance of Error-Correcting Codes

The Maximum-Likelihood Decoding Performance of Error-Correcting Codes The Maximum-Lielihood Decodig Performace of Error-Correctig Codes Hery D. Pfister ECE Departmet Texas A&M Uiversity August 27th, 2007 (rev. 0) November 2st, 203 (rev. ) Performace of Codes. Notatio X,

More information

Intro to Learning Theory

Intro to Learning Theory Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified

More information

1 Review of Probability & Statistics

1 Review of Probability & Statistics 1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5

More information

Shannon s noiseless coding theorem

Shannon s noiseless coding theorem 18.310 lecture otes May 4, 2015 Shao s oiseless codig theorem Lecturer: Michel Goemas I these otes we discuss Shao s oiseless codig theorem, which is oe of the foudig results of the field of iformatio

More information

Introduction to Automata Theory. Reading: Chapter 1

Introduction to Automata Theory. Reading: Chapter 1 Itroductio to Automata Theory Readig: Chapter 1 1 What is Automata Theory? Study of abstract computig devices, or machies Automato = a abstract computig device Note: A device eed ot eve be a physical hardware!

More information

Lectures on Stochastic System Analysis and Bayesian Updating

Lectures on Stochastic System Analysis and Bayesian Updating Lectures o Stochastic System Aalysis ad Bayesia Updatig Jue 29-July 13 2005 James L. Beck, Califoria Istitute of Techology Jiaye Chig, Natioal Taiwa Uiversity of Sciece & Techology Siu-Kui (Iva) Au, Nayag

More information

Probability and MLE.

Probability and MLE. 10-701 Probability ad MLE http://www.cs.cmu.edu/~pradeepr/701 (brief) itro to probability Basic otatios Radom variable - referrig to a elemet / evet whose status is ukow: A = it will rai tomorrow Domai

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

CS284A: Representations and Algorithms in Molecular Biology

CS284A: Representations and Algorithms in Molecular Biology CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by

More information

Lecture 6: Source coding, Typicality, and Noisy channels and capacity

Lecture 6: Source coding, Typicality, and Noisy channels and capacity 15-859: Iformatio Theory ad Applicatios i TCS CMU: Sprig 2013 Lecture 6: Source codig, Typicality, ad Noisy chaels ad capacity Jauary 31, 2013 Lecturer: Mahdi Cheraghchi Scribe: Togbo Huag 1 Recap Uiversal

More information

As stated by Laplace, Probability is common sense reduced to calculation.

As stated by Laplace, Probability is common sense reduced to calculation. Note: Hadouts DO NOT replace the book. I most cases, they oly provide a guidelie o topics ad a ituitive feel. The math details will be covered i class, so it is importat to atted class ad also you MUST

More information

The Bayesian Learning Framework. Back to Maximum Likelihood. Naïve Bayes. Simple Example: Coin Tosses. Given a generative model

The Bayesian Learning Framework. Back to Maximum Likelihood. Naïve Bayes. Simple Example: Coin Tosses. Given a generative model Back to Maximum Likelihood Give a geerative model f (x, y = k) =π k f k (x) Usig a geerative modellig approach, we assume a parametric form for f k (x) =f (x; k ) ad compute the MLE θ of θ =(π k, k ) k=

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

Channel coding, linear block codes, Hamming and cyclic codes Lecture - 8

Channel coding, linear block codes, Hamming and cyclic codes Lecture - 8 Digital Commuicatio Chael codig, liear block codes, Hammig ad cyclic codes Lecture - 8 Ir. Muhamad Asial, MSc., PhD Ceter for Iformatio ad Commuicatio Egieerig Research (CICER) Electrical Egieerig Departmet

More information

Quick Review of Probability

Quick Review of Probability Quick Review of Probability Berli Che Departmet of Computer Sciece & Iformatio Egieerig Natioal Taiwa Normal Uiversity Refereces: 1. W. Navidi. Statistics for Egieerig ad Scietists. Chapter & Teachig Material.

More information

Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters?

Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters? CONFIDENCE INTERVALS How do we make ifereces about the populatio parameters? The samplig distributio allows us to quatify the variability i sample statistics icludig how they differ from the parameter

More information

Activity 3: Length Measurements with the Four-Sided Meter Stick

Activity 3: Length Measurements with the Four-Sided Meter Stick Activity 3: Legth Measuremets with the Four-Sided Meter Stick OBJECTIVE: The purpose of this experimet is to study errors ad the propagatio of errors whe experimetal data derived usig a four-sided meter

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Expectation-Maximization Algorithm.

Expectation-Maximization Algorithm. Expectatio-Maximizatio Algorithm. Petr Pošík Czech Techical Uiversity i Prague Faculty of Electrical Egieerig Dept. of Cyberetics MLE 2 Likelihood.........................................................................................................

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

15-780: Graduate Artificial Intelligence. Density estimation

15-780: Graduate Artificial Intelligence. Density estimation 5-780: Graduate Artificial Itelligece Desity estimatio Coditioal Probability Tables (CPT) But where do we get them? P(B)=.05 B P(E)=. E P(A B,E) )=.95 P(A B, E) =.85 P(A B,E) )=.5 P(A B, E) =.05 A P(J

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced

More information

Quick Review of Probability

Quick Review of Probability Quick Review of Probability Berli Che Departmet of Computer Sciece & Iformatio Egieerig Natioal Taiwa Normal Uiversity Refereces: 1. W. Navidi. Statistics for Egieerig ad Scietists. Chapter 2 & Teachig

More information

Lecture 14: Graph Entropy

Lecture 14: Graph Entropy 15-859: Iformatio Theory ad Applicatios i TCS Sprig 2013 Lecture 14: Graph Etropy March 19, 2013 Lecturer: Mahdi Cheraghchi Scribe: Euiwoog Lee 1 Recap Bergma s boud o the permaet Shearer s Lemma Number

More information

Discrete Mathematics and Probability Theory Spring 2013 Anant Sahai Lecture 18

Discrete Mathematics and Probability Theory Spring 2013 Anant Sahai Lecture 18 EECS 70 Discrete Mathematics ad Probability Theory Sprig 2013 Aat Sahai Lecture 18 Iferece Oe of the major uses of probability is to provide a systematic framework to perform iferece uder ucertaity. A

More information

Topics Machine learning: lecture 2. Review: the learning problem. Hypotheses and estimation. Estimation criterion cont d. Estimation criterion

Topics Machine learning: lecture 2. Review: the learning problem. Hypotheses and estimation. Estimation criterion cont d. Estimation criterion .87 Machie learig: lecture Tommi S. Jaakkola MIT CSAIL tommi@csail.mit.edu Topics The learig problem hypothesis class, estimatio algorithm loss ad estimatio criterio samplig, empirical ad epected losses

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

Mathematical Statistics - MS

Mathematical Statistics - MS Paper Specific Istructios. The examiatio is of hours duratio. There are a total of 60 questios carryig 00 marks. The etire paper is divided ito three sectios, A, B ad C. All sectios are compulsory. Questios

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 17 Lecturer: David Wagner April 3, Notes 17 for CS 170

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 17 Lecturer: David Wagner April 3, Notes 17 for CS 170 UC Berkeley CS 170: Efficiet Algorithms ad Itractable Problems Hadout 17 Lecturer: David Wager April 3, 2003 Notes 17 for CS 170 1 The Lempel-Ziv algorithm There is a sese i which the Huffma codig was

More information

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE Geeral e Image Coder Structure Motio Video (s 1,s 2,t) or (s 1,s 2 ) Natural Image Samplig A form of data compressio; usually lossless, but ca be lossy Redudacy Removal Lossless compressio: predictive

More information

Binary codes from graphs on triples and permutation decoding

Binary codes from graphs on triples and permutation decoding Biary codes from graphs o triples ad permutatio decodig J. D. Key Departmet of Mathematical Scieces Clemso Uiversity Clemso SC 29634 U.S.A. J. Moori ad B. G. Rodrigues School of Mathematics Statistics

More information

Information Theory Model for Radiation

Information Theory Model for Radiation Joural of Applied Mathematics ad Physics, 26, 4, 6-66 Published Olie August 26 i SciRes. http://www.scirp.org/joural/jamp http://dx.doi.org/.426/jamp.26.487 Iformatio Theory Model for Radiatio Philipp

More information

DISTRIBUTION LAW Okunev I.V.

DISTRIBUTION LAW Okunev I.V. 1 DISTRIBUTION LAW Okuev I.V. Distributio law belogs to a umber of the most complicated theoretical laws of mathematics. But it is also a very importat practical law. Nothig ca help uderstad complicated

More information

Information-based Feature Selection

Information-based Feature Selection Iformatio-based Feature Selectio Farza Faria, Abbas Kazeroui, Afshi Babveyh Email: {faria,abbask,afshib}@staford.edu 1 Itroductio Feature selectio is a topic of great iterest i applicatios dealig with

More information

EE 6885 Statistical Pattern Recognition

EE 6885 Statistical Pattern Recognition EE 6885 Statistical Patter Recogitio Fall 5 Prof. Shih-Fu Chag http://www.ee.columbia.edu/~sfchag Lecture 6 (9/8/5 EE6887-Chag 6- Readig EM for Missig Features Textboo, DHS 3.9 Bayesia Parameter Estimatio

More information

Lecture 16: Achieving and Estimating the Fundamental Limit

Lecture 16: Achieving and Estimating the Fundamental Limit EE378A tatistical igal Processig Lecture 6-05/25/207 Lecture 6: Achievig ad Estimatig the Fudametal Limit Lecturer: Jiatao Jiao cribe: William Clary I this lecture, we formally defie the two distict problems

More information

6.867 Machine learning, lecture 7 (Jaakkola) 1

6.867 Machine learning, lecture 7 (Jaakkola) 1 6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit

More information

BIOSTATISTICS. Lecture 5 Interval Estimations for Mean and Proportion. dr. Petr Nazarov

BIOSTATISTICS. Lecture 5 Interval Estimations for Mean and Proportion. dr. Petr Nazarov Microarray Ceter BIOSTATISTICS Lecture 5 Iterval Estimatios for Mea ad Proportio dr. Petr Nazarov 15-03-013 petr.azarov@crp-sate.lu Lecture 5. Iterval estimatio for mea ad proportio OUTLINE Iterval estimatios

More information

Sets and Probabilistic Models

Sets and Probabilistic Models ets ad Probabilistic Models Berli Che Departmet of Computer ciece & Iformatio Egieerig Natioal Taiwa Normal Uiversity Referece: - D. P. Bertsekas, J. N. Tsitsiklis, Itroductio to Probability, ectios 1.1-1.2

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics 8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio

More information

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to: STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

Lecture 9: Hierarchy Theorems

Lecture 9: Hierarchy Theorems IAS/PCMI Summer Sessio 2000 Clay Mathematics Udergraduate Program Basic Course o Computatioal Complexity Lecture 9: Hierarchy Theorems David Mix Barrigto ad Alexis Maciel July 27, 2000 Most of this lecture

More information

ENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Solutions Descriptive Statistics. None at all!

ENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Solutions Descriptive Statistics. None at all! ENGI 44 Probability ad Statistics Faculty of Egieerig ad Applied Sciece Problem Set Solutios Descriptive Statistics. If, i the set of values {,, 3, 4, 5, 6, 7 } a error causes the value 5 to be replaced

More information

Topic 5: Basics of Probability

Topic 5: Basics of Probability Topic 5: Jue 1, 2011 1 Itroductio Mathematical structures lie Euclidea geometry or algebraic fields are defied by a set of axioms. Mathematical reality is the developed through the itroductio of cocepts

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notatio Math 113 - Itroductio to Applied Statistics Name : Use Word or WordPerfect to recreate the followig documets. Each article is worth 10 poits ad ca be prited ad give to the istructor

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

Overview of Gaussian MIMO (Vector) BC

Overview of Gaussian MIMO (Vector) BC Overview of Gaussia MIMO (Vector) BC Gwamo Ku Adaptive Sigal Processig ad Iformatio Theory Research Group Nov. 30, 2012 Outlie / Capacity Regio of Gaussia MIMO BC System Structure Kow Capacity Regios -

More information

1. Universal v.s. non-universal: know the source distribution or not.

1. Universal v.s. non-universal: know the source distribution or not. 28. Radom umber geerators Let s play the followig game: Give a stream of Ber( p) bits, with ukow p, we wat to tur them ito pure radom bits, i.e., idepedet fair coi flips Ber( / 2 ). Our goal is to fid

More information

System Diagnostics using Kalman Filter Estimation Error

System Diagnostics using Kalman Filter Estimation Error System Diagostics usig Kalma Filter Estimatio Error Prof. Seugchul Lee, Seugtae Park, Heechag Kim, Hyusuk Huh ICMR2015 (11/25/2015) Machie Health Diagostics Desig Desig DNA + + Blue Prit phagocytes lymphocytes

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Introductory statistics

Introductory statistics CM9S: Machie Learig for Bioiformatics Lecture - 03/3/06 Itroductory statistics Lecturer: Sriram Sakararama Scribe: Sriram Sakararama We will provide a overview of statistical iferece focussig o the key

More information

( ) = p and P( i = b) = q.

( ) = p and P( i = b) = q. MATH 540 Radom Walks Part 1 A radom walk X is special stochastic process that measures the height (or value) of a particle that radomly moves upward or dowward certai fixed amouts o each uit icremet of

More information

Machine Learning Theory (CS 6783)

Machine Learning Theory (CS 6783) Machie Learig Theory (CS 6783) Lecture 2 : Learig Frameworks, Examples Settig up learig problems. X : istace space or iput space Examples: Computer Visio: Raw M N image vectorized X = 0, 255 M N, SIFT

More information

LECTURE NOTES 9. 1 Point Estimation. 1.1 The Method of Moments

LECTURE NOTES 9. 1 Point Estimation. 1.1 The Method of Moments LECTURE NOTES 9 Poit Estimatio Uder the hypothesis that the sample was geerated from some parametric statistical model, a atural way to uderstad the uderlyig populatio is by estimatig the parameters of

More information

Computing Confidence Intervals for Sample Data

Computing Confidence Intervals for Sample Data Computig Cofidece Itervals for Sample Data Topics Use of Statistics Sources of errors Accuracy, precisio, resolutio A mathematical model of errors Cofidece itervals For meas For variaces For proportios

More information

Information Theory and Statistics Lecture 4: Lempel-Ziv code

Information Theory and Statistics Lecture 4: Lempel-Ziv code Iformatio Theory ad Statistics Lecture 4: Lempel-Ziv code Łukasz Dębowski ldebowsk@ipipa.waw.pl Ph. D. Programme 203/204 Etropy rate is the limitig compressio rate Theorem For a statioary process (X i)

More information

The Expectation-Maximization (EM) Algorithm

The Expectation-Maximization (EM) Algorithm The Expectatio-Maximizatio (EM) Algorithm Readig Assigmets T. Mitchell, Machie Learig, McGraw-Hill, 997 (sectio 6.2, hard copy). S. Gog et al. Dyamic Visio: From Images to Face Recogitio, Imperial College

More information

On Modeling On Minimum Description Length Modeling. M-closed

On Modeling On Minimum Description Length Modeling. M-closed O Modelig O Miiu Descriptio Legth Modelig M M-closed M-ope Do you believe that the data geeratig echais really is i your odel class M? 7 73 Miiu Descriptio Legth Priciple o-m-closed predictive iferece

More information

Clustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar.

Clustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar. Clusterig CM226: Machie Learig for Bioiformatics. Fall 216 Sriram Sakararama Ackowledgmets: Fei Sha, Ameet Talwalkar Clusterig 1 / 42 Admiistratio HW 1 due o Moday. Email/post o CCLE if you have questios.

More information

Chapter 1. Probability

Chapter 1. Probability Chapter. Probability. Set defiitios 2. Set operatios 3. Probability itroduced through sets ad relative frequecy 4. Joit ad coditioal probability 5. Idepedet evets 6. Combied experimets 7. Beroulli trials

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Lecture 2: April 3, 2013

Lecture 2: April 3, 2013 TTIC/CMSC 350 Mathematical Toolkit Sprig 203 Madhur Tulsiai Lecture 2: April 3, 203 Scribe: Shubhedu Trivedi Coi tosses cotiued We retur to the coi tossig example from the last lecture agai: Example. Give,

More information

STAC51: Categorical data Analysis

STAC51: Categorical data Analysis STAC51: Categorical data Aalysis Mahida Samarakoo Jauary 28, 2016 Mahida Samarakoo STAC51: Categorical data Aalysis 1 / 35 Table of cotets Iferece for Proportios 1 Iferece for Proportios Mahida Samarakoo

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Cooperative Communication Fundamentals & Coding Techniques

Cooperative Communication Fundamentals & Coding Techniques 3 th ICACT Tutorial Cooperative commuicatio fudametals & codig techiques Cooperative Commuicatio Fudametals & Codig Techiques 0..4 Electroics ad Telecommuicatio Research Istitute Kiug Jug 3 th ICACT Tutorial

More information

Context-free grammars and. Basics of string generation methods

Context-free grammars and. Basics of string generation methods Cotext-free grammars ad laguages Basics of strig geeratio methods What s so great about regular expressios? A regular expressio is a strig represetatio of a regular laguage This allows the storig a whole

More information

Empirical Process Theory and Oracle Inequalities

Empirical Process Theory and Oracle Inequalities Stat 928: Statistical Learig Theory Lecture: 10 Empirical Process Theory ad Oracle Iequalities Istructor: Sham Kakade 1 Risk vs Risk See Lecture 0 for a discussio o termiology. 2 The Uio Boud / Boferoi

More information

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece

More information

Are Slepian-Wolf Rates Necessary for Distributed Parameter Estimation?

Are Slepian-Wolf Rates Necessary for Distributed Parameter Estimation? Are Slepia-Wolf Rates Necessary for Distributed Parameter Estimatio? Mostafa El Gamal ad Lifeg Lai Departmet of Electrical ad Computer Egieerig Worcester Polytechic Istitute {melgamal, llai}@wpi.edu arxiv:1508.02765v2

More information

Confidence Intervals รศ.ดร. อน นต ผลเพ ม Assoc.Prof. Anan Phonphoem, Ph.D. Intelligent Wireless Network Group (IWING Lab)

Confidence Intervals รศ.ดร. อน นต ผลเพ ม Assoc.Prof. Anan Phonphoem, Ph.D. Intelligent Wireless Network Group (IWING Lab) Cofidece Itervals รศ.ดร. อน นต ผลเพ ม Assoc.Prof. Aa Phophoem, Ph.D. aa.p@ku.ac.th Itelliget Wireless Network Group (IWING Lab) http://iwig.cpe.ku.ac.th Computer Egieerig Departmet Kasetsart Uiversity,

More information

ELEC1200: A System View of Communications: from Signals to Packets Lecture 3

ELEC1200: A System View of Communications: from Signals to Packets Lecture 3 ELEC2: A System View of Commuicatios: from Sigals to Packets Lecture 3 Commuicatio chaels Discrete time Chael Modelig the chael Liear Time Ivariat Systems Step Respose Respose to sigle bit Respose to geeral

More information

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS Jauary 25, 207 INTRODUCTION TO MATHEMATICAL STATISTICS Abstract. A basic itroductio to statistics assumig kowledge of probability theory.. Probability I a typical udergraduate problem i probability, we

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to

More information

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE Part 3: Summary of CI for µ Cofidece Iterval for a Populatio Proportio p Sectio 8-4 Summary for creatig a 100(1-α)% CI for µ: Whe σ 2 is kow ad paret

More information

CSIE/GINM, NTU 2009/11/30 1

CSIE/GINM, NTU 2009/11/30 1 Itroductio ti to Machie Learig (Part (at1: Statistical Machie Learig Shou de Li CSIE/GINM, NTU sdli@csie.tu.edu.tw 009/11/30 1 Syllabus of a Itro ML course ( Machie Learig, Adrew Ng, Staford, Autum 009

More information

CS322: Network Analysis. Problem Set 2 - Fall 2009

CS322: Network Analysis. Problem Set 2 - Fall 2009 Due October 9 009 i class CS3: Network Aalysis Problem Set - Fall 009 If you have ay questios regardig the problems set, sed a email to the course assistats: simlac@staford.edu ad peleato@staford.edu.

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam. Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the

More information

11 Hidden Markov Models

11 Hidden Markov Models Hidde Markov Models Hidde Markov Models are a popular machie learig approach i bioiformatics. Machie learig algorithms are preseted with traiig data, which are used to derive importat isights about the

More information

Sets and Probabilistic Models

Sets and Probabilistic Models ets ad Probabilistic Models Berli Che Departmet of Computer ciece & Iformatio Egieerig Natioal Taiwa Normal iversity Referece: - D. P. Bertsekas, J. N. Tsitsiklis, Itroductio to Probability, ectios 1.1-1.2

More information

Seed and Sieve of Odd Composite Numbers with Applications in Factorization of Integers

Seed and Sieve of Odd Composite Numbers with Applications in Factorization of Integers IOSR Joural of Mathematics (IOSR-JM) e-issn: 78-578, p-issn: 319-75X. Volume 1, Issue 5 Ver. VIII (Sep. - Oct.01), PP 01-07 www.iosrjourals.org Seed ad Sieve of Odd Composite Numbers with Applicatios i

More information

ECE 564/645 - Digital Communication Systems (Spring 2014) Final Exam Friday, May 2nd, 8:00-10:00am, Marston 220

ECE 564/645 - Digital Communication Systems (Spring 2014) Final Exam Friday, May 2nd, 8:00-10:00am, Marston 220 ECE 564/645 - Digital Commuicatio Systems (Sprig 014) Fial Exam Friday, May d, 8:00-10:00am, Marsto 0 Overview The exam cosists of four (or five) problems for 100 (or 10) poits. The poits for each part

More information

Machine Learning, Spring 2011: Homework 1 Solution

Machine Learning, Spring 2011: Homework 1 Solution 10-701 Machie Learig, Sprig 011: Homework 1 Solutio February 1, 011 Istructios There are 3 questios o this assigmet. The last questio ivolves codig. Attach your code to the writeup. Please submit your

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

Read through these prior to coming to the test and follow them when you take your test.

Read through these prior to coming to the test and follow them when you take your test. Math 143 Sprig 2012 Test 2 Iformatio 1 Test 2 will be give i class o Thursday April 5. Material Covered The test is cummulative, but will emphasize the recet material (Chapters 6 8, 10 11, ad Sectios 12.1

More information

Queuing Theory. Basic properties, Markovian models, Networks of queues, General service time distributions, Finite source models, Multiserver queues

Queuing Theory. Basic properties, Markovian models, Networks of queues, General service time distributions, Finite source models, Multiserver queues Queuig Theory Basic properties, Markovia models, Networks of queues, Geeral service time distributios, Fiite source models, Multiserver queues Chapter 8 Kedall s Notatio for Queuig Systems A/B/X/Y/Z: A

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information