Probability and Information Theory for Language Modeling
- Abigail Osborne
- 6 years ago
- Statistical vs. Symbolic NLP
- Elementary Probability Theory
- Language Modeling
- Information Theory

Statistical Linguistics
- Statistical approaches are clearly useful for engineering tasks.
- Are statistical approaches appropriate for scientific study?
- Language Acquisition
- Language Change
- Language Variation

Statistical Linguistics: Adult Monolingual Speaker
- Error tolerance
- Language comprehension: An average sentence has a huge number of "possible" syntactic structures.
- Defining which analysis is correct is not a computational problem.
- Possible counter: These are "performance issues".
- But note: "performance issues" is not the same as "computational issues".

Elementary Probability Theory
Terminology:
- Experiment: a repeatable process
- Sample: a possible outcome
- Sample space: all samples for an experiment
- Event: a set of samples
- Probability distribution: assigns a probability to each sample
- Probability of an event: $P(A) = \sum_{x \in A} P(x)$
- Uniform distribution: all samples are equi-probable
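The terminology above can be made concrete with a small sketch. It assumes a fair six-sided die as the experiment (an illustration, not an example from the slides); the helper names `uniform` and `event_probability` are hypothetical.

```python
from fractions import Fraction


def uniform(sample_space):
    """Uniform distribution: every sample gets the same probability."""
    samples = list(sample_space)
    return {x: Fraction(1, len(samples)) for x in samples}


def event_probability(dist, event):
    """P(A) = sum of P(x) over the samples x in the event A."""
    return sum(dist[x] for x in event)


# Experiment: roll a die. Sample space: {1, ..., 6}. Event: "the roll is even".
die = uniform(range(1, 7))
even = {2, 4, 6}
print(event_probability(die, even))  # 1/2
```

Exact fractions are used instead of floats so that the probabilities visibly sum to 1.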
Elementary Probability Theory (2)
- [Figure: overlapping events A and B]
- Prior probability of event A is $P(A)$.
- We are told that B is true.
- Now the probability of A is $P(A \text{ and } B) / P(B)$.
- This is the conditional probability of A given B: $P(A \mid B) = P(A \cap B) / P(B)$

Elementary Probability Theory (3)
- Multiplication rule: $P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$
- Bayes rule: $P(A \mid B) = P(B \mid A)\,P(A) / P(B)$
- Independence: $P(A \mid B) = P(A)$, $P(B \mid A) = P(B)$, $P(A \cap B) = P(A)\,P(B)$

Bayes Theorem: Example
- Event A = a sentence contains a certain linguistic construction; $P(A) =$ [missing]
- Event B = our program reports that it found the linguistic construction
- $P(B \mid A) = 0.9$ (true positive)
- $P(B \mid \neg A) = 0.1$ (false positive)
- Suppose the program finds the construction. What is the chance that it is correct?
- $P(A \mid B) = P(B \mid A)\,P(A) / P(B)$

Language Modeling
- Goal: Find the probability of a "text".
- A "text" can be a word, an utterance, a document, etc.
- Texts are generated by an unknown probability distribution.
- Use a language model to capture a priori information about the likelihood of a text.
- We are more likely to predict a text with a higher a priori probability.
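For the Bayes theorem example above, the answer depends strongly on the prior P(A), whose value is not preserved in this transcript. The sketch below therefore tries a few hypothetical priors (assumptions, not the slide's number) with the stated rates P(B|A) = 0.9 and P(B|not A) = 0.1, expanding P(B) by the law of total probability.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) by Bayes rule, with P(B) from the law of total probability."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b


# The priors below are hypothetical; only 0.9 and 0.1 come from the slide.
for prior in (0.5, 0.1, 0.001):
    print(f"P(A) = {prior}: P(A|B) = {posterior(prior, 0.9, 0.1):.4f}")
```

With a rare construction (small prior), most reports are false positives even though the detector is right 90% of the time when the construction is present.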
Statistical Inference
- Use training data to make inferences about the unknown distribution.
- Define a class of candidate probability distributions.
  - Divide texts into equivalence classes.
  - Examples: $P(w) = P(w.\text{type})$; $P(w_i) = P(w_i.\text{type} \mid w_{i-1}.\text{type})$
- Select the "best" candidate.
  - Use training data to evaluate each candidate probability distribution.

Why Do Language Modeling?
- Speech recognition: predict word sequences that are more likely.
- Spelling correction: suggest words that are more likely.
- Machine translation: suggest translations that are more likely.
- Generation: language models help us generate "likely" sentences.

Output Forms
- Each text generates an output form, which we can directly observe.
  - Speech recognition: a sequence of sounds
  - Spelling correction: a sequence of characters
  - Machine translation: a source language text
- Texts and their output forms are generated by an unknown distribution: $P(\text{text}, \text{output})$
- Find the most likely text for an output form: $P(\text{text} \mid \text{output})$

Finding the Source Text
- Bayes rule, where $P(\text{text})$ is the language model:
  $P(\text{text} \mid \text{output}) = \dfrac{P(\text{text})\,P(\text{output} \mid \text{text})}{P(\text{output})}$
- Recovering the underlying form ($P(\text{output})$ is the same for every candidate text, so it can be dropped from the maximization):
  $\arg\max_{\text{text}} P(\text{text} \mid \text{output}) = \arg\max_{\text{text}} \dfrac{P(\text{text})\,P(\text{output} \mid \text{text})}{P(\text{output})} = \arg\max_{\text{text}} P(\text{text})\,P(\text{output} \mid \text{text})$
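As a rough illustration of recovering the source text, the sketch below picks the candidate that maximizes P(text) P(output | text) for a misspelled word. All probabilities, and the names `language_model`, `channel_model`, and `decode`, are made up for the example; a real system would estimate both tables from data.

```python
# Hypothetical language model P(text) over a tiny candidate set.
language_model = {"the": 0.05, "then": 0.005, "ten": 0.001}

# Hypothetical output model P(output | text), keyed by (output, text).
channel_model = {
    ("teh", "the"): 0.01,
    ("teh", "then"): 0.001,
    ("teh", "ten"): 0.002,
}


def decode(output, candidates):
    """Return the candidate text maximizing P(text) * P(output | text)."""
    return max(
        candidates,
        key=lambda text: language_model[text] * channel_model.get((output, text), 0.0),
    )


print(decode("teh", list(language_model)))  # the
```

P(output) never appears in `decode` because it is identical for every candidate and does not change which one wins.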
Divide & Conquer
- $P(\text{output})$ has a very large sample space.
- We cannot directly model $P(\text{output})$.
- Reduce the problem of modeling $P(\text{output})$ to simpler estimation problems, whose solutions can be combined.
- Estimate the probability of a text one word at a time:
  $P(w_1, w_2, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1})$

Equivalence Classes
- $P(w_n \mid w_1, \ldots, w_{n-1})$ has a large sample space.
- Divide $P(w_n \mid w_1, \ldots, w_{n-1})$ into equivalence classes.
  - Example: $P(w_n \mid w_1, \ldots, w_{n-1}) \approx P(w_n \mid w_{n-1})$
- Estimate the probability of each equivalence class.
  - Count the number of training instances in each equivalence class.
  - Use these counts to estimate the probability for each equivalence class.

Maximum Likelihood Estimation
- Predict the probability of an equivalence class using its relative frequency in the training data:
  $P(x) = \dfrac{C(x)}{N}$
  - $C(x)$ = frequency of $x$ in the training data
  - $N$ = number of training instances

Problems with MLE
- Underestimates the probability for unseen data: $C(x) = 0$
  - Maybe we just didn't have enough training data.
- Overestimates the probability for rare data: $C(x) = 1$
  - Estimates based on one training sample are unreliable.
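The chain rule, the bigram equivalence classes, and the MLE relative-frequency estimate can be combined in a few lines. This is a minimal sketch on a made-up three-sentence corpus; the `<s>` start marker and the function names are assumptions of the example, not part of the slides.

```python
from collections import Counter


def train_bigram_mle(sentences):
    """Return prob(word, prev) = C(prev, word) / C(prev), the MLE bigram estimate."""
    context_counts = Counter()
    bigram_counts = Counter()
    for words in sentences:
        padded = ["<s>"] + words  # "<s>" marks the sentence start
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1

    def prob(word, prev):
        if context_counts[prev] == 0:
            return 0.0  # context never seen in training
        return bigram_counts[(prev, word)] / context_counts[prev]

    return prob


def sentence_probability(prob, words):
    """Chain rule with each history approximated by the previous word only."""
    p = 1.0
    padded = ["<s>"] + words
    for prev, word in zip(padded, padded[1:]):
        p *= prob(word, prev)
    return p


corpus = [["the", "dog", "barks"], ["the", "cat", "sleeps"], ["a", "dog", "sleeps"]]
prob = train_bigram_mle(corpus)
print(sentence_probability(prob, ["the", "dog", "sleeps"]))  # unseen sentence, seen bigrams
print(sentence_probability(prob, ["the", "cat", "barks"]))   # contains an unseen bigram
```

The second sentence gets probability zero because the bigram "cat barks" never occurs in the training data, which is exactly the underestimation problem listed under "Problems with MLE".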
Statistical Estimators
- Use the training data to form a more advanced estimate of $P(x)$.
- Laplace, Lidstone, and ELE: reserve a small portion of the probability distribution for unseen data. See Manning pp
- Held-out estimation: use a small amount of held-out data to decide the probability of unseen data.

Statistical Estimators: nltk
- nltk.probability.ProbDistI defines an interface for probability distributions.
- Probability distributions are typically constructed from frequency distributions:

    >>> pdist = ELEProbDist(fdist)
    >>> print pdist.prob('the')
    0.02
    >>> print pdist.cond_prob('the', SetEvent('the', 'a'))
    0.6

- See the nltk reference documentation for more information.

Noisy Channel Model
- [Figure: encoder -> I -> channel P(o|i) -> O -> decoder -> Î]
- The channel introduces "noise."
- Task: optimize throughput and accuracy.
  - More redundancy -> more accuracy
  - Less redundancy -> more throughput

Noisy Channel Model (2)
- Many statistical NLP problems can be thought of as decoding problems.
- No control over encoding.
- Example: optical character recognition
  - $i$ = actual text
  - $o$ = text with mistakes
  - $P(i)$ = language model
  - $P(o \mid i)$ = model of OCR errors
- See Manning Table 2.2 (p. 71)
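Returning to the Laplace, Lidstone, and ELE estimators listed above: all three add a constant to every count before normalizing, $P(x) = (C(x) + \gamma) / (N + \gamma B)$, where $B$ is the number of possible outcomes. Laplace uses $\gamma = 1$ and ELE uses $\gamma = 0.5$. The sketch below is plain Python rather than the nltk classes, and the toy sentence and five-word vocabulary are assumptions for illustration.

```python
from collections import Counter


def lidstone(counts, bins, gamma):
    """Return prob(x) = (C(x) + gamma) / (N + gamma * bins)."""
    n = sum(counts.values())

    def prob(x):
        return (counts[x] + gamma) / (n + gamma * bins)

    return prob


counts = Counter("the cat saw the dog".split())
vocabulary = ["the", "cat", "saw", "dog", "bird"]  # "bird" is unseen in training

mle = lidstone(counts, len(vocabulary), 0.0)      # gamma = 0 reduces to MLE
laplace = lidstone(counts, len(vocabulary), 1.0)  # add-one
ele = lidstone(counts, len(vocabulary), 0.5)      # expected likelihood estimate

for name, estimator in (("MLE", mle), ("Laplace", laplace), ("ELE", ele)):
    print(name, round(estimator("the"), 4), round(estimator("bird"), 4))
```

Smoothing moves a little probability mass from seen words such as "the" to the unseen word "bird", while each estimate still sums to 1 over the vocabulary.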
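Entropy
- The information content of a probability distribution.
- The average length of the message needed to transmit the outcome.
- $H(X) = -\sum_{x} P(x) \log P(x)$ (logarithms are base 2, so entropy is measured in bits)
- Example: fair 8-sided die
  $H(X) = -\sum_{x=1}^{8} P(x) \log P(x) = -\sum_{x=1}^{8} \frac{1}{8} \log \frac{1}{8} = -\log \frac{1}{8} = \log 8 = 3$ bits

Example: Simplified Polynesian

    Letter      p    t    k    a    i    u
    P(Letter)   1/8  1/4  1/8  1/4  1/8  1/8

- Per-letter entropy:
  $H(X) = -\sum_{x} P(x) \log P(x) = -\left[ 4 \cdot \frac{1}{8} \log \frac{1}{8} + 2 \cdot \frac{1}{4} \log \frac{1}{4} \right] = 2.5$ bits

Optimal Binary Encodings
- On average, how many yes/no questions do you need to ask to find the outcome?
- Example questions:
  - Is the letter t or a?
  - Is the letter a consonant?
- Binary encoding: each bit is a yes/no question.
- Entropy is the average message length, using this encoding.

Example: Simplified Polynesian (2)
- [Figure: decision tree of yes/no questions over the letters p, t, k, a, i, u, with nodes such as "p, k, u, or i?", "a?", "u or i?", "k?", and "u?"]
- Optimal encoding for Simplified Polynesian: [Table: Letter (p, t, k, a, i, u) against Encoding; codewords missing]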
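The entropy figures above can be checked numerically, and one optimal binary encoding can be built with Huffman's algorithm. Huffman coding is a technique brought in for this sketch; the slides do not say how their encoding was derived, and the codewords printed here are one optimal choice, not necessarily the slide's table.

```python
import heapq
from math import log2


def entropy(dist):
    """H(X) = -sum of P(x) * log2 P(x), in bits."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def huffman_code(dist):
    """Build a prefix code by repeatedly merging the two least probable groups."""
    # Heap entries: (probability, unique tie-breaker, {symbol: codeword so far}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(dist.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p0 + p1, next_id, merged))
        next_id += 1
    return heap[0][2]


polynesian = {"p": 1 / 8, "t": 1 / 4, "k": 1 / 8, "a": 1 / 4, "i": 1 / 8, "u": 1 / 8}
code = huffman_code(polynesian)
average_length = sum(polynesian[s] * len(code[s]) for s in polynesian)

print(entropy({face: 1 / 8 for face in range(8)}))  # fair 8-sided die
print(entropy(polynesian))
print(code)
print(average_length)
```

Because every probability here is a power of 1/2, the average codeword length equals the entropy exactly: frequent letters get 2-bit codewords and rare letters get 3-bit codewords.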
Relative Entropy
- Relative entropy measures the difference between two probability distributions.
- $D(p \,\|\, q) = \sum_{x \in X} p(x) \log \dfrac{p(x)}{q(x)}$
- If we encode the actual data using the language model, how many extra bits do we use, on average?
- Use relative entropy to measure performance:
  - $p$ = actual distribution (from test data)
  - $q$ = language model
  - performance = $D(p \,\|\, q)$

Readings
- Manning 1, 2
- Abney #
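To see the "extra bits" reading of relative entropy, the sketch below compares the Simplified Polynesian letter distribution (taken as the actual distribution p) with a uniform model q over the six letters. The uniform model is an assumption chosen for illustration, and `relative_entropy` is a hypothetical helper name.

```python
from math import log2


def relative_entropy(p, q):
    """D(p || q) = sum of p(x) * log2(p(x) / q(x)), in bits."""
    total = 0.0
    for x, px in p.items():
        if px == 0:
            continue  # terms with p(x) = 0 contribute nothing
        qx = q.get(x, 0.0)
        if qx == 0:
            return float("inf")  # the model rules out something that happens
        total += px * log2(px / qx)
    return total


polynesian = {"p": 1 / 8, "t": 1 / 4, "k": 1 / 8, "a": 1 / 4, "i": 1 / 8, "u": 1 / 8}
uniform_model = {letter: 1 / 6 for letter in polynesian}

print(relative_entropy(polynesian, uniform_model))
print(relative_entropy(polynesian, polynesian))
```

Coding with the uniform model costs log2(6) bits per letter instead of the 2.5-bit entropy, so the divergence is the difference between the two; a model identical to p wastes nothing.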