On Modeling On Minimum Description Length Modeling. M-closed


On Modeling

- M-closed vs. M-open: do you believe that the data-generating mechanism really is in your model class M?
- Non-M-closed predictive inference: explicitly include prediction (and intervention) in modeling.
- Models are a means (a language) to describe interesting properties of the phenomenon to be studied, but they are not intrinsic to the phenomenon itself.
- "All models are false, but some are useful."

Minimum Description Length Principle

- MDL is a method related to modeling, inductive inference, and machine learning (Rissanen 1978-; Barron, Rissanen and Yu 1998).
- Tasks: model selection, parameter estimation, prediction.
- From arithmetic coding to modeling.

Model selection

Descriptive complexity

- Example: 1000-bit strings.
- Solomonoff-Kolmogorov-Chaitin complexity: the shortest possible encoding with the help of L, i.e. a code based on a universal computer language L.
- Too strong a description language leads to uncomputability.
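Because the shortest description in a universal language is uncomputable, a practical compressor can only serve as a rough, computable stand-in. The sketch below is an illustration under that assumption: it uses Python's `zlib` as the description method, and the two 1000-bit strings (`regular`, `irregular`) are made-up examples, not the ones from the slides.

```python
import random
import zlib

def compressed_bits(bits):
    """Length in bits of a zlib encoding of a '0'/'1' string (packed 8 bits per byte)."""
    packed = int(bits, 2).to_bytes((len(bits) + 7) // 8, "big")
    return 8 * len(zlib.compress(packed, 9))

# Hypothetical example strings, 1000 bits each.
regular = "0001" * 250                      # obvious repeating pattern
rng = random.Random(0)                      # fixed seed, only for repeatability
irregular = "".join(rng.choice("01") for _ in range(1000))

if __name__ == "__main__":
    print("regular  :", compressed_bits(regular), "bits")
    print("irregular:", compressed_bits(irregular), "bits")
```

The patterned string gets a much shorter description than the pseudo-random one. zlib gives only an upper bound on descriptive complexity, and a crude one for inputs this short.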

The idea

- A good model M captures regular features (constraints) of the observed data.
- Any set of regularities we find reduces our uncertainty about the data D, and we can use it to encode the data in a shorter and less redundant way.
- The more we are able to compress a sequence of data, the more regularity we have detected in the data and the more we have learned from the data (to make predictions of future data).

For example: regression

- There is a trade-off between model complexity and fit to the data (figure labels: complex, MDL, simple).

Types of MDL

- Algorithmic, ideal MDL (Li and Vitányi 97)
- MML (Wallace 68, 87)
- Two-part code MDL (Rissanen 78, 83)
- Universal model based MDL (Rissanen 96; Barron, Rissanen, Yu 98; Grünwald 0)

Probability

- $\mathcal{X}$: sample space
- $\mathcal{X}^n$: set of all sequences of $n$ outcomes
- $\mathcal{X}^+$: set of arbitrary-length sequences
- $\mathcal{X}^\infty$: set of infinite sequences
- $P$: probability distribution over sequences of outcomes

Code

- $\mathcal{X}$: (countable) data alphabet
- A (uniquely decodable) code $C$ is a one-to-one map from $\mathcal{X}$ to $\{0,1\}^+$.
- $L_C(x)$ denotes the length in bits needed to describe $x$.

Analogy

- Let $P$ be a probability distribution. Since $\sum_x P(x) = 1$, only very few $x$ can have large probability.
- Let $C$ be a code for $\{0,1\}^n$. Since the fraction of sequences that can be compressed by more than $k$ bits is less than $2^{-k}$, only very few symbols can have small code length.
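The analogy between probabilities and code lengths can be checked numerically. The following is a minimal sketch, not part of the slides: the four-symbol distribution `probs` is a made-up example, code lengths are rounded up to whole bits, and the counting bound includes the empty string among the available short descriptions.

```python
import math

def shannon_code_lengths(probs):
    """Integer code lengths ceil(-log2 p) for every outcome with p > 0."""
    return {x: math.ceil(-math.log2(p)) for x, p in probs.items() if p > 0}

def kraft_sum(lengths):
    """Sum of 2^(-length); it is at most 1 for any uniquely decodable code."""
    return sum(2.0 ** -length for length in lengths.values())

def max_fraction_compressible(n, k):
    """Upper bound on the fraction of n-bit strings describable in fewer than n - k bits.

    There are only 2^(n-k) - 1 binary strings shorter than n - k bits,
    so the fraction stays below 2^(-k). Assumes 0 <= k < n.
    """
    return (2 ** (n - k) - 1) / 2 ** n

# Hypothetical distribution used only for illustration.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = shannon_code_lengths(probs)

if __name__ == "__main__":
    print(lengths, kraft_sum(lengths))
    print(max_fraction_compressible(20, 5), "<", 2.0 ** -5)
```

Likely outcomes receive short code words and unlikely ones long code words. Both the probabilities and the terms $2^{-L}$ are limited by a total budget of 1.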

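Correspondence

- There is a 1-1 correspondence between probability distributions and code length functions such that large code lengths correspond to small probabilities and vice versa:

$$\text{for all } x^n \in \mathcal{X}^n: \quad L(x^n) = -\log P(x^n)$$

Universal codes

- $\mathcal{L}$ is a set of codelength function(s) available to encode data $x^n$.
- Assume that one of the codelength function(s) in $\mathcal{L}$ allows for substantial compression of $x^n$.
- TASK: encode $x^n$ using the minimum number of bits.

Universal code (more)

- For example: $\mathcal{L}$ finite. There exists a code $L_{\mathcal{L}}$ such that for some constant $K$, for all $n$, $x^n$, and all $L \in \mathcal{L}$:

$$L_{\mathcal{L}}(x^n) \le L(x^n) + K, \qquad \text{hence} \qquad L_{\mathcal{L}}(x^n) \le \inf_{L \in \mathcal{L}} L(x^n) + K$$

- $K$ does not depend on $n$, while typically $L(x^n)$ grows linearly in $n$.

Universal Models

- Let $M$ be a probabilistic model, i.e., a family (set) of probability distributions.
- Assume $M$ finite: $M = \{P(\cdot \mid \theta_1), \ldots, P(\cdot \mid \theta_{|M|})\}$.
- There exists a code $L_M$ such that for all $n$, $x^n$, $\theta$:

$$L_M(x^n) \le -\log P(x^n \mid \theta) + K$$

- Specifically, there exists a distribution $P_M$ such that

$$-\log P_M(x^n) \le -\log P(x^n \mid \theta) + K, \qquad \text{i.e.} \qquad P_M(x^n) \ge K' \, P(x^n \mid \theta)$$

- $P_M$ is a universal model (distribution) for $M$.

Bayesian mixture as a universal model

- Let $W$ be a prior over $M$. The Bayesian marginal likelihood is defined as:

$$P_{\text{Bayes}}(x^n \mid M) = \sum_{j=1}^{|M|} P(x^n \mid \theta_j) \, W(\theta_j)$$

- This is a universal model, since for all $n$, $x^n$, $\theta$:

$$-\log P_{\text{Bayes}}(x^n \mid M) = -\log \sum_{j=1}^{|M|} P(x^n \mid \theta_j) \, W(\theta_j) \le -\log P(x^n \mid \theta) - \log W(\theta)$$

- Bayes mixture assigns larger probability (shorter code length) to outcomes...
- What prior leads to short code lengths?

Two-part MDL code as a universal model

- The ML (maximum likelihood) distribution is

$$\hat\theta(x^n) = \arg\inf_{P(\cdot \mid \theta) \in M} \{ -\log P(x^n \mid \theta) \}$$

- Code $x^n$ by first coding $\hat\theta(x^n)$, then coding $x^n$ with the help of $\hat\theta(x^n)$:

$$L(x^n \mid M) = -\log W(\hat\theta(x^n)) - \log P(x^n \mid \hat\theta(x^n))$$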
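The two universal codes for a finite model class can be compared on a toy example. This is a minimal sketch under stated assumptions: the model class is a hypothetical grid of five Bernoulli parameters (`thetas`) with a uniform prior, the data `xs` is made up, and all lengths are in bits. Probabilities are computed directly, so it suits short sequences only.

```python
import math

def neg_log_lik_bits(xs, theta):
    """-log2 P(x^n | theta) for i.i.d. Bernoulli(theta) data, with 0 < theta < 1."""
    ones = sum(xs)
    zeros = len(xs) - ones
    return -(ones * math.log2(theta) + zeros * math.log2(1.0 - theta))

def bayes_code_length_bits(xs, thetas, prior):
    """-log2 of the Bayesian mixture sum_j P(x^n | theta_j) W(theta_j)."""
    mixture = sum(w * 2.0 ** -neg_log_lik_bits(xs, t) for t, w in zip(thetas, prior))
    return -math.log2(mixture)

def two_part_code_length_bits(xs, thetas, prior):
    """First name the maximum-likelihood element of the class, then the data given it."""
    j_hat = min(range(len(thetas)), key=lambda j: neg_log_lik_bits(xs, thetas[j]))
    return -math.log2(prior[j_hat]) + neg_log_lik_bits(xs, thetas[j_hat])

# Hypothetical model class, prior and data, for illustration only.
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
prior = [0.2] * 5
xs = [1, 1, 0, 1, 1, 1, 0, 1]

if __name__ == "__main__":
    print("Bayes mixture:", bayes_code_length_bits(xs, thetas, prior))
    print("two-part code:", two_part_code_length_bits(xs, thetas, prior))
```

In this setup the mixture code is never longer than the two-part code, because the mixture sum contains the maximum-likelihood term as one of its summands. Both stay within $-\log W(\theta)$ bits of every element of the class.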

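Optimal Universal Model

- Look for $P^*$ such that the regret

$$-\log P^*(x^n) - \left[ -\log P(x^n \mid \hat\theta(x^n)) \right]$$

  is small no matter what $x^n$ are; i.e. look for

$$\inf_{P^*} \sup_{x^n \in \mathcal{X}^n} \left\{ -\log P^*(x^n) - \left[ -\log P(x^n \mid \hat\theta(x^n)) \right] \right\}$$

- This is achieved by the Normalized Maximum Likelihood (NML) distribution:

$$P_{\text{NML}}(x^n \mid M) = \frac{P(x^n \mid \hat\theta(x^n))}{\sum_{y^n \in \mathcal{X}^n} P(y^n \mid \hat\theta(y^n))}$$

MDL Model Selection

- Select $M_i$ minimizing $-\log P_{\text{NML}}(x^n \mid M_i)$, i.e.

$$-\log P(x^n \mid \hat\theta_i(x^n)) + \log \sum_{y^n \in \mathcal{X}^n} P(y^n \mid \hat\theta_i(y^n))$$

- The first term is the error (= minus fit) term; the second is the complexity term.

Geometric Interpretation of MDL

- Under regularity conditions

$$-\log P_{\text{NML}}(x^n \mid M_i) = -\log P(x^n \mid \hat\theta_i(x^n)) + \frac{k}{2} \log \frac{n}{2\pi} + \log \int \sqrt{\det I(\theta)} \, d\theta + o(1)$$

- The terms are, in order: the error term, a term in the number of parameters $k$, and the volume of the model viewed as a manifold in the space of all probability distributions.

Count only distinguishable distributions

- The family of probability distributions forms a Riemannian manifold (information geometry; Rao, 1945; Efron, 1975; Amari, 1980).
- The Riemannian volume measure is related to the number of all possible distinguishable probability distributions that are indexed by the model family (Balasubramanian, 1997):

$$dV = \sqrt{\det I(\theta)} \, d\theta$$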
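For a simple model the NML normalizer can be computed exactly and compared with the asymptotic expansion. The sketch below assumes the Bernoulli model (one parameter, so $k = 1$), which is not an example from the slides. It also uses the standard fact that for this model $I(\theta) = 1/(\theta(1-\theta))$ and $\int_0^1 \sqrt{I(\theta)}\,d\theta = \pi$. All function names are illustrative.

```python
import math

def max_log_lik(k, n):
    """ln P(y^n | theta_hat(y^n)) for a binary sequence with k ones, theta_hat = k/n."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(k / n)
    if k < n:
        ll += (n - k) * math.log((n - k) / n)
    return ll

def nml_complexity_bits(n):
    """log2 of sum over y^n of P(y^n | theta_hat(y^n)), grouped by the number of ones."""
    total = 0.0
    for k in range(n + 1):
        log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        total += math.exp(log_binom + max_log_lik(k, n))
    return math.log2(total)

def nml_code_length_bits(xs):
    """-log2 P_NML(x^n): error term plus complexity term."""
    n, k = len(xs), sum(xs)
    return -max_log_lik(k, n) / math.log(2) + nml_complexity_bits(n)

def asymptotic_complexity_bits(n):
    """(k/2) log2(n / 2 pi) + log2(integral of sqrt(det I)), with k = 1 and integral = pi."""
    return 0.5 * math.log2(n / (2 * math.pi)) + math.log2(math.pi)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, nml_complexity_bits(n), asymptotic_complexity_bits(n))
```

The exact and asymptotic complexity terms approach each other as $n$ grows, which is the $o(1)$ statement in the expansion.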

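Bayes vs. MDL

- Under regularity conditions

$$-\log P_{\text{NML}}(x^n \mid M_i) = -\log P(x^n \mid \hat\theta_i(x^n)) + \frac{k}{2} \log \frac{n}{2\pi} + \log \int \sqrt{\det I(\theta)} \, d\theta + o(1)$$

- Under regularity conditions

$$-\log P_{\text{Bayes}}(x^n \mid M_i) = -\log P(x^n \mid \hat\theta_i(x^n)) + \frac{k}{2} \log \frac{n}{2\pi} - \log w(\hat\theta) + \log \sqrt{\det I(\hat\theta)} + o(1)$$

- If we take Jeffreys' prior

$$w(\theta) = \frac{\sqrt{\det I(\theta)}}{\int \sqrt{\det I(\theta)} \, d\theta}$$

  the two expansions agree up to $o(1)$.

Predictive Interpretation

- Interpret $-\log P(x)$ as the loss incurred when predicting using $P$ while the actual outcome was $x$.
- The Bayesian marginal likelihood can be written as a cumulative log-loss prediction error:

$$-\log P_{\text{Bayes}}(x^n) = -\log \prod_{i=1}^{n} \frac{P_{\text{Bayes}}(x^i)}{P_{\text{Bayes}}(x^{i-1})} = \sum_{i=1}^{n} -\log P_{\text{Bayes}}(x_i \mid x_1, \ldots, x_{i-1}) = \sum_{i=1}^{n} \text{Loss}(x_i, P_{\text{Bayes}}(\cdot \mid x^{i-1}))$$

Philosophy

- What do we do when the data-generating mechanism is not in the family of models $M$ we consider?
- (What is a prior?) MDL priors are technical in nature.
- Jeffreys' prior is the uniform prior on the space of distributions with the natural metric that measures distances between distributions by how distinguishable they are.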
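The predictive interpretation can be verified on a small case. The sketch below is an assumption-laden illustration rather than anything from the slides: it takes the Bernoulli model with a Beta(a, a) prior, where a = 1/2 is Jeffreys' prior for this model. The predictive probability of a one after seeing `ones` ones and `zeros` zeros is then (ones + a) / (ones + zeros + 2a). The sequence `xs` is made up.

```python
import math

def cumulative_log_loss_bits(xs, a=0.5):
    """Sum over i of -log2 P_Bayes(x_i | x_1..x_{i-1}) under a Beta(a, a) prior."""
    ones = zeros = 0
    total = 0.0
    for x in xs:
        p_one = (ones + a) / (ones + zeros + 2 * a)   # predictive probability of a 1
        total += -math.log2(p_one if x == 1 else 1.0 - p_one)
        ones += x
        zeros += 1 - x
    return total

def log_beta(p, q):
    """Natural log of the Beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

def marginal_code_length_bits(xs, a=0.5):
    """-log2 P_Bayes(x^n) in closed form: B(k + a, n - k + a) / B(a, a)."""
    n, k = len(xs), sum(xs)
    return -(log_beta(k + a, n - k + a) - log_beta(a, a)) / math.log(2)

# Hypothetical binary sequence used only for illustration.
xs = [1, 0, 1, 1, 0, 1, 1, 1]

if __name__ == "__main__":
    print(cumulative_log_loss_bits(xs), marginal_code_length_bits(xs))
```

The accumulated one-step-ahead log-loss and the code length of the whole sequence under the marginal likelihood agree up to floating-point error, which is the telescoping identity stated above.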

Stanford Statistics 311/Electrical Engineering 377

Stanford Statistics 311/Electrical Engineering 377 I. Uiversal predictio ad codig a. Gae: sequecex ofdata, adwattopredict(orcode)aswellasifwekew distributio of data b. Two versios: probabilistic ad adversarial. I either case, let p ad q be desities or

More information

Lecture 10: Universal coding and prediction

Lecture 10: Universal coding and prediction 0-704: Iformatio Processig ad Learig Sprig 0 Lecture 0: Uiversal codig ad predictio Lecturer: Aarti Sigh Scribes: Georg M. Goerg Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved

More information

Statistics and Data Analysis in MATLAB Kendrick Kay, February 28, Lecture 4: Model fitting

Statistics and Data Analysis in MATLAB Kendrick Kay, February 28, Lecture 4: Model fitting Statistics ad Data Aalysis i MATLAB Kedrick Kay, kedrick.kay@wustl.edu February 28, 2014 Lecture 4: Model fittig 1. The basics - Suppose that we have a set of data ad suppose that we have selected the

More information

ECE 901 Lecture 4: Estimation of Lipschitz smooth functions

ECE 901 Lecture 4: Estimation of Lipschitz smooth functions ECE 9 Lecture 4: Estiatio of Lipschitz sooth fuctios R. Nowak 5/7/29 Cosider the followig settig. Let Y f (X) + W, where X is a rado variable (r.v.) o X [, ], W is a r.v. o Y R, idepedet of X ad satisfyig

More information

Chapter 2. Asymptotic Notation

Chapter 2. Asymptotic Notation Asyptotic Notatio 3 Chapter Asyptotic Notatio Goal : To siplify the aalysis of ruig tie by gettig rid of details which ay be affected by specific ipleetatio ad hardware. [1] The Big Oh (O-Notatio) : It

More information

Information Theory and Statistics Lecture 4: Lempel-Ziv code

Information Theory and Statistics Lecture 4: Lempel-Ziv code Iformatio Theory ad Statistics Lecture 4: Lempel-Ziv code Łukasz Dębowski ldebowsk@ipipa.waw.pl Ph. D. Programme 203/204 Etropy rate is the limitig compressio rate Theorem For a statioary process (X i)

More information

Statistics for Applications Fall Problem Set 7

Statistics for Applications Fall Problem Set 7 18.650. Statistics for Applicatios Fall 016. Proble Set 7 Due Friday, Oct. 8 at 1 oo Proble 1 QQ-plots Recall that the Laplace distributio with paraeter λ > 0 is the cotiuous probaλ bility easure with

More information

A string of not-so-obvious statements about correlation in the data. (This refers to the mechanical calculation of correlation in the data.

A string of not-so-obvious statements about correlation in the data. (This refers to the mechanical calculation of correlation in the data. STAT-UB.003 NOTES for Wedesday 0.MAY.0 We will use the file JulieApartet.tw. We ll give the regressio of Price o SqFt, show residual versus fitted plot, save residuals ad fitted. Give plot of (Resid, Price,

More information

arxiv: v1 [math.st] 12 Dec 2018

arxiv: v1 [math.st] 12 Dec 2018 DIVERGENCE MEASURES ESTIMATION AND ITS ASYMPTOTIC NORMALITY THEORY : DISCRETE CASE arxiv:181.04795v1 [ath.st] 1 Dec 018 Abstract. 1) BA AMADOU DIADIÉ AND 1,,4) LO GANE SAMB 1. Itroductio 1.1. Motivatios.

More information

18.S34 (FALL, 2007) GREATEST INTEGER PROBLEMS. n + n + 1 = 4n + 2.

18.S34 (FALL, 2007) GREATEST INTEGER PROBLEMS. n + n + 1 = 4n + 2. 18.S34 (FALL, 007) GREATEST INTEGER PROBLEMS Note: We use the otatio x for the greatest iteger x, eve if the origial source used the older otatio [x]. 1. (48P) If is a positive iteger, prove that + + 1

More information

Some Examples on Gibbs Sampling and Metropolis-Hastings methods

Some Examples on Gibbs Sampling and Metropolis-Hastings methods Soe Exaples o Gibbs Saplig ad Metropolis-Hastigs ethods S420/620 Itroductio to Statistical Theory, Fall 2012 Gibbs Sapler Saple a ultidiesioal probability distributio fro coditioal desities. Suppose d

More information

DISTANCE BETWEEN UNCERTAIN RANDOM VARIABLES

DISTANCE BETWEEN UNCERTAIN RANDOM VARIABLES MATHEMATICAL MODELLING OF ENGINEERING PROBLEMS Vol, No, 4, pp5- http://doiorg/88/ep4 DISTANCE BETWEEN UNCERTAIN RANDOM VARIABLES Yogchao Hou* ad Weicai Peg Departet of Matheatical Scieces, Chaohu Uiversity,

More information

(s)h(s) = K( s + 8 ) = 5 and one finite zero is located at z 1

(s)h(s) = K( s + 8 ) = 5 and one finite zero is located at z 1 ROOT LOCUS TECHNIQUE 93 should be desiged differetly to eet differet specificatios depedig o its area of applicatio. We have observed i Sectio 6.4 of Chapter 6, how the variatio of a sigle paraeter like

More information

A PROBABILITY PROBLEM

A PROBABILITY PROBLEM A PROBABILITY PROBLEM A big superarket chai has the followig policy: For every Euros you sped per buy, you ear oe poit (suppose, e.g., that = 3; i this case, if you sped 8.45 Euros, you get two poits,

More information

Mixture models (cont d)

Mixture models (cont d) 6.867 Machie learig, lecture 5 (Jaakkola) Lecture topics: Differet types of ixture odels (cot d) Estiatig ixtures: the EM algorith Mixture odels (cot d) Basic ixture odel Mixture odels try to capture ad

More information

day month year documentname/initials 1

day month year documentname/initials 1 ECE47-57 Patter Recogitio Lecture 0 Noaraetric Desity Estiatio -earest-eighbor (NN) Hairog Qi, Gozalez Faily Professor Electrical Egieerig ad Couter Sciece Uiversity of Teessee, Koxville htt://www.eecs.ut.edu/faculty/qi

More information

Entropies & Information Theory

Entropies & Information Theory Etropies & Iformatio Theory LECTURE I Nilajaa Datta Uiversity of Cambridge,U.K. For more details: see lecture otes (Lecture 1- Lecture 5) o http://www.qi.damtp.cam.ac.uk/ode/223 Quatum Iformatio Theory

More information

AVERAGE MARKS SCALING

AVERAGE MARKS SCALING TERTIARY INSTITUTIONS SERVICE CENTRE Level 1, 100 Royal Street East Perth, Wester Australia 6004 Telephoe (08) 9318 8000 Facsiile (08) 95 7050 http://wwwtisceduau/ 1 Itroductio AVERAGE MARKS SCALING I

More information

Math 475, Problem Set #12: Answers

Math 475, Problem Set #12: Answers Math 475, Problem Set #12: Aswers A. Chapter 8, problem 12, parts (b) ad (d). (b) S # (, 2) = 2 2, sice, from amog the 2 ways of puttig elemets ito 2 distiguishable boxes, exactly 2 of them result i oe

More information

The Maximum-Likelihood Decoding Performance of Error-Correcting Codes

The Maximum-Likelihood Decoding Performance of Error-Correcting Codes The Maximum-Lielihood Decodig Performace of Error-Correctig Codes Hery D. Pfister ECE Departmet Texas A&M Uiversity August 27th, 2007 (rev. 0) November 2st, 203 (rev. ) Performace of Codes. Notatio X,

More information

1.2 AXIOMATIC APPROACH TO PROBABILITY AND PROPERTIES OF PROBABILITY MEASURE 1.2 AXIOMATIC APPROACH TO PROBABILITY AND

1.2 AXIOMATIC APPROACH TO PROBABILITY AND PROPERTIES OF PROBABILITY MEASURE 1.2 AXIOMATIC APPROACH TO PROBABILITY AND NTEL- robability ad Distributios MODULE 1 ROBABILITY LECTURE 2 Topics 1.2 AXIOMATIC AROACH TO ROBABILITY AND ROERTIES OF ROBABILITY MEASURE 1.2.1 Iclusio-Exclusio Forula I the followig sectio we will discuss

More information

Al Lehnen Madison Area Technical College 10/5/2014

Al Lehnen Madison Area Technical College 10/5/2014 The Correlatio of Two Rado Variables Page Preliiary: The Cauchy-Schwarz-Buyakovsky Iequality For ay two sequeces of real ubers { a } ad = { b } =, the followig iequality is always true. Furtherore, equality

More information

Jacobi symbols. p 1. Note: The Jacobi symbol does not necessarily distinguish between quadratic residues and nonresidues. That is, we could have ( a

Jacobi symbols. p 1. Note: The Jacobi symbol does not necessarily distinguish between quadratic residues and nonresidues. That is, we could have ( a Jacobi sybols efiitio Let be a odd positive iteger If 1, the Jacobi sybol : Z C is the costat fuctio 1 1 If > 1, it has a decopositio ( as ) a product of (ot ecessarily distict) pries p 1 p r The Jacobi

More information

Bertrand s postulate Chapter 2

Bertrand s postulate Chapter 2 Bertrad s postulate Chapter We have see that the sequece of prie ubers, 3, 5, 7,... is ifiite. To see that the size of its gaps is ot bouded, let N := 3 5 p deote the product of all prie ubers that are

More information

Chapter 0. Review of set theory. 0.1 Sets

Chapter 0. Review of set theory. 0.1 Sets Chapter 0 Review of set theory Set theory plays a cetral role i the theory of probability. Thus, we will ope this course with a quick review of those otios of set theory which will be used repeatedly.

More information

CS 70 Second Midterm 7 April NAME (1 pt): SID (1 pt): TA (1 pt): Name of Neighbor to your left (1 pt): Name of Neighbor to your right (1 pt):

CS 70 Second Midterm 7 April NAME (1 pt): SID (1 pt): TA (1 pt): Name of Neighbor to your left (1 pt): Name of Neighbor to your right (1 pt): CS 70 Secod Midter 7 April 2011 NAME (1 pt): SID (1 pt): TA (1 pt): Nae of Neighbor to your left (1 pt): Nae of Neighbor to your right (1 pt): Istructios: This is a closed book, closed calculator, closed

More information

Optimal Estimator for a Sample Set with Response Error. Ed Stanek

Optimal Estimator for a Sample Set with Response Error. Ed Stanek Optial Estiator for a Saple Set wit Respose Error Ed Staek Itroductio We develop a optial estiator siilar to te FP estiator wit respose error tat was cosidered i c08ed63doc Te first 6 pages of tis docuet

More information

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero?

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero? 2 Lebesgue Measure I Chapter 1 we defied the cocept of a set of measure zero, ad we have observed that every coutable set is of measure zero. Here are some atural questios: If a subset E of R cotais a

More information

( ) GENERATING FUNCTIONS

( ) GENERATING FUNCTIONS GENERATING FUNCTIONS Solve a ifiite umber of related problems i oe swoop. *Code the problems, maipulate the code, the decode the aswer! Really a algebraic cocept but ca be eteded to aalytic basis for iterestig

More information

Tight Bounds for Universal Compression of Large Alphabets

Tight Bounds for Universal Compression of Large Alphabets Tight Bouds for Uiversal Copressio of Large Alphabets Jayadev Acharya ECE UCSD acharya@ucsdedu Hirakedu Das Yahoo! hdas@yahoo-icco Ashka Jafarpour ECE UCSD ashka@ucsdedu Alo Orlitsky ECE & CSE UCSD alo@ucsdedu

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Common Fixed Point Theorem in Fuzzy Metric Spaces using weakly compatible maps

Common Fixed Point Theorem in Fuzzy Metric Spaces using weakly compatible maps IJ Iforatio Egieerig ad Electroic Busiess 2014 2 64-69 Published Olie April 2014 i MECS (http://wwwecs-pressorg/) DOI: 105815/ijieeb20140208 Coo Fixed Poit Theore i Fuzzy Metric Spaces usig weakly copatible

More information

CHAPTER 6 RESISTANCE FACTOR FOR THE DESIGN OF COMPOSITE SLABS

CHAPTER 6 RESISTANCE FACTOR FOR THE DESIGN OF COMPOSITE SLABS CHAPTER 6 RESISTANCE FACTOR FOR THE DESIGN OF COMPOSITE SLABS 6.1. Geeral Probability-based desig criteria i the for of load ad resistace factor desig (LRFD) are ow applied for ost costructio aterials.

More information

AN EFFICIENT ESTIMATION METHOD FOR THE PARETO DISTRIBUTION

AN EFFICIENT ESTIMATION METHOD FOR THE PARETO DISTRIBUTION Joural of Statistics: Advaces i Theory ad Applicatios Volue 3, Nuber, 00, Pages 6-78 AN EFFICIENT ESTIMATION METHOD FOR THE PARETO DISTRIBUTION Departet of Matheatics Brock Uiversity St. Catharies, Otario

More information

CSCI-6971 Lecture Notes: Stochastic processes

CSCI-6971 Lecture Notes: Stochastic processes CSCI-6971 Lecture Notes: Stochastic processes Kristopher R. Beevers Departet of Coputer Sciece Resselaer Polytechic Istitute beevek@cs.rpi.edu February 2, 2006 1 Overview Defiitio 1.1. A stochastic process

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

REVIEW OF CALCULUS Herman J. Bierens Pennsylvania State University (January 28, 2004) x 2., or x 1. x j. ' ' n i'1 x i well.,y 2

REVIEW OF CALCULUS Herman J. Bierens Pennsylvania State University (January 28, 2004) x 2., or x 1. x j. ' ' n i'1 x i well.,y 2 REVIEW OF CALCULUS Hera J. Bieres Pesylvaia State Uiversity (Jauary 28, 2004) 1. Suatio Let x 1,x 2,...,x e a sequece of uers. The su of these uers is usually deoted y x 1 % x 2 %...% x ' j x j, or x 1

More information

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function. MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied

More information

A new sequence convergent to Euler Mascheroni constant

A new sequence convergent to Euler Mascheroni constant You ad Che Joural of Iequalities ad Applicatios 08) 08:7 https://doi.org/0.86/s3660-08-670-6 R E S E A R C H Ope Access A ew sequece coverget to Euler Mascheroi costat Xu You * ad Di-Rog Che * Correspodece:

More information

6.867 Machine learning, lecture 7 (Jaakkola) 1

6.867 Machine learning, lecture 7 (Jaakkola) 1 6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit

More information

Lecture 11: Channel Coding Theorem: Converse Part

Lecture 11: Channel Coding Theorem: Converse Part EE376A/STATS376A Iformatio Theory Lecture - 02/3/208 Lecture : Chael Codig Theorem: Coverse Part Lecturer: Tsachy Weissma Scribe: Erdem Bıyık I this lecture, we will cotiue our discussio o chael codig

More information

Lecture Outline. 2 Separating Hyperplanes. 3 Banach Mazur Distance An Algorithmist s Toolkit October 22, 2009

Lecture Outline. 2 Separating Hyperplanes. 3 Banach Mazur Distance An Algorithmist s Toolkit October 22, 2009 18.409 A Algorithist s Toolkit October, 009 Lecture 1 Lecturer: Joatha Keler Scribes: Alex Levi (009) 1 Outlie Today we ll go over soe of the details fro last class ad ake precise ay details that were

More information

Intro to Learning Theory

Intro to Learning Theory Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified

More information

L = n i, i=1. dp p n 1

L = n i, i=1. dp p n 1 Exchageable sequeces ad probabilities for probabilities 1996; modified 98 5 21 to add material o mutual iformatio; modified 98 7 21 to add Heath-Sudderth proof of de Fietti represetatio; modified 99 11

More information

Verification of continuous predictands

Verification of continuous predictands barbara.casati@ec.gc.ca Verificatio of cotiuous predictads Barbara Casati 9 Ja 007 Exploratory ethods: joit distributio Scatter-plot: plot of observatio versus forecast values Perfect forecast obs, poits

More information

Surveying the Variance Reduction Methods

Surveying the Variance Reduction Methods Available olie at www.scizer.co Austria Joural of Matheatics ad Statistics, Vol 1, Issue 1, (2017): 10-15 ISSN 0000-0000 Surveyig the Variace Reductio Methods Arash Mirtorabi *1, Gholahossei Gholai 2 1.

More information

It is often useful to approximate complicated functions using simpler ones. We consider the task of approximating a function by a polynomial.

It is often useful to approximate complicated functions using simpler ones. We consider the task of approximating a function by a polynomial. Taylor Polyomials ad Taylor Series It is ofte useful to approximate complicated fuctios usig simpler oes We cosider the task of approximatig a fuctio by a polyomial If f is at least -times differetiable

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

Bayes Decision Rule and Naïve Bayes Classifier

Bayes Decision Rule and Naïve Bayes Classifier Bayes Decision Rule and Naïve Bayes Classifier Le Song Machine Learning I CSE 6740, Fall 2013 Gaussian Mixture odel A density odel p(x) ay be ulti-odal: odel it as a ixture of uni-odal distributions (e.g.

More information

Stream Ciphers (contd.) Debdeep Mukhopadhyay

Stream Ciphers (contd.) Debdeep Mukhopadhyay Strea Ciphers (cotd.) Debdeep Mukhopadhyay Assistat Professor Departet of Coputer Sciece ad Egieerig Idia Istitute of Techology Kharagpur IDIA -7232 Objectives iear Coplexity Berlekap Massey Algorith ow

More information

Discrete Mathematics: Lectures 8 and 9 Principle of Inclusion and Exclusion Instructor: Arijit Bishnu Date: August 11 and 13, 2009

Discrete Mathematics: Lectures 8 and 9 Principle of Inclusion and Exclusion Instructor: Arijit Bishnu Date: August 11 and 13, 2009 Discrete Matheatics: Lectures 8 ad 9 Priciple of Iclusio ad Exclusio Istructor: Arijit Bishu Date: August ad 3, 009 As you ca observe by ow, we ca cout i various ways. Oe such ethod is the age-old priciple

More information

Minimum Description Length Principle for Maximum Entropy Model Selection

Minimum Description Length Principle for Maximum Entropy Model Selection Miimum Descriptio Legth Priciple for Maximum Etropy Model Selectio Gaurav Padey ad Ambedkar Dukkipati Departmet of Computer Sciece ad Automatio Idia Istitute of Sciece, Bagalore 560012 Email: {gaurav.padey,ad}.csa.iisc.eret.i

More information

Computability and computational complexity

Computability and computational complexity Computability ad computatioal complexity Lecture 4: Uiversal Turig machies. Udecidability Io Petre Computer Sciece, Åbo Akademi Uiversity Fall 2015 http://users.abo.fi/ipetre/computability/ 21. toukokuu

More information

Lecture 10: Bounded Linear Operators and Orthogonality in Hilbert Spaces

Lecture 10: Bounded Linear Operators and Orthogonality in Hilbert Spaces Lecture : Bouded Liear Operators ad Orthogoality i Hilbert Spaces 34 Bouded Liear Operator Let ( X, ), ( Y, ) i i be ored liear vector spaces ad { } X Y The, T is said to be bouded if a real uber c such

More information

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 17 Lecturer: David Wagner April 3, Notes 17 for CS 170

UC Berkeley CS 170: Efficient Algorithms and Intractable Problems Handout 17 Lecturer: David Wagner April 3, Notes 17 for CS 170 UC Berkeley CS 170: Efficiet Algorithms ad Itractable Problems Hadout 17 Lecturer: David Wager April 3, 2003 Notes 17 for CS 170 1 The Lempel-Ziv algorithm There is a sese i which the Huffma codig was

More information

Summary. Recap ... Last Lecture. Summary. Theorem

Summary. Recap ... Last Lecture. Summary. Theorem Last Lecture Biostatistics 602 - Statistical Iferece Lecture 23 Hyu Mi Kag April 11th, 2013 What is p-value? What is the advatage of p-value compared to hypothesis testig procedure with size α? How ca

More information

Machine Learning Theory (CS 6783)

Machine Learning Theory (CS 6783) Machie Learig Theory (CS 6783) Lecture 2 : Learig Frameworks, Examples Settig up learig problems. X : istace space or iput space Examples: Computer Visio: Raw M N image vectorized X = 0, 255 M N, SIFT

More information

A Note on the Applied Use of MDL Approximations

A Note on the Applied Use of MDL Approximations A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention

More information

10-704: Information Processing and Learning Spring Lecture 10: Feb 12

10-704: Information Processing and Learning Spring Lecture 10: Feb 12 10-704: Iformatio Processig ad Learig Sprig 2015 Lecture 10: Feb 12 Lecturer: Akshay Krishamurthy Scribe: Dea Asta, Kirthevasa Kadasamy Disclaimer: These otes have ot bee subjected to the usual scrutiy

More information

The Binomial Multi-Section Transformer

The Binomial Multi-Section Transformer 4/15/2010 The Bioial Multisectio Matchig Trasforer preset.doc 1/24 The Bioial Multi-Sectio Trasforer Recall that a ulti-sectio atchig etwork ca be described usig the theory of sall reflectios as: where:

More information

We have also learned that, thanks to the Central Limit Theorem and the Law of Large Numbers,

We have also learned that, thanks to the Central Limit Theorem and the Law of Large Numbers, Cofidece Itervals III What we kow so far: We have see how to set cofidece itervals for the ea, or expected value, of a oral probability distributio, both whe the variace is kow (usig the stadard oral,

More information

Markov Decision Processes

Markov Decision Processes Markov Decisio Processes Defiitios; Statioary policies; Value improvemet algorithm, Policy improvemet algorithm, ad liear programmig for discouted cost ad average cost criteria. Markov Decisio Processes

More information

Pb ( a ) = measure of the plausibility of proposition b conditional on the information stated in proposition a. & then using P2

Pb ( a ) = measure of the plausibility of proposition b conditional on the information stated in proposition a. & then using P2 Axioms for Probability Logic Pb ( a ) = measure of the plausibility of propositio b coditioal o the iformatio stated i propositio a For propositios a, b ad c: P: Pb ( a) 0 P2: Pb ( a& b ) = P3: Pb ( a)

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

42 Dependence and Bases

42 Dependence and Bases 42 Depedece ad Bases The spa s(a) of a subset A i vector space V is a subspace of V. This spa ay be the whole vector space V (we say the A spas V). I this paragraph we study subsets A of V which spa V

More information

Supplementary Material

Supplementary Material Suppleetary Material Wezhuo Ya a0096049@us.edu.s Departet of Mechaical Eieeri, Natioal Uiversity of Siapore, Siapore 117576 Hua Xu pexuh@us.edu.s Departet of Mechaical Eieeri, Natioal Uiversity of Siapore,

More information

Entropy Rates and Asymptotic Equipartition

Entropy Rates and Asymptotic Equipartition Chapter 29 Etropy Rates ad Asymptotic Equipartitio Sectio 29. itroduces the etropy rate the asymptotic etropy per time-step of a stochastic process ad shows that it is well-defied; ad similarly for iformatio,

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

19.1 The dictionary problem

19.1 The dictionary problem CS125 Lecture 19 Fall 2016 19.1 The dictioary proble Cosider the followig data structural proble, usually called the dictioary proble. We have a set of ites. Each ite is a (key, value pair. Keys are i

More information

Different kinds of Mathematical Induction

Different kinds of Mathematical Induction Differet ids of Mathematical Iductio () Mathematical Iductio Give A N, [ A (a A a A)] A N () (First) Priciple of Mathematical Iductio Let P() be a propositio (ope setece), if we put A { : N p() is true}

More information

An Algorithmist s Toolkit October 20, Lecture 11

An Algorithmist s Toolkit October 20, Lecture 11 18.409 A Algorithist s Toolkit October 20, 2009 Lecture 11 Lecturer: Joatha Keler Scribe: Chaithaya Badi 1 Outlie Today we ll itroduce ad discuss Polar of a covex body. Correspodece betwee or fuctios ad

More information

Minimum Description Length vs. Maximum Likelihood in Lossy Data Compression

Minimum Description Length vs. Maximum Likelihood in Lossy Data Compression Miimum Descriptio Legth vs. Maximum Likelihood i Lossy Data Compressio M. Madima M. Harriso I. Kotoyiais May 18, 2005 Abstract We give a developmet of the theory of lossy data compressio from the poit

More information

Teacher s Marking. Guide/Answers

Teacher s Marking. Guide/Answers WOLLONGONG COLLEGE AUSRALIA eacher s Markig A College of the Uiversity of Wollogog Guide/Aswers Diploa i Iforatio echology Fial Exaiatio Autu 008 WUC Discrete Matheatics his exa represets 60% of the total

More information

Uncertainty Principle of Mathematics

Uncertainty Principle of Mathematics Septeber 27 Ucertaity Priciple of Matheatics Shachter Mourici Israel, Holo ourici@walla.co.il Preface This short paper prove that atheatically, Reality is ot real. This short paper is ot about Heiseberg's

More information

THE GREATEST ORDER OF THE DIVISOR FUNCTION WITH INCREASING DIMENSION

THE GREATEST ORDER OF THE DIVISOR FUNCTION WITH INCREASING DIMENSION MATHEMATICA MONTISNIGRI Vol XXVIII (013) 17-5 THE GREATEST ORDER OF THE DIVISOR FUNCTION WITH INCREASING DIMENSION GLEB V. FEDOROV * * Mechaics ad Matheatics Faculty Moscow State Uiversity Moscow, Russia

More information
