Naïve Bayes MIT Course Notes Cynthia Rudin

Size: px
Start display at page:

Download "Naïve Bayes MIT Course Notes Cynthia Rudin"

Transcription

1 Thaks to Şeyda Ertek Credt: Ng, Mtchell Naïve Bayes MIT Course Notes Cytha Rud The Naïve Bayes algorthm comes from a geeratve model. There s a mportat dstcto betwee geeratve ad dscrmatve models. I all cases, we wat to predct the label y, gve x, that s, we wat P (Y = y X = x). Throughout the paper, we ll remember that the probablty dstrbuto for measure P s over a ukow dstrbuto over X Y. Naïve Bayes Geeratve Model Estmate P (X = x Y = y) ad P (Y = y) ad use Bayes rule to get P (Y = y X = x) Dscrmatve Model Drectly estmate P (Y = y X = x) Most of the top 0 classfcato algorthms are dscrmatve (K-NN, CART, C4.5, SVM, AdaBoost). For Naïve Bayes, we make a assumpto that f we kow the class label y, the we kow the mechasm (the radom process) of how x s geerated. Naïve Bayes s great for very hgh dmesoal problems because t makes a very strog assumpto. Very hgh dmesoal problems suffer from the curse of dmesoalty t s dffcult to uderstad what s gog o a hgh dmesoal space wthout tos of data. Example: Costructg a spam flter. Each example s a emal, each dmeso j of vector x represets the presece of a word.

2 a 0 0 aardvark aardwolf x =.... buy. 0 zyxt Ths x represets a emal cotag the words a ad buy, but ot aardvark or zyxt. The sze of the vocabulary could be 50,000 words, so we are a 50,000 dmesoal space. Naïve Bayes makes the assumpto that the x (j) s are codtoally depedet gve y. Say y = meas spam emal, word 2,087 s buy, ad word 39,83 s prce. Naïve Bayes assumes that f y = (t s spam), the kowg x (2,087) = (emal cotas buy ) wo t effect your belef about x (39,38) (emal cotas prce ). Note: Ths does ot mea x (2,087) ad x (39,83) are depedet, that s, P (X (2,087) = x (2,087) ) = P (X (2,087) = x (2,087) X (39,83) = x (39,83) ). It oly meas they are codtoally depedet gve y. Usg the defto of codtoal probablty recursvely, P (X () = x (),..., X (50,000) = x (50,000) Y = y) = P (X () = x () Y = y)p (X (2) = x (2) Y = y, X () = x () ) P (X (3) = x (3) Y = y, X () = x (), X (2) = x (2) )... P (X (50,000) = x (50,000) Y = y, X () = x (),..., X (49,999) = x (49,999) ). The depedece assumpto gves: P (X () = x (),..., X () = x () Y = y) = P (X () = x () Y = y)p (X (2) = x (2) Y = y)... P (X () = x () Y = y) = P (X (j) = x (j) Y = y). () 2

3 Bayes rule says () () P (Y = y)p (X () = x (),..., X () = x () Y = y) P (Y = y X = x,..., X () = x () ) = P (X () = x (),..., X () = x () ) so pluggg (), we have P (Y = y) P (X (j) = x (j) Y = y) P (Y = y X () = x (),..., X () = x () ) = P (X () = x (),..., X () = x () ) For a ew test stace, called x test, we wat to choose the most probable value of y, that s () () test () () P (Y = ) () j P (X = x test,..., X () = x y NB arg max P (X () = x test,..., X () = x test ) = arg max P (Y = ) P (X (j) = x (j) Y = ). Y = ) (j) So ow, we just eed P (Y = ) for each possble, ad P (X (j) = x test Y = ) for each j ad. Of course we ca t compute those. Let s use the emprcal probablty estmates: [y =] P (Y = ) = = fracto of data where the label s m [ (j) (j) (j) (j) x =x test,y =] P (X = x Y = ) = = Cof(Y = X (j) (j) test = x test ). [y =] That s the smplest verso of Naïve Bayes: y NB (j) arg max P (Y = ) P (X (j) = x test Y = ). There could potetally be a problem that most of the codtoal probabltes are 0 because the dmesoalty of the data s very hgh compared to the amout (j) of data. Ths causes a problem because f eve oe P (X (j) = x test Y = ) s zero the the whole rght sde s zero. I other words, f o trag examples from class spam have the word tomato, we d ever classfy a test example cotag the word tomato as spam! 3

4 To avod ths, we (sort of) set the probabltes to a small postve value whe there are o data. I partcular, we use a Bayesa shrkage estmate of P (X (j) (j) = x test Y = ) where we add some hallucated examples. There are K hallucated examples spread evely over the possble values of X (j). K s the umber of dstct values of X (j). The probabltes are pulled toward /K. So, ow we replace: (j) (j) + ) [ x = x,y = P (X (j) (j ] = x test Y = ) = test [y =] + K P (Y = ) = [y =] + m + K Ths s called Laplace smoothg. The smoothg for P (Y = ) s probably uecessary ad has lttle to o effect. Naïve Bayes s ot ecessarly the best algorthm, but s a good frst thg to try, ad performs surprsgly well gve ts smplcty! There are extesos to cotuous data ad other varatos too. PPT Sldes 4

5 MIT OpeCourseWare Predcto: Mache Learg ad Statstcs Sprg 202 For formato about ctg these materals or our Terms of Use, vst:

Bayes (Naïve or not) Classifiers: Generative Approach

Bayes (Naïve or not) Classifiers: Generative Approach Logstc regresso Bayes (Naïve or ot) Classfers: Geeratve Approach What do we mea by Geeratve approach: Lear p(y), p(x y) ad the apply bayes rule to compute p(y x) for makg predctos Ths s essetally makg

More information

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements

Chapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements Aoucemets No-Parametrc Desty Estmato Techques HW assged Most of ths lecture was o the blacboard. These sldes cover the same materal as preseted DHS Bometrcs CSE 90-a Lecture 7 CSE90a Fall 06 CSE90a Fall

More information

Generative classification models

Generative classification models CS 75 Mache Learg Lecture Geeratve classfcato models Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square Data: D { d, d,.., d} d, Classfcato represets a dscrete class value Goal: lear f : X Y Bar classfcato

More information

Dimensionality Reduction and Learning

Dimensionality Reduction and Learning CMSC 35900 (Sprg 009) Large Scale Learg Lecture: 3 Dmesoalty Reducto ad Learg Istructors: Sham Kakade ad Greg Shakharovch L Supervsed Methods ad Dmesoalty Reducto The theme of these two lectures s that

More information

ENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections

ENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections ENGI 441 Jot Probablty Dstrbutos Page 7-01 Jot Probablty Dstrbutos [Navd sectos.5 ad.6; Devore sectos 5.1-5.] The jot probablty mass fucto of two dscrete radom quattes, s, P ad p x y x y The margal probablty

More information

Introduction to local (nonparametric) density estimation. methods

Introduction to local (nonparametric) density estimation. methods Itroducto to local (oparametrc) desty estmato methods A slecture by Yu Lu for ECE 66 Sprg 014 1. Itroducto Ths slecture troduces two local desty estmato methods whch are Parze desty estmato ad k-earest

More information

9 U-STATISTICS. Eh =(m!) 1 Eh(X (1),..., X (m ) ) i.i.d

9 U-STATISTICS. Eh =(m!) 1 Eh(X (1),..., X (m ) ) i.i.d 9 U-STATISTICS Suppose,,..., are P P..d. wth CDF F. Our goal s to estmate the expectato t (P)=Eh(,,..., m ). Note that ths expectato requres more tha oe cotrast to E, E, or Eh( ). Oe example s E or P((,

More information

Lecture 9: Tolerant Testing

Lecture 9: Tolerant Testing Lecture 9: Tolerat Testg Dael Kae Scrbe: Sakeerth Rao Aprl 4, 07 Abstract I ths lecture we prove a quas lear lower boud o the umber of samples eeded to do tolerat testg for L dstace. Tolerat Testg We have

More information

CHAPTER VI Statistical Analysis of Experimental Data

CHAPTER VI Statistical Analysis of Experimental Data Chapter VI Statstcal Aalyss of Expermetal Data CHAPTER VI Statstcal Aalyss of Expermetal Data Measuremets do ot lead to a uque value. Ths s a result of the multtude of errors (maly radom errors) that ca

More information

Dimensionality reduction Feature selection

Dimensionality reduction Feature selection CS 750 Mache Learg Lecture 3 Dmesoalty reducto Feature selecto Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 750 Mache Learg Dmesoalty reducto. Motvato. Classfcato problem eample: We have a put data

More information

Kernel-based Methods and Support Vector Machines

Kernel-based Methods and Support Vector Machines Kerel-based Methods ad Support Vector Maches Larr Holder CptS 570 Mache Learg School of Electrcal Egeerg ad Computer Scece Washgto State Uverst Refereces Muller et al. A Itroducto to Kerel-Based Learg

More information

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then

X X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then Secto 5 Vectors of Radom Varables Whe workg wth several radom varables,,..., to arrage them vector form x, t s ofte coveet We ca the make use of matrx algebra to help us orgaze ad mapulate large umbers

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

Lecture 8: Linear Regression

Lecture 8: Linear Regression Lecture 8: Lear egresso May 4, GENOME 56, Sprg Goals Develop basc cocepts of lear regresso from a probablstc framework Estmatg parameters ad hypothess testg wth lear models Lear regresso Su I Lee, CSE

More information

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x CS 75 Mache Learg Lecture 8 Lear regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Lear regresso Fucto f : X Y s a lear combato of put compoets f + + + K d d K k - parameters

More information

Class 13,14 June 17, 19, 2015

Class 13,14 June 17, 19, 2015 Class 3,4 Jue 7, 9, 05 Pla for Class3,4:. Samplg dstrbuto of sample mea. The Cetral Lmt Theorem (CLT). Cofdece terval for ukow mea.. Samplg Dstrbuto for Sample mea. Methods used are based o CLT ( Cetral

More information

Bayesian Classification. CS690L Data Mining: Classification(2) Bayesian Theorem: Basics. Bayesian Theorem. Training dataset. Naïve Bayes Classifier

Bayesian Classification. CS690L Data Mining: Classification(2) Bayesian Theorem: Basics. Bayesian Theorem. Training dataset. Naïve Bayes Classifier Baa Classfcato CS6L Data Mg: Classfcato() Referece: J. Ha ad M. Kamber, Data Mg: Cocepts ad Techques robablstc learg: Calculate explct probabltes for hypothess, amog the most practcal approaches to certa

More information

The equation is sometimes presented in form Y = a + b x. This is reasonable, but it s not the notation we use.

The equation is sometimes presented in form Y = a + b x. This is reasonable, but it s not the notation we use. INTRODUCTORY NOTE ON LINEAR REGREION We have data of the form (x y ) (x y ) (x y ) These wll most ofte be preseted to us as two colum of a spreadsheet As the topc develops we wll see both upper case ad

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ  1 STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Recall Assumpto E(Y x) η 0 + η x (lear codtoal mea fucto) Data (x, y ), (x 2, y 2 ),, (x, y ) Least squares estmator ˆ E (Y x) ˆ " 0 + ˆ " x, where ˆ

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

Parametric Density Estimation: Bayesian Estimation. Naïve Bayes Classifier

Parametric Density Estimation: Bayesian Estimation. Naïve Bayes Classifier arametrc Dest Estmato: Baesa Estmato. Naïve Baes Classfer Baesa arameter Estmato Suppose we have some dea of the rage where parameters θ should be Should t we formalze such pror owledge hopes that t wll

More information

22 Nonparametric Methods.

22 Nonparametric Methods. 22 oparametrc Methods. I parametrc models oe assumes apror that the dstrbutos have a specfc form wth oe or more ukow parameters ad oe tres to fd the best or atleast reasoably effcet procedures that aswer

More information

Chapter 5 Properties of a Random Sample

Chapter 5 Properties of a Random Sample Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample

More information

Point Estimation: definition of estimators

Point Estimation: definition of estimators Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.

More information

EP2200 Queueing theory and teletraffic systems. Queueing networks. Viktoria Fodor KTH EES/LCN KTH EES/LCN

EP2200 Queueing theory and teletraffic systems. Queueing networks. Viktoria Fodor KTH EES/LCN KTH EES/LCN EP2200 Queueg theory ad teletraffc systems Queueg etworks Vktora Fodor Ope ad closed queug etworks Queug etwork: etwork of queug systems E.g. data packets traversg the etwork from router to router Ope

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Exam: ECON430 Statstcs Date of exam: Frday, December 8, 07 Grades are gve: Jauary 4, 08 Tme for exam: 0900 am 00 oo The problem set covers 5 pages Resources allowed:

More information

ECON 482 / WH Hong The Simple Regression Model 1. Definition of the Simple Regression Model

ECON 482 / WH Hong The Simple Regression Model 1. Definition of the Simple Regression Model ECON 48 / WH Hog The Smple Regresso Model. Defto of the Smple Regresso Model Smple Regresso Model Expla varable y terms of varable x y = β + β x+ u y : depedet varable, explaed varable, respose varable,

More information

Announcements. Recognition II. Computer Vision I. Example: Face Detection. Evaluating a binary classifier

Announcements. Recognition II. Computer Vision I. Example: Face Detection. Evaluating a binary classifier Aoucemets Recogto II H3 exteded to toght H4 to be aouced today. Due Frday 2/8. Note wll take a whle to ru some thgs. Fal Exam: hursday 2/4 at 7pm-0pm CSE252A Lecture 7 Example: Face Detecto Evaluatg a

More information

Lecture 02: Bounding tail distributions of a random variable

Lecture 02: Bounding tail distributions of a random variable CSCI-B609: A Theorst s Toolkt, Fall 206 Aug 25 Lecture 02: Boudg tal dstrbutos of a radom varable Lecturer: Yua Zhou Scrbe: Yua Xe & Yua Zhou Let us cosder the ubased co flps aga. I.e. let the outcome

More information

Lecture Notes Types of economic variables

Lecture Notes Types of economic variables Lecture Notes 3 1. Types of ecoomc varables () Cotuous varable takes o a cotuum the sample space, such as all pots o a le or all real umbers Example: GDP, Polluto cocetrato, etc. () Dscrete varables fte

More information

Outline. Point Pattern Analysis Part I. Revisit IRP/CSR

Outline. Point Pattern Analysis Part I. Revisit IRP/CSR Pot Patter Aalyss Part I Outle Revst IRP/CSR, frst- ad secod order effects What s pot patter aalyss (PPA)? Desty-based pot patter measures Dstace-based pot patter measures Revst IRP/CSR Equal probablty:

More information

ρ < 1 be five real numbers. The

ρ < 1 be five real numbers. The Lecture o BST 63: Statstcal Theory I Ku Zhag, /0/006 Revew for the prevous lecture Deftos: covarace, correlato Examples: How to calculate covarace ad correlato Theorems: propertes of correlato ad covarace

More information

Parameter Estimation

Parameter Estimation arameter Estmato robabltes Notatoal Coveto Mass dscrete fucto: catal letters Desty cotuous fucto: small letters Vector vs. scalar Scalar: la Vector: bold D: small Hgher dmeso: catal Notes a cotuous state

More information

Lecture 3 Naïve Bayes, Maximum Entropy and Text Classification COSI 134

Lecture 3 Naïve Bayes, Maximum Entropy and Text Classification COSI 134 Lecture 3 Naïve Baes, Mamum Etro ad Tet Classfcato COSI 34 Codtoal Parameterzato Two RVs: ItellgeceI ad SATS ValI = {Hgh,Low}, ValS={Hgh,Low} A ossble jot dstrbuto Ca descrbe usg cha rule as PI,S PIPS

More information

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes coometrcs, CON Sa Fracsco State Uversty Mchael Bar Sprg 5 Mdterm am, secto Soluto Thursday, February 6 hour, 5 mutes Name: Istructos. Ths s closed book, closed otes eam.. No calculators of ay kd are allowed..

More information

CS286.2 Lecture 4: Dinur s Proof of the PCP Theorem

CS286.2 Lecture 4: Dinur s Proof of the PCP Theorem CS86. Lecture 4: Dur s Proof of the PCP Theorem Scrbe: Thom Bohdaowcz Prevously, we have prove a weak verso of the PCP theorem: NP PCP 1,1/ (r = poly, q = O(1)). Wth ths result we have the desred costat

More information

Overview. Basic concepts of Bayesian learning. Most probable model given data Coin tosses Linear regression Logistic regression

Overview. Basic concepts of Bayesian learning. Most probable model given data Coin tosses Linear regression Logistic regression Overvew Basc cocepts of Bayesa learg Most probable model gve data Co tosses Lear regresso Logstc regresso Bayesa predctos Co tosses Lear regresso 30 Recap: regresso problems Iput to learg problem: trag

More information

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y.

b. There appears to be a positive relationship between X and Y; that is, as X increases, so does Y. .46. a. The frst varable (X) s the frst umber the par ad s plotted o the horzotal axs, whle the secod varable (Y) s the secod umber the par ad s plotted o the vertcal axs. The scatterplot s show the fgure

More information

Midterm Exam 1, section 2 (Solution) Thursday, February hour, 15 minutes

Midterm Exam 1, section 2 (Solution) Thursday, February hour, 15 minutes coometrcs, CON Sa Fracsco State Uverst Mchael Bar Sprg 5 Mdterm xam, secto Soluto Thursda, Februar 6 hour, 5 mutes Name: Istructos. Ths s closed book, closed otes exam.. No calculators of a kd are allowed..

More information

Linear Regression Linear Regression with Shrinkage. Some slides are due to Tommi Jaakkola, MIT AI Lab

Linear Regression Linear Regression with Shrinkage. Some slides are due to Tommi Jaakkola, MIT AI Lab Lear Regresso Lear Regresso th Shrkage Some sldes are due to Tomm Jaakkola, MIT AI Lab Itroducto The goal of regresso s to make quattatve real valued predctos o the bass of a vector of features or attrbutes.

More information

1. The weight of six Golden Retrievers is 66, 61, 70, 67, 92 and 66 pounds. The weight of six Labrador Retrievers is 54, 60, 72, 78, 84 and 67.

1. The weight of six Golden Retrievers is 66, 61, 70, 67, 92 and 66 pounds. The weight of six Labrador Retrievers is 54, 60, 72, 78, 84 and 67. Ecoomcs 3 Itroducto to Ecoometrcs Sprg 004 Professor Dobk Name Studet ID Frst Mdterm Exam You must aswer all the questos. The exam s closed book ad closed otes. You may use your calculators but please

More information

Bayesian Classifier. v MAP. argmax v j V P(x 1,x 2,...,x n v j )P(v j ) ,..., x. x) argmax. )P(v j

Bayesian Classifier. v MAP. argmax v j V P(x 1,x 2,...,x n v j )P(v j ) ,..., x. x) argmax. )P(v j Bayesa Classfer f:xv, fte set of alues Istaces xx ca be descrbed as a collecto of features x = (x 1, x 2, x x 2 {0,1} Ge a example, assg t the most probable alue V Bayes Rule: MAP argmax V MAP argmax V

More information

Functions of Random Variables

Functions of Random Variables Fuctos of Radom Varables Chapter Fve Fuctos of Radom Varables 5. Itroducto A geeral egeerg aalyss model s show Fg. 5.. The model output (respose) cotas the performaces of a system or product, such as weght,

More information

6. Nonparametric techniques

6. Nonparametric techniques 6. Noparametrc techques Motvato Problem: how to decde o a sutable model (e.g. whch type of Gaussa) Idea: just use the orgal data (lazy learg) 2 Idea 1: each data pot represets a pece of probablty P(x)

More information

2SLS Estimates ECON In this case, begin with the assumption that E[ i

2SLS Estimates ECON In this case, begin with the assumption that E[ i SLS Estmates ECON 3033 Bll Evas Fall 05 Two-Stage Least Squares (SLS Cosder a stadard lear bvarate regresso model y 0 x. I ths case, beg wth the assumto that E[ x] 0 whch meas that OLS estmates of wll

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

Chapter 4 Multiple Random Variables

Chapter 4 Multiple Random Variables Revew for the prevous lecture: Theorems ad Examples: How to obta the pmf (pdf) of U = g (, Y) ad V = g (, Y) Chapter 4 Multple Radom Varables Chapter 44 Herarchcal Models ad Mxture Dstrbutos Examples:

More information

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn:

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn: Chapter 3 3- Busess Statstcs: A Frst Course Ffth Edto Chapter 2 Correlato ad Smple Lear Regresso Busess Statstcs: A Frst Course, 5e 29 Pretce-Hall, Ic. Chap 2- Learg Objectves I ths chapter, you lear:

More information

Homework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015

Homework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015 Fall 05 Homework : Solutos Problem : (Practce wth Asymptotc Notato) A essetal requremet for uderstadg scalg behavor s comfort wth asymptotc (or bg-o ) otato. I ths problem, you wll prove some basc facts

More information

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture)

Feature Selection: Part 2. 1 Greedy Algorithms (continued from the last lecture) CSE 546: Mache Learg Lecture 6 Feature Selecto: Part 2 Istructor: Sham Kakade Greedy Algorthms (cotued from the last lecture) There are varety of greedy algorthms ad umerous amg covetos for these algorthms.

More information

Chapter Two. An Introduction to Regression ( )

Chapter Two. An Introduction to Regression ( ) ubject: A Itroducto to Regresso Frst tage Chapter Two A Itroducto to Regresso (018-019) 1 pg. ubject: A Itroducto to Regresso Frst tage A Itroducto to Regresso Regresso aalss s a statstcal tool for the

More information

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b

Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand DIS 10b CS 70 Dscrete Mathematcs ad Probablty Theory Fall 206 Sesha ad Walrad DIS 0b. Wll I Get My Package? Seaky delvery guy of some compay s out delverg packages to customers. Not oly does he had a radom package

More information

Bayesian belief networks

Bayesian belief networks Lecture 14 ayesa belef etworks los Hauskrecht mlos@cs.ptt.edu 5329 Seott Square Desty estmato Data: D { D1 D2.. D} D x a vector of attrbute values ttrbutes: modeled by radom varables { 1 2 d} wth: otuous

More information

STK4011 and STK9011 Autumn 2016

STK4011 and STK9011 Autumn 2016 STK4 ad STK9 Autum 6 Pot estmato Covers (most of the followg materal from chapter 7: Secto 7.: pages 3-3 Secto 7..: pages 3-33 Secto 7..: pages 35-3 Secto 7..3: pages 34-35 Secto 7.3.: pages 33-33 Secto

More information

A NEW LOG-NORMAL DISTRIBUTION

A NEW LOG-NORMAL DISTRIBUTION Joural of Statstcs: Advaces Theory ad Applcatos Volume 6, Number, 06, Pages 93-04 Avalable at http://scetfcadvaces.co. DOI: http://dx.do.org/0.864/jsata_700705 A NEW LOG-NORMAL DISTRIBUTION Departmet of

More information

Point Estimation: definition of estimators

Point Estimation: definition of estimators Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.

More information

CIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights

CIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights CIS 800/002 The Algorthmc Foudatos of Data Prvacy October 13, 2011 Lecturer: Aaro Roth Lecture 9 Scrbe: Aaro Roth Database Update Algorthms: Multplcatve Weghts We ll recall aga) some deftos from last tme:

More information

Supervised learning: Linear regression Logistic regression

Supervised learning: Linear regression Logistic regression CS 57 Itroducto to AI Lecture 4 Supervsed learg: Lear regresso Logstc regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 57 Itro to AI Data: D { D D.. D D Supervsed learg d a set of eamples s

More information

2.28 The Wall Street Journal is probably referring to the average number of cubes used per glass measured for some population that they have chosen.

2.28 The Wall Street Journal is probably referring to the average number of cubes used per glass measured for some population that they have chosen. .5 x 54.5 a. x 7. 786 7 b. The raked observatos are: 7.4, 7.5, 7.7, 7.8, 7.9, 8.0, 8.. Sce the sample sze 7 s odd, the meda s the (+)/ 4 th raked observato, or meda 7.8 c. The cosumer would more lkely

More information

Bayesian belief networks

Bayesian belief networks Lecture 19 ayesa belef etworks los Hauskrecht mlos@cs.ptt.edu 539 Seott Square Varous ferece tasks: robablstc ferece Dagostc task. from effect to cause eumoa Fever redcto task. from cause to effect Fever

More information

Lecture Notes Forecasting the process of estimating or predicting unknown situations

Lecture Notes Forecasting the process of estimating or predicting unknown situations Lecture Notes. Ecoomc Forecastg. Forecastg the process of estmatg or predctg ukow stuatos Eample usuall ecoomsts predct future ecoomc varables Forecastg apples to a varet of data () tme seres data predctg

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statstc ad Radom Samples A parameter s a umber that descrbes the populato. It s a fxed umber, but practce we do ot kow ts value. A statstc s a fucto of the sample data,.e., t s a quatty whose

More information

Lecture 1. (Part II) The number of ways of partitioning n distinct objects into k distinct groups containing n 1,

Lecture 1. (Part II) The number of ways of partitioning n distinct objects into k distinct groups containing n 1, Lecture (Part II) Materals Covered Ths Lecture: Chapter 2 (2.6 --- 2.0) The umber of ways of parttog dstct obects to dstct groups cotag, 2,, obects, respectvely, where each obect appears exactly oe group

More information

12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model

12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model 1. Estmatg Model parameters Assumptos: ox ad y are related accordg to the smple lear regresso model (The lear regresso model s the model that says that x ad y are related a lear fasho, but the observed

More information

Unsupervised Learning and Other Neural Networks

Unsupervised Learning and Other Neural Networks CSE 53 Soft Computg NOT PART OF THE FINAL Usupervsed Learg ad Other Neural Networs Itroducto Mture Destes ad Idetfablty ML Estmates Applcato to Normal Mtures Other Neural Networs Itroducto Prevously, all

More information

1. Overview of basic probability

1. Overview of basic probability 13.42 Desg Prcples for Ocea Vehcles Prof. A.H. Techet Sprg 2005 1. Overvew of basc probablty Emprcally, probablty ca be defed as the umber of favorable outcomes dvded by the total umber of outcomes, other

More information

Mean is only appropriate for interval or ratio scales, not ordinal or nominal.

Mean is only appropriate for interval or ratio scales, not ordinal or nominal. Mea Same as ordary average Sum all the data values ad dvde by the sample sze. x = ( x + x +... + x Usg summato otato, we wrte ths as x = x = x = = ) x Mea s oly approprate for terval or rato scales, ot

More information

ENGI 3423 Simple Linear Regression Page 12-01

ENGI 3423 Simple Linear Regression Page 12-01 ENGI 343 mple Lear Regresso Page - mple Lear Regresso ometmes a expermet s set up where the expermeter has cotrol over the values of oe or more varables X ad measures the resultg values of aother varable

More information

MA/CSSE 473 Day 27. Dynamic programming

MA/CSSE 473 Day 27. Dynamic programming MA/CSSE 473 Day 7 Dyamc Programmg Bomal Coeffcets Warshall's algorthm (Optmal BSTs) Studet questos? Dyamc programmg Used for problems wth recursve solutos ad overlappg subproblems Typcally, we save (memoze)

More information

1 Onto functions and bijections Applications to Counting

1 Onto functions and bijections Applications to Counting 1 Oto fuctos ad bectos Applcatos to Coutg Now we move o to a ew topc. Defto 1.1 (Surecto. A fucto f : A B s sad to be surectve or oto f for each b B there s some a A so that f(a B. What are examples of

More information

Statistical modelling and latent variables (2)

Statistical modelling and latent variables (2) Statstcal modellg ad latet varables (2 Mxg latet varables ad parameters statstcal erece Trod Reta (Dvso o statstcs ad surace mathematcs, Departmet o Mathematcs, Uversty o Oslo State spaces We typcally

More information

Lecture 3 Probability review (cont d)

Lecture 3 Probability review (cont d) STATS 00: Itroducto to Statstcal Iferece Autum 06 Lecture 3 Probablty revew (cot d) 3. Jot dstrbutos If radom varables X,..., X k are depedet, the ther dstrbuto may be specfed by specfyg the dvdual dstrbuto

More information

Chapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance

Chapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance Chapter, Part A Aalyss of Varace ad Epermetal Desg Itroducto to Aalyss of Varace Aalyss of Varace: Testg for the Equalty of Populato Meas Multple Comparso Procedures Itroducto to Aalyss of Varace Aalyss

More information

Rademacher Complexity. Examples

Rademacher Complexity. Examples Algorthmc Foudatos of Learg Lecture 3 Rademacher Complexty. Examples Lecturer: Patrck Rebesch Verso: October 16th 018 3.1 Itroducto I the last lecture we troduced the oto of Rademacher complexty ad showed

More information

X ε ) = 0, or equivalently, lim

X ε ) = 0, or equivalently, lim Revew for the prevous lecture Cocepts: order statstcs Theorems: Dstrbutos of order statstcs Examples: How to get the dstrbuto of order statstcs Chapter 5 Propertes of a Radom Sample Secto 55 Covergece

More information

Part 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971))

Part 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971)) art 4b Asymptotc Results for MRR usg RESS Recall that the RESS statstc s a specal type of cross valdato procedure (see Alle (97)) partcular to the regresso problem ad volves fdg Y $,, the estmate at the

More information

Linear Regression with One Regressor

Linear Regression with One Regressor Lear Regresso wth Oe Regressor AIM QA.7. Expla how regresso aalyss ecoometrcs measures the relatoshp betwee depedet ad depedet varables. A regresso aalyss has the goal of measurg how chages oe varable,

More information

Construction and Evaluation of Actuarial Models. Rajapaksha Premarathna

Construction and Evaluation of Actuarial Models. Rajapaksha Premarathna Costructo ad Evaluato of Actuaral Models Raapaksha Premaratha Table of Cotets Modelg Some deftos ad Notatos...4 Case : Polcy Lmtu...4 Case : Wth a Ordary deductble....5 Case 3: Maxmum Covered loss u wth

More information

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs CLASS NOTES for PBAF 58: Quattatve Methods II SPRING 005 Istructor: Jea Swaso Dael J. Evas School of Publc Affars Uversty of Washgto Ackowledgemet: The structor wshes to thak Rachel Klet, Assstat Professor,

More information

Section l h l Stem=Tens. 8l Leaf=Ones. 8h l 03. 9h 58

Section l h l Stem=Tens. 8l Leaf=Ones. 8h l 03. 9h 58 Secto.. 6l 34 6h 667899 7l 44 7h Stem=Tes 8l 344 Leaf=Oes 8h 5557899 9l 3 9h 58 Ths dsplay brgs out the gap the data: There are o scores the hgh 7's. 6. a. beams cylders 9 5 8 88533 6 6 98877643 7 488

More information

Qualifying Exam Statistical Theory Problem Solutions August 2005

Qualifying Exam Statistical Theory Problem Solutions August 2005 Qualfyg Exam Statstcal Theory Problem Solutos August 5. Let X, X,..., X be d uform U(,),

More information

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematcs of Mache Learg Lecturer: Phlppe Rgollet Lecture 3 Scrbe: James Hrst Sep. 6, 205.5 Learg wth a fte dctoary Recall from the ed of last lecture our setup: We are workg wth a fte dctoary

More information

hp calculators HP 30S Statistics Averages and Standard Deviations Average and Standard Deviation Practice Finding Averages and Standard Deviations

hp calculators HP 30S Statistics Averages and Standard Deviations Average and Standard Deviation Practice Finding Averages and Standard Deviations HP 30S Statstcs Averages ad Stadard Devatos Average ad Stadard Devato Practce Fdg Averages ad Stadard Devatos HP 30S Statstcs Averages ad Stadard Devatos Average ad stadard devato The HP 30S provdes several

More information

18.413: Error Correcting Codes Lab March 2, Lecture 8

18.413: Error Correcting Codes Lab March 2, Lecture 8 18.413: Error Correctg Codes Lab March 2, 2004 Lecturer: Dael A. Spelma Lecture 8 8.1 Vector Spaces A set C {0, 1} s a vector space f for x all C ad y C, x + y C, where we take addto to be compoet wse

More information

Statistics Descriptive and Inferential Statistics. Instructor: Daisuke Nagakura

Statistics Descriptive and Inferential Statistics. Instructor: Daisuke Nagakura Statstcs Descrptve ad Iferetal Statstcs Istructor: Dasuke Nagakura (agakura@z7.keo.jp) 1 Today s topc Today, I talk about two categores of statstcal aalyses, descrptve statstcs ad feretal statstcs, ad

More information

3. Basic Concepts: Consequences and Properties

3. Basic Concepts: Consequences and Properties : 3. Basc Cocepts: Cosequeces ad Propertes Markku Jutt Overvew More advaced cosequeces ad propertes of the basc cocepts troduced the prevous lecture are derved. Source The materal s maly based o Sectos.6.8

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5

THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 THE ROYAL STATISTICAL SOCIETY 06 EAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 The Socety s provdg these solutos to assst cadtes preparg for the examatos 07. The solutos are teded as learg ads ad should

More information

Chapter 8. Inferences about More Than Two Population Central Values

Chapter 8. Inferences about More Than Two Population Central Values Chapter 8. Ifereces about More Tha Two Populato Cetral Values Case tudy: Effect of Tmg of the Treatmet of Port-We tas wth Lasers ) To vestgate whether treatmet at a youg age would yeld better results tha

More information

Faculty Research Interest Seminar Department of Biostatistics, GSPH University of Pittsburgh. Gong Tang Feb. 18, 2005

Faculty Research Interest Seminar Department of Biostatistics, GSPH University of Pittsburgh. Gong Tang Feb. 18, 2005 Faculty Research Iterest Semar Departmet of Bostatstcs, GSPH Uversty of Pttsburgh Gog ag Feb. 8, 25 Itroducto Joed the departmet 2. each two courses: Elemets of Stochastc Processes (Bostat 24). Aalyss

More information

Application of Calibration Approach for Regression Coefficient Estimation under Two-stage Sampling Design

Application of Calibration Approach for Regression Coefficient Estimation under Two-stage Sampling Design Authors: Pradp Basak, Kaustav Adtya, Hukum Chadra ad U.C. Sud Applcato of Calbrato Approach for Regresso Coeffcet Estmato uder Two-stage Samplg Desg Pradp Basak, Kaustav Adtya, Hukum Chadra ad U.C. Sud

More information

Bayes Decision Theory - II

Bayes Decision Theory - II Bayes Decso Theory - II Ke Kreutz-Delgado (Nuo Vascocelos) ECE 175 Wter 2012 - UCSD Nearest Neghbor Classfer We are cosderg supervsed classfcato Nearest Neghbor (NN) Classfer A trag set D = {(x 1,y 1 ),,

More information

ε. Therefore, the estimate

ε. Therefore, the estimate Suggested Aswers, Problem Set 3 ECON 333 Da Hugerma. Ths s ot a very good dea. We kow from the secod FOC problem b) that ( ) SSE / = y x x = ( ) Whch ca be reduced to read y x x = ε x = ( ) The OLS model

More information

Lecture Notes 2. The ability to manipulate matrices is critical in economics.

Lecture Notes 2. The ability to manipulate matrices is critical in economics. Lecture Notes. Revew of Matrces he ablt to mapulate matrces s crtcal ecoomcs.. Matr a rectagular arra of umbers, parameters, or varables placed rows ad colums. Matrces are assocated wth lear equatos. lemets

More information

An Introduction to. Support Vector Machine

An Introduction to. Support Vector Machine A Itroducto to Support Vector Mache Support Vector Mache (SVM) A classfer derved from statstcal learg theory by Vapk, et al. 99 SVM became famous whe, usg mages as put, t gave accuracy comparable to eural-etwork

More information

Bayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information

Bayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information Malaysa Joural of Mathematcal Sceces (): 97- (9) Bayes Estmator for Expoetal Dstrbuto wth Exteso of Jeffery Pror Iformato Hadeel Salm Al-Kutub ad Noor Akma Ibrahm Isttute for Mathematcal Research, Uverst

More information

Simple Linear Regression

Simple Linear Regression Statstcal Methods I (EST 75) Page 139 Smple Lear Regresso Smple regresso applcatos are used to ft a model descrbg a lear relatoshp betwee two varables. The aspects of least squares regresso ad correlato

More information

Extreme Value Theory: An Introduction

Extreme Value Theory: An Introduction (correcto d Extreme Value Theory: A Itroducto by Laures de Haa ad Aa Ferrera Wth ths webpage the authors ted to form the readers of errors or mstakes foud the book after publcato. We also gve extesos for

More information

PTAS for Bin-Packing

PTAS for Bin-Packing CS 663: Patter Matchg Algorthms Scrbe: Che Jag /9/00. Itroducto PTAS for B-Packg The B-Packg problem s NP-hard. If we use approxmato algorthms, the B-Packg problem could be solved polyomal tme. For example,

More information