Probability Densities in Data Mining

Size: px

Start display at page:

Download "Probability Densities in Data Mining"

Rosamund Woods
5 years ago
Views:

1 Probabilit Densities in Data Mining Note to other teachers and users of these slides. Andrew would be delighted if ou found this source material useful in giving our own lectures. Feel free to use these slides verbatim, or to modif them to fit our own needs. PowerPoint originals are available. If ou make use of a significant ortion of these slides in our own lecture, lease include this message, or the following link to the source reositor of Andrew s tutorials: htt:// Comments and corrections gratefull received. Andrew W. Moore Associate Professor School of Comuter Science Carnegie Mellon Universit awm@cs.cmu.edu Coright 001, Andrew W. Moore Aug 7, 001 Probabilit Densities in Data Mining Wh we should care Notation and Fundamentals of continuous PDFs Multivariate continuous PDFs Combining continuous and discrete random variables Coright 001, Andrew W. Moore Probabilit Densities: Slide

2 Wh we should care Real Numbers occur in at least 50% of database records Can t alwas quantize them So need to understand how to describe where the come from A great wa of saing what s a reasonable range of values A great wa of saing how multile attributes should reasonabl co-occur Coright 001, Andrew W. Moore Probabilit Densities: Slide 3 Wh we should care Can immediatel get us Baes Classifiers that are sensible with real-valued data You ll need to intimatel understand PDFs in order to do kernel methods, clustering with Miture Models, analsis of variance, time series and man other things Will introduce us to linear and non-linear regression Coright 001, Andrew W. Moore Probabilit Densities: Slide 4

3 A PDF of American Ages in 000 Coright 001, Andrew W. Moore Probabilit Densities: Slide 5 A PDF of American Ages in 000 Let X be a continuous random variable. If is a Probabilit Densit Function for X then a < X b P d b a 50 < Age 50 P 30 age dage 0.36 age 30 Coright 001, Andrew W. Moore Probabilit Densities: Slide 6

4 Proerties of PDFs a < X b P d b a That means lim h 0 h P < X h h + P X Coright 001, Andrew W. Moore Probabilit Densities: Slide 7 Proerties of PDFs a < X b P d P b a X Therefore Therefore d 1 : 0 Coright 001, Andrew W. Moore Probabilit Densities: Slide 8

5 Talking to our stomach What s the gut-feel meaning of? If then and when a value X is samled from the distribution, ou are times as likel to find that X is ver close to 5.31 than that X is ver close to 5.9. Coright 001, Andrew W. Moore Probabilit Densities: Slide 9 Talking to our stomach What s the gut-feel meaning of? If then 5.31 a 0.06 and 5.9 b 0.03 when a value X is samled from the distribution, ou are times as likel to find that X is ver close to 5.31 a than that X is ver close to 5.9. b Coright 001, Andrew W. Moore Probabilit Densities: Slide 10

6 Talking to our stomach What s the gut-feel meaning of? If then 5.31 a 0.03 z and 5.9 b 0.06 z when a value X is samled from the distribution, ou are times as likel to find that X is ver close to 5.31 a than that X is ver close to 5.9. b Coright 001, Andrew W. Moore Probabilit Densities: Slide 11 Talking to our stomach What s the gut-feel meaning of? If then 5.31 a 0.03 αz and 5.9 b 0.06 z when a value X is samled from the distribution, ou are α times as likel to find that X is ver close to 5.31 a than that X is ver close to 5.9. b Coright 001, Andrew W. Moore Probabilit Densities: Slide 1

7 Talking to our stomach What s the gut-feel meaning of? If then a b α when a value X is samled from the distribution, ou are α times as likel to find that X is ver close to 5.31 a than that X is ver close to 5.9. b Coright 001, Andrew W. Moore Probabilit Densities: Slide 13 Talking to our stomach What s the gut-feel meaning of? If a b then P a h < lim h 0 P b h < α X X < a + h < b + h α Coright 001, Andrew W. Moore Probabilit Densities: Slide 14

8 Yet another wa to view a PDF A recie for samling a random age. 1. Generate a random dot from the rectangle surrounding the PDF curve. Call the dot age,d. If d < age sto and return age 3. Else tr again: go to Ste 1. Coright 001, Andrew W. Moore Probabilit Densities: Slide 15 Test our understanding True or False: : 1 : P X 0 Coright 001, Andrew W. Moore Probabilit Densities: Slide 16

9 Eectations E[X] the eected value of random variable X the average value we d see if we took a ver large number of random samles of X d Coright 001, Andrew W. Moore Probabilit Densities: Slide 17 Eectations E[age] E[X] the eected value of random variable X the average value we d see if we took a ver large number of random samles of X Coright 001, Andrew W. Moore Probabilit Densities: Slide 18 d the first moment of the shae formed b the aes and the blue curve the best value to choose if ou must guess an unknown erson s age and ou ll be fined the square of our error

10 Eectation of a function E[age ] E[age] µe[fx] the eected value of f where is drawn from X s distribution. the average value we d see if we took a ver large number of random samles of fx µ f d Note that in general: E[ f ] f E[ X ] Coright 001, Andrew W. Moore Probabilit Densities: Slide 19 Variance σ Var[X] the eected squared difference between and E[X] Var [age] σ µ d amount ou d eect to lose if ou must guess an unknown erson s age and ou ll be fined the square of our error, and assuming ou la otimall Coright 001, Andrew W. Moore Probabilit Densities: Slide 0

11 Standard Deviation σ Var[X] the eected squared difference between and E[X] Var [age] σ σ µ d amount ou d eect to lose if ou must guess an unknown erson s age and ou ll be fined the square of our error, and assuming ou la otimall σ Standard Deviation tical deviation of X from its mean σ Var[ X ] Coright 001, Andrew W. Moore Probabilit Densities: Slide 1 In dimensions, robabilit densit of random variables X,Y at location, Coright 001, Andrew W. Moore Probabilit Densities: Slide

12 In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd Coright 001, Andrew W. Moore Probabilit Densities: Slide 3 In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd P 0<mg<30 and 500<weight<3000 area under the -d surface within the red rectangle Coright 001, Andrew W. Moore Probabilit Densities: Slide 4

13 In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd P [mg-5/10] + [weight-3300/1500] < 1 area under the -d surface within the red oval Coright 001, Andrew W. Moore Probabilit Densities: Slide 5 In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd Take the secial case of region R everwhere. Remember that with robabilit 1, X,Y will be drawn from somewhere. So.., dd 1 Coright 001, Andrew W. Moore Probabilit Densities: Slide 6

14 In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd, lim h 0 h P < X h + h h < Y h + Coright 001, Andrew W. Moore Probabilit Densities: Slide 7 In m dimensions Let X 1,X, X m be an n-tule of continuous random variables, and let R be some region of R m P X, X,..., X, 1 1 m R...,..., m R, 1,..., m d m,,... d, d 1 Coright 001, Andrew W. Moore Probabilit Densities: Slide 8

Indeendence X Y iff, :, If X and Y are indeendent then knowing the value of X does not hel redict the value of Y mg,weight NOT indeendent Coright 001, Andrew W.

15 Indeendence X Y iff, :, If X and Y are indeendent then knowing the value of X does not hel redict the value of Y mg,weight NOT indeendent Coright 001, Andrew W. Moore Probabilit Densities: Slide 9 Indeendence X Y iff, :, If X and Y are indeendent then knowing the value of X does not hel redict the value of Y the contours sa that acceleration and weight are indeendent Coright 001, Andrew W. Moore Probabilit Densities: Slide 30

16 Multivariate Eectation µ X E [ X] d E[mg,weight] 4.5,600 The centroid of the cloud Coright 001, Andrew W. Moore Probabilit Densities: Slide 31 Multivariate Eectation E [ f X] f d Coright 001, Andrew W. Moore Probabilit Densities: Slide 3

17 Test our understanding Question : When if ever does E [ X + Y] E[ X ] + E[ Y]? All the time? Onl when X and Y are indeendent? It can fail even if X and Y are indeendent? Coright 001, Andrew W. Moore Probabilit Densities: Slide 33 Bivariate Eectation E [ f, ] f,, dd if f, then E[ f X, Y ], dd if f, then E[ f X, Y], dd if f, + then E[ f X, Y ] +, dd E [ X + Y] E[ X ] + E[ Y] Coright 001, Andrew W. Moore Probabilit Densities: Slide 34

18 σ Bivariate Covariance Cov[ X, Y ] E[ X µ Y µ ] σ σ σ σ Cov[ X, X ] Var[ X ] E[ X µ Cov[ Y, Y] Var[ Y ] E[ Y µ ] ] Coright 001, Andrew W. Moore Probabilit Densities: Slide 35 σ Bivariate Covariance Cov[ X, Y ] E[ X µ Y µ ] σ σ σ Cov[ X] σ Cov[ X, X ] Var[ X ] E[ X µ Cov[ Y, Y] Var[ Y ] E[ Y µ X Write X, then Y ] σ σ T E[ X µ X µ ] Σ σ σ Coright 001, Andrew W. Moore Probabilit Densities: Slide 36 ]

19 Covariance Intuition σ weight 700 E[mg,weight] 4.5,600 σ weight 700 σ mg 8 σ mg 8 Coright 001, Andrew W. Moore Probabilit Densities: Slide 37 Covariance Intuition Princial Eigenvector of Σ σ weight 700 E[mg,weight] 4.5,600 σ weight 700 σ mg 8 σ mg 8 Coright 001, Andrew W. Moore Probabilit Densities: Slide 38

20 Cov[ X] Covariance Fun Facts σ σ T E[ X µ X µ ] Σ σ σ True or False: If σ 0 then X and Y are indeendent True or False: If X and Y are indeendent then σ 0 True or False: If σ σ σ then X and Y are deterministicall related True or False: If X and Y are deterministicall related then σ σ σ How could ou rove or disrove these? Coright 001, Andrew W. Moore Probabilit Densities: Slide 39 General Covariance Let X X 1,X, X k be a vector of k continuous random variables T Cov [ X] E [ X µ X µ ] Σ Σ ij Cov[ X, X ] σ i j i j S is a k k smmetric non-negative definite matri If all distributions are linearl indeendent it is ositive definite If the distributions are linearl deendent it has determinant zero Coright 001, Andrew W. Moore Probabilit Densities: Slide 40

It can fail even if X and Y are indeendent? Coright 001, Andrew W.

21 Question : When if Test our understanding ever doesvar [ X + Y] Var[ X ] + Var[ Y]? All the time? Onl when X and Y are indeendent? It can fail even if X and Y are indeendent? Coright 001, Andrew W. Moore Probabilit Densities: Slide 41 Marginal Distributions, d Coright 001, Andrew W. Moore Probabilit Densities: Slide 4

22 mg weight 4600 Conditional Distributions mg weight 300 mg weight 000.d.f. of X when Y Coright 001, Andrew W. Moore Probabilit Densities: Slide 43 mg weight 4600 Conditional Distributions, Wh?.d.f. of X when Y Coright 001, Andrew W. Moore Probabilit Densities: Slide 44

23 Coright 001, Andrew W. Moore Probabilit Densities: Slide 45 Indeendence Revisited It s eas to rove that these statements are equivalent, :, iff Y X :, :,, :, Coright 001, Andrew W. Moore Probabilit Densities: Slide 46 More useful stuff Baes Rule These can all be roved from definitions on revious slides 1 d,, z z z

24 Coright 001, Andrew W. Moore Probabilit Densities: Slide 47 Miing discrete and continuous variables h v A h X h P v A + <, lim 0 h 1, 1 n A v d v A Baes Rule Baes Rule A P A P A A P A A P Coright 001, Andrew W. Moore Probabilit Densities: Slide 48 Miing discrete and continuous variables PEduYears,Wealth

25 Miing discrete and continuous variables PEduYears,Wealth PWealth EduYears Coright 001, Andrew W. Moore Probabilit Densities: Slide 49 Miing discrete and continuous variables PEduYears,Wealth PEduYears Wealth PWealth EduYears Renormalized Aes Coright 001, Andrew W. Moore Probabilit Densities: Slide 50

26 What ou should know You should be able to la with discrete, continuous and mied joint distributions You should be ha with the difference between and PA You should be intimate with eectations of continuous and discrete random variables You should smile when ou meet a covariance matri Indeendence and its consequences should be second nature Coright 001, Andrew W. Moore Probabilit Densities: Slide 51 Discussion Are PDFs the onl sensible wa to handle analsis of real-valued variables? Wh is covariance an imortant concet? Suose X and Y are indeendent real-valued random variables distributed between 0 and 1: What is [minx,y]? What is E[minX,Y]? Prove that E[X] is the value u that minimizes E[X-u ] What is the value u that minimizes E[ X-u ]? Coright 001, Andrew W. Moore Probabilit Densities: Slide 5

Cross-validation for detecting and preventing overfitting

Cross-validation for detecting and preventing overfitting A Regression Problem = f() + noise Can we learn f from this data? Note to other teachers and users of these slides. Andrew would be delighted if