Probability Densities in Data Mining

Similar documents
Cross-validation for detecting and preventing overfitting

Random Vectors. 1 Joint distribution of a random vector. 1 Joint distribution of a random vector

UNIVERSITY OF DUBLIN TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics

Probabilistic and Bayesian Analytics

Copyright, 2008, R.E. Kass, E.N. Brown, and U. Eden REPRODUCTION OR CIRCULATION REQUIRES PERMISSION OF THE AUTHORS

Review of Probability

Conditioning and Independence

VC-dimension for characterizing classifiers

Covariance and Correlation Class 7, Jeremy Orloff and Jonathan Bloom

VC-dimension for characterizing classifiers

Introduction to Probability for Graphical Models

V-0. Review of Probability

11. Regression and Least Squares

INF Introduction to classifiction Anne Solberg Based on Chapter 2 ( ) in Duda and Hart: Pattern Classification

University of California, Los Angeles Department of Statistics. Joint probability distributions

Information Gain. Andrew W. Moore Professor School of Computer Science Carnegie Mellon University.

CHAPTER 5. Jointly Probability Mass Function for Two Discrete Distributed Random Variables:

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx

Biostatistics in Research Practice - Regression I

Correlation analysis 2: Measures of correlation

Inference about the Slope and Intercept

Experimental Uncertainty Review. Abstract. References. Measurement Uncertainties and Uncertainty Propagation

36-720: Latent Class Models

2. This problem is very easy if you remember the exponential cdf; i.e., 0, y 0 F Y (y) = = ln(0.5) = ln = ln 2 = φ 0.5 = β ln 2.

INF Anne Solberg One of the most challenging topics in image analysis is recognizing a specific object in an image.

Algorithms for Playing and Solving games*

General Random Variables

Chapter 7: Special Distributions

INF Introduction to classifiction Anne Solberg

Today. CS-184: Computer Graphics. Introduction. Some Examples. 2D Transformations

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA

3.0 PROBABILITY, RANDOM VARIABLES AND RANDOM PROCESSES

5. Zeros. We deduce that the graph crosses the x-axis at the points x = 0, 1, 2 and 4, and nowhere else. And that s exactly what we see in the graph.

MATH 829: Introduction to Data Mining and Analysis Consistency of Linear Regression

Expected value of r.v. s

Chapter 18 Quadratic Function 2

Multiple Random Variables

Monte Carlo Studies. Monte Carlo Studies. Sampling Distribution

The data can be downloaded as an Excel file under Econ2130 at

Classical AI and ML research ignored this phenomena Another example

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

Bayesian Networks: Independencies and Inference

MAHALAKSHMI ENGINEERING COLLEGE TIRUCHIRAPALLI

Review: mostly probability and some statistics

Section 2: Wave Functions and Probability Solutions

10.2 Polar Equations and Graphs

3. Several Random Variables

Linear regression Class 25, Jeremy Orloff and Jonathan Bloom

ME 597: AUTONOMOUS MOBILE ROBOTICS SECTION 2 PROBABILITY. Prof. Steven Waslander

Moment Generating Function. STAT/MTHE 353: 5 Moment Generating Functions and Multivariate Normal Distribution

0.24 adults 2. (c) Prove that, regardless of the possible values of and, the covariance between X and Y is equal to zero. Show all work.

Random vectors X 1 X 2. Recall that a random vector X = is made up of, say, k. X k. random variables.

Summary of Random Variable Concepts March 17, 2000

2: Distributions of Several Variables, Error Propagation

MTH234 Chapter 15 - Multiple Integrals Michigan State University

Probability Theory Refresher

LESSON 35: EIGENVALUES AND EIGENVECTORS APRIL 21, (1) We might also write v as v. Both notations refer to a vector.

Chapter 5 continued. Chapter 5 sections

Introduction to Computational Finance and Financial Econometrics Probability Theory Review: Part 2

Dependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression

Numerical Linear Algebra

Review (Probability & Linear Algebra)

Section 3.1. ; X = (0, 1]. (i) f : R R R, f (x, y) = x y

7. Two Random Variables

Statistical Distribution Assumptions of General Linear Models

Polar Coordinates; Vectors

Review of Elementary Probability Lecture I Hamid R. Rabiee

Bivariate distributions

Probability. Paul Schrimpf. January 23, Definitions 2. 2 Properties 3

Physics Gravitational force. 2. Strong or color force. 3. Electroweak force

INTRODUCTION TO DIOPHANTINE EQUATIONS

Multivariate Random Variable

Bayesian Networks Structure Learning (cont.)

MA Game Theory 2005, LSE

6. Linear transformations. Consider the function. f : R 2 R 2 which sends (x, y) (x, y)

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

What is the limit of a function? Intuitively, we want the limit to say that as x gets closer to some value a,

Mathematical Statistics. Gregg Waterman Oregon Institute of Technology

Probability Theory Review Reading Assignments

ES.182A Problem Section 11, Fall 2018 Solutions

Lecture 14: Multivariate mgf s and chf s

Next is material on matrix rank. Please see the handout

Chapter 4 Multiple Random Variables

1.1 The Equations of Motion

Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee

Review of probability

Review of Probability Theory II

Business Statistics 41000: Homework # 2 Solutions

Advanced Econometrics II (Part 1)

Lecture 1.2 Units, Dimensions, Estimations 1. Units To measure a quantity in physics means to compare it with a standard. Since there are many

1.2 Relations. 20 Relations and Functions

Joint Distribution of Two or More Random Variables

San Francisco State University ECON 851 Summer Problem Set 1

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

Vector Basics. Lecture 1 Vector Basics

Uncertainty and Parameter Space Analysis in Visualization -

Let X denote a random variable, and z = h(x) a function of x. Consider the

CONTINUOUS SPATIAL DATA ANALYSIS

4. Distributions of Functions of Random Variables

Transcription:

Probabilit Densities in Data Mining Note to other teachers and users of these slides. Andrew would be delighted if ou found this source material useful in giving our own lectures. Feel free to use these slides verbatim, or to modif them to fit our own needs. PowerPoint originals are available. If ou make use of a significant ortion of these slides in our own lecture, lease include this message, or the following link to the source reositor of Andrew s tutorials: htt://www.cs.cmu.edu/~awm/tutorials. Comments and corrections gratefull received. Andrew W. Moore Associate Professor School of Comuter Science Carnegie Mellon Universit www.cs.cmu.edu/~awm awm@cs.cmu.edu 41-68-7599 Coright 001, Andrew W. Moore Aug 7, 001 Probabilit Densities in Data Mining Wh we should care Notation and Fundamentals of continuous PDFs Multivariate continuous PDFs Combining continuous and discrete random variables Coright 001, Andrew W. Moore Probabilit Densities: Slide

Wh we should care Real Numbers occur in at least 50% of database records Can t alwas quantize them So need to understand how to describe where the come from A great wa of saing what s a reasonable range of values A great wa of saing how multile attributes should reasonabl co-occur Coright 001, Andrew W. Moore Probabilit Densities: Slide 3 Wh we should care Can immediatel get us Baes Classifiers that are sensible with real-valued data You ll need to intimatel understand PDFs in order to do kernel methods, clustering with Miture Models, analsis of variance, time series and man other things Will introduce us to linear and non-linear regression Coright 001, Andrew W. Moore Probabilit Densities: Slide 4

A PDF of American Ages in 000 Coright 001, Andrew W. Moore Probabilit Densities: Slide 5 A PDF of American Ages in 000 Let X be a continuous random variable. If is a Probabilit Densit Function for X then a < X b P d b a 50 < Age 50 P 30 age dage 0.36 age 30 Coright 001, Andrew W. Moore Probabilit Densities: Slide 6

Proerties of PDFs a < X b P d b a That means lim h 0 h P < X h h + P X Coright 001, Andrew W. Moore Probabilit Densities: Slide 7 Proerties of PDFs a < X b P d P b a X Therefore Therefore d 1 : 0 Coright 001, Andrew W. Moore Probabilit Densities: Slide 8

Talking to our stomach What s the gut-feel meaning of? If then 5.31 0.06 and 5.9 0.03 when a value X is samled from the distribution, ou are times as likel to find that X is ver close to 5.31 than that X is ver close to 5.9. Coright 001, Andrew W. Moore Probabilit Densities: Slide 9 Talking to our stomach What s the gut-feel meaning of? If then 5.31 a 0.06 and 5.9 b 0.03 when a value X is samled from the distribution, ou are times as likel to find that X is ver close to 5.31 a than that X is ver close to 5.9. b Coright 001, Andrew W. Moore Probabilit Densities: Slide 10

Talking to our stomach What s the gut-feel meaning of? If then 5.31 a 0.03 z and 5.9 b 0.06 z when a value X is samled from the distribution, ou are times as likel to find that X is ver close to 5.31 a than that X is ver close to 5.9. b Coright 001, Andrew W. Moore Probabilit Densities: Slide 11 Talking to our stomach What s the gut-feel meaning of? If then 5.31 a 0.03 αz and 5.9 b 0.06 z when a value X is samled from the distribution, ou are α times as likel to find that X is ver close to 5.31 a than that X is ver close to 5.9. b Coright 001, Andrew W. Moore Probabilit Densities: Slide 1

Talking to our stomach What s the gut-feel meaning of? If then a b α when a value X is samled from the distribution, ou are α times as likel to find that X is ver close to 5.31 a than that X is ver close to 5.9. b Coright 001, Andrew W. Moore Probabilit Densities: Slide 13 Talking to our stomach What s the gut-feel meaning of? If a b then P a h < lim h 0 P b h < α X X < a + h < b + h α Coright 001, Andrew W. Moore Probabilit Densities: Slide 14

Yet another wa to view a PDF A recie for samling a random age. 1. Generate a random dot from the rectangle surrounding the PDF curve. Call the dot age,d. If d < age sto and return age 3. Else tr again: go to Ste 1. Coright 001, Andrew W. Moore Probabilit Densities: Slide 15 Test our understanding True or False: : 1 : P X 0 Coright 001, Andrew W. Moore Probabilit Densities: Slide 16

Eectations E[X] the eected value of random variable X the average value we d see if we took a ver large number of random samles of X d Coright 001, Andrew W. Moore Probabilit Densities: Slide 17 Eectations E[age]35.897 E[X] the eected value of random variable X the average value we d see if we took a ver large number of random samles of X Coright 001, Andrew W. Moore Probabilit Densities: Slide 18 d the first moment of the shae formed b the aes and the blue curve the best value to choose if ou must guess an unknown erson s age and ou ll be fined the square of our error

Eectation of a function E[age ] 1786.64 E[age] 188.6 µe[fx] the eected value of f where is drawn from X s distribution. the average value we d see if we took a ver large number of random samles of fx µ f d Note that in general: E[ f ] f E[ X ] Coright 001, Andrew W. Moore Probabilit Densities: Slide 19 Variance σ Var[X] the eected squared difference between and E[X] Var [age] 498.0 σ µ d amount ou d eect to lose if ou must guess an unknown erson s age and ou ll be fined the square of our error, and assuming ou la otimall Coright 001, Andrew W. Moore Probabilit Densities: Slide 0

Standard Deviation σ Var[X] the eected squared difference between and E[X] Var [age] σ.3 498.0 σ µ d amount ou d eect to lose if ou must guess an unknown erson s age and ou ll be fined the square of our error, and assuming ou la otimall σ Standard Deviation tical deviation of X from its mean σ Var[ X ] Coright 001, Andrew W. Moore Probabilit Densities: Slide 1 In dimensions, robabilit densit of random variables X,Y at location, Coright 001, Andrew W. Moore Probabilit Densities: Slide

In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd Coright 001, Andrew W. Moore Probabilit Densities: Slide 3 In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd P 0<mg<30 and 500<weight<3000 area under the -d surface within the red rectangle Coright 001, Andrew W. Moore Probabilit Densities: Slide 4

In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd P [mg-5/10] + [weight-3300/1500] < 1 area under the -d surface within the red oval Coright 001, Andrew W. Moore Probabilit Densities: Slide 5 In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd Take the secial case of region R everwhere. Remember that with robabilit 1, X,Y will be drawn from somewhere. So.., dd 1 Coright 001, Andrew W. Moore Probabilit Densities: Slide 6

In dimensions Let X,Y be a air of continuous random variables, and let R be some region of X,Y sace P X, Y R, R, dd, lim h 0 h P < X h + h h < Y h + Coright 001, Andrew W. Moore Probabilit Densities: Slide 7 In m dimensions Let X 1,X, X m be an n-tule of continuous random variables, and let R be some region of R m P X, X,..., X, 1 1 m R...,..., m R, 1,..., m d m,,... d, d 1 Coright 001, Andrew W. Moore Probabilit Densities: Slide 8

Indeendence X Y iff, :, If X and Y are indeendent then knowing the value of X does not hel redict the value of Y mg,weight NOT indeendent Coright 001, Andrew W. Moore Probabilit Densities: Slide 9 Indeendence X Y iff, :, If X and Y are indeendent then knowing the value of X does not hel redict the value of Y the contours sa that acceleration and weight are indeendent Coright 001, Andrew W. Moore Probabilit Densities: Slide 30

Multivariate Eectation µ X E [ X] d E[mg,weight] 4.5,600 The centroid of the cloud Coright 001, Andrew W. Moore Probabilit Densities: Slide 31 Multivariate Eectation E [ f X] f d Coright 001, Andrew W. Moore Probabilit Densities: Slide 3

Test our understanding Question : When if ever does E [ X + Y] E[ X ] + E[ Y]? All the time? Onl when X and Y are indeendent? It can fail even if X and Y are indeendent? Coright 001, Andrew W. Moore Probabilit Densities: Slide 33 Bivariate Eectation E [ f, ] f,, dd if f, then E[ f X, Y ], dd if f, then E[ f X, Y], dd if f, + then E[ f X, Y ] +, dd E [ X + Y] E[ X ] + E[ Y] Coright 001, Andrew W. Moore Probabilit Densities: Slide 34

σ Bivariate Covariance Cov[ X, Y ] E[ X µ Y µ ] σ σ σ σ Cov[ X, X ] Var[ X ] E[ X µ Cov[ Y, Y] Var[ Y ] E[ Y µ ] ] Coright 001, Andrew W. Moore Probabilit Densities: Slide 35 σ Bivariate Covariance Cov[ X, Y ] E[ X µ Y µ ] σ σ σ Cov[ X] σ Cov[ X, X ] Var[ X ] E[ X µ Cov[ Y, Y] Var[ Y ] E[ Y µ X Write X, then Y ] σ σ T E[ X µ X µ ] Σ σ σ Coright 001, Andrew W. Moore Probabilit Densities: Slide 36 ]

Covariance Intuition σ weight 700 E[mg,weight] 4.5,600 σ weight 700 σ mg 8 σ mg 8 Coright 001, Andrew W. Moore Probabilit Densities: Slide 37 Covariance Intuition Princial Eigenvector of Σ σ weight 700 E[mg,weight] 4.5,600 σ weight 700 σ mg 8 σ mg 8 Coright 001, Andrew W. Moore Probabilit Densities: Slide 38

Cov[ X] Covariance Fun Facts σ σ T E[ X µ X µ ] Σ σ σ True or False: If σ 0 then X and Y are indeendent True or False: If X and Y are indeendent then σ 0 True or False: If σ σ σ then X and Y are deterministicall related True or False: If X and Y are deterministicall related then σ σ σ How could ou rove or disrove these? Coright 001, Andrew W. Moore Probabilit Densities: Slide 39 General Covariance Let X X 1,X, X k be a vector of k continuous random variables T Cov [ X] E [ X µ X µ ] Σ Σ ij Cov[ X, X ] σ i j i j S is a k k smmetric non-negative definite matri If all distributions are linearl indeendent it is ositive definite If the distributions are linearl deendent it has determinant zero Coright 001, Andrew W. Moore Probabilit Densities: Slide 40

Question : When if Test our understanding ever doesvar [ X + Y] Var[ X ] + Var[ Y]? All the time? Onl when X and Y are indeendent? It can fail even if X and Y are indeendent? Coright 001, Andrew W. Moore Probabilit Densities: Slide 41 Marginal Distributions, d Coright 001, Andrew W. Moore Probabilit Densities: Slide 4

mg weight 4600 Conditional Distributions mg weight 300 mg weight 000.d.f. of X when Y Coright 001, Andrew W. Moore Probabilit Densities: Slide 43 mg weight 4600 Conditional Distributions, Wh?.d.f. of X when Y Coright 001, Andrew W. Moore Probabilit Densities: Slide 44

Coright 001, Andrew W. Moore Probabilit Densities: Slide 45 Indeendence Revisited It s eas to rove that these statements are equivalent, :, iff Y X :, :,, :, Coright 001, Andrew W. Moore Probabilit Densities: Slide 46 More useful stuff Baes Rule These can all be roved from definitions on revious slides 1 d,, z z z

Coright 001, Andrew W. Moore Probabilit Densities: Slide 47 Miing discrete and continuous variables h v A h X h P v A + <, lim 0 h 1, 1 n A v d v A Baes Rule Baes Rule A P A P A A P A A P Coright 001, Andrew W. Moore Probabilit Densities: Slide 48 Miing discrete and continuous variables PEduYears,Wealth

Miing discrete and continuous variables PEduYears,Wealth PWealth EduYears Coright 001, Andrew W. Moore Probabilit Densities: Slide 49 Miing discrete and continuous variables PEduYears,Wealth PEduYears Wealth PWealth EduYears Renormalized Aes Coright 001, Andrew W. Moore Probabilit Densities: Slide 50

What ou should know You should be able to la with discrete, continuous and mied joint distributions You should be ha with the difference between and PA You should be intimate with eectations of continuous and discrete random variables You should smile when ou meet a covariance matri Indeendence and its consequences should be second nature Coright 001, Andrew W. Moore Probabilit Densities: Slide 51 Discussion Are PDFs the onl sensible wa to handle analsis of real-valued variables? Wh is covariance an imortant concet? Suose X and Y are indeendent real-valued random variables distributed between 0 and 1: What is [minx,y]? What is E[minX,Y]? Prove that E[X] is the value u that minimizes E[X-u ] What is the value u that minimizes E[ X-u ]? Coright 001, Andrew W. Moore Probabilit Densities: Slide 5