Probability review. Adopted from notes of Andrew W. Moore and Eric Xing from CMU. Copyright Andrew W. Moore Slide 1

Similar documents
Probabilistic and Bayesian Analytics

Engineering Risk Benefit Analysis

Probabilistic and Bayesian Analytics Based on a Tutorial by Andrew W. Moore, Carnegie Mellon University

Homework Assignment 3 Due in class, Thursday October 15

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING

Probability Basics. Robot Image Credit: Viktoriya Sukhanova 123RF.com

} Often, when learning, we deal with uncertainty:

For example, if the drawing pin was tossed 200 times and it landed point up on 140 of these trials,

Lecture 3: Probability Distributions

CS 798: Homework Assignment 2 (Probability)

Expected Value and Variance

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Stochastic Structural Dynamics

Modeling motion with VPython Every program that models the motion of physical objects has two main parts:

Classification Bayesian Classifiers

Sources of Uncertainty

Homework 9 for BST 631: Statistical Theory I Problems, 11/02/2006

Bayes and Naïve Bayes Classifiers CS434

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

CISC 4631 Data Mining

An Experiment/Some Intuition (Fall 2006): Lecture 18 The EM Algorithm heads coin 1 tails coin 2 Overview Maximum Likelihood Estimation

between standard Gibbs free energies of formation for products and reactants, ΔG! R = ν i ΔG f,i, we

SDMML HT MSc Problem Sheet 4

9.913 Pattern Recognition for Vision. Class IV Part I Bayesian Decision Theory Yuri Ivanov

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /

PROBABILITY PRIMER. Exercise Solutions

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Limited Dependent Variables

Generative classification models

Randomness and Computation

Clustering with Gaussian Mixtures

arxiv: v2 [stat.me] 26 Jun 2012

Optimization Methods for Engineering Design. Logic-Based. John Hooker. Turkish Operational Research Society. Carnegie Mellon University

Machine learning: Density estimation

MIMA Group. Chapter 2 Bayesian Decision Theory. School of Computer Science and Technology, Shandong University. Xin-Shun SDU

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements

Bayes Nets for representing and reasoning about uncertainty

Bayesian decision theory. Nuno Vasconcelos ECE Department, UCSD

First Year Examination Department of Statistics, University of Florida

1/10/18. Definitions. Probabilistic models. Why probabilistic models. Example: a fair 6-sided dice. Probability

PhysicsAndMathsTutor.com

Probability Theory (revisited)

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

Lecture 20: Hypothesis testing

Evaluation for sets of classes

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Probability and Random Variable Primer

xp(x µ) = 0 p(x = 0 µ) + 1 p(x = 1 µ) = µ

Laboratory 1c: Method of Least Squares

Applied Stochastic Processes

CIS587 - Artificial Intellgence. Bayesian Networks CIS587 - AI. KB for medical diagnosis. Example.

The conjugate prior to a Bernoulli is. A) Bernoulli B) Gaussian C) Beta D) none of the above

Learning from Data 1 Naive Bayes

Statistics and Probability Theory in Civil, Surveying and Environmental Engineering

Statistical pattern recognition

Laboratory 3: Method of Least Squares

1.4. Experiments, Outcome, Sample Space, Events, and Random Variables

Chapter 1. Probability

Rules of Probability

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

MLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012

Kernel Methods and SVMs Extension

SELECTED PROOFS. DeMorgan s formulas: The first one is clear from Venn diagram, or the following truth table:

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Basically, if you have a dummy dependent variable you will be estimating a probability.

Robert Eisberg Second edition CH 09 Multielectron atoms ground states and x-ray excitations

Introduction to Regression

Linear Approximation with Regularization and Moving Least Squares

Artificial Intelligence Bayesian Networks

Hidden Markov Models

Classification learning II

Multipoint Analysis for Sibling Pairs. Biostatistics 666 Lecture 18

A be a probability space. A random vector

Learning with Maximum Likelihood

Lecture 6: Introduction to Linear Regression

Statistics and Quantitative Analysis U4320. Segment 3: Probability Prof. Sharyn O Halloran

Expectation Maximization Mixture Models HMMs

Hidden Markov Models

ST2352. Working backwards with conditional probability. ST2352 Week 8 1

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Statistics for Economics & Business

Markov Logic Network. Matthew Richardson and Pedro Domingos. IE , Present by Hao Wu (haowu4)

Ensemble Methods: Boosting

PES 1120 Spring 2014, Spendier Lecture 6/Page 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

= z 20 z n. (k 20) + 4 z k = 4

Decision-making and rationality

Course 395: Machine Learning - Lectures

Bayesian Networks. Course: CS40022 Instructor: Dr. Pallab Dasgupta

Statistical analysis using matlab. HY 439 Presented by: George Fortetsanakis

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE

# c i. INFERENCE FOR CONTRASTS (Chapter 4) It's unbiased: Recall: A contrast is a linear combination of effects with coefficients summing to zero:

Maximum Likelihood Estimation

Gaussian Mixture Models

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Some basic statistics and curve fitting techniques

Bayesian Decision Theory

CHAPTER 3: BAYESIAN DECISION THEORY

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Transcription:

robablty reew dopted from notes of ndrew W. Moore and Erc Xng from CMU Copyrght ndrew W. Moore Slde So far our classfers are determnstc! For a gen X, the classfers we learned so far ge a sngle predcted y alue In contrast, a probablstc predcton returns a probablty oer the output space y X, y X We can easly thnk of stuatons when ths would be ery useful! Gen y X.49, y- X.5, how would you predct? What f I tell you t s much more costly to mss an poste example than the other way around? Copyrght ndrew W. Moore Slde

Dscrete Random Varables s a oolean-alued random arable f denotes an eent, and there s some degree of uncertanty as to whether occurs. Examples The US presdent n 3 wll be male You wake up tomorrow wth a headache You hae Ebola Copyrght ndrew W. Moore Slde 3 robabltes We wrte as the fracton of possble worlds n whch s true We could at ths pont spend hours on the phlosophy of ths. ut we won t. Copyrght ndrew W. Moore Slde 4

Vsualzng Eent space of all possble worlds Worlds n whch s true rea of reddsh oal Its area s Worlds n whch s False Copyrght ndrew W. Moore Slde 5 asc axoms < < True False or + - and Worlds n whch s False or and Smple addton and subtracton Copyrght ndrew W. Moore Slde 6 3

Elementary robablty Theorems ~ + ^ + ^ ~ Copyrght ndrew W. Moore Slde 7 Multalued Random Varables Suppose can take on more than alues s a random arable wth arty k f t can take on exactly one alue out of {,,.. k } Thus f k Copyrght ndrew W. Moore Slde 8 4

5 Copyrght ndrew W. Moore Slde 9 n easy fact about Multalued Random Varables: Usng the axoms of probablty < <, True, False or + - and nd assumng that obeys It s easy to proe that f k Copyrght ndrew W. Moore Slde n easy fact about Multalued Random Varables: Usng the axoms of probablty < <, True, False or + - and nd assumng that obeys It s easy to proe that f k nd thus we can proe k

6 Copyrght ndrew W. Moore Slde nother fact about Multalued Random Varables: Usng the axoms of probablty < <, True, False or + - and nd assumng that obeys It s easy to proe that f k ] [ Copyrght ndrew W. Moore Slde nother fact about Multalued Random Varables: Usng the axoms of probablty < <, True, False or + - and nd assumng that obeys It s easy to proe that f k ] [ nd thus we can proe k

Elementary robablty n ctures k k Copyrght ndrew W. Moore Slde 3 Condtonal robablty Fracton of worlds n whch s true that also hae true H Hae a headache F Comng down wth Flu F H / F /4 H F / H Headaches are rare and flu s rarer, but f you re comng down wth flu there s a 5-5 chance you ll hae a headache. Copyrght ndrew W. Moore Slde 4 7

Condtonal robablty F H F Fracton of flu-nflcted worlds n whch you hae a headache H Hae a headache F Comng down wth Flu H / F /4 H F / H #worlds wth flu and headache ------------------------------------ #worlds wth flu rea of H and F regon ------------------------------ rea of F regon H ^ F ----------- F Copyrght ndrew W. Moore Slde 5 Defnton of Condtonal robablty ^ ----------- Corollary: The Chan Rule ^ Copyrght ndrew W. Moore Slde 6 8

robablstc Inference F H Hae a headache F Comng down wth Flu H H / F /4 H F / One day you wake up wth a headache. You thnk: Drat! 5% of flus are assocated wth headaches so I must hae a 5-5 chance of comng down wth flu Is ths reasonng good? Copyrght ndrew W. Moore Slde 7 robablstc Inference F H Hae a headache F Comng down wth Flu H H / F /4 H F / F ^ H F H Copyrght ndrew W. Moore Slde 8 9

What we ust dd ^ ----------- --------------- Ths s ayes Rule ayes, Thomas 763 n essay towards solng a problem n the doctrne of chances. hlosophcal Transactons of the Royal Socety of London, 53:37-48 Copyrght ndrew W. Moore Slde 9 Usng ayes Rule to Gamble $. The Wn enelope has a dollar and four beads n t The Lose enelope has three beads and no money Tral queston: someone draws an enelope at random and offers to sell t to you. How much should you pay? Copyrght ndrew W. Moore Slde

Usng ayes Rule to Gamble $. The Wn enelope has a dollar and four beads n t The Lose enelope has three beads and no money Interestng queston: before decdng, you are allowed to see one bead drawn from the enelope. Suppose t s black: How much should you pay? Suppose t s red: How much should you pay? Copyrght ndrew W. Moore Slde Calculaton $. Copyrght ndrew W. Moore Slde

Contnuous robablty Dstrbuton contnuous random arable x can take any alue n an nteral on the real lne X usually corresponds to some real-alued measurements, e.g., today s lowest temperature It s not possble to talk about the probablty of a contnuous random arable takng an exact alue --- x56. Instead we talk about the probablty of the random arable takng a alue wthn a gen nteral x [5, 6] Copyrght ndrew W. Moore Slde 3 DF: probablty densty functon The probablty of X takng alue n a gen range [x, x] s defned to be the area under the DF cure between x and x We use fx to represent the DF of x Note: fx fx can be larger than f x dx X [ x, x] x x f x dx Copyrght ndrew W. Moore Slde 4

What s the ntute meanng of fx? If f xα*a and f xa Then when x s sampled from ths dstrbuton, you are α tmes more lkely to see that x s ery close to x than that x s ery close to x Copyrght ndrew W. Moore Slde 5 Some commonly used dstrbutons nomal dstrbuton: nomaln, p the probablty to see x heads out of n flps x n n L n x + x! x n x p p Multnomal dstrbuton: Multnomaln, [x, x,, x k ] The probablty to see x ones, x twos, etc, out of n dce rolls n! x x x [ x x x k,,..., k ] θ θ Lθ k x! x! Lx! k Copyrght ndrew W. Moore Slde 6 3

Contnuous Dstrbutons f f f Copyrght ndrew W. Moore Slde 7 The Jont Dstrbuton Recpe for makng a ont dstrbuton of M arables: Example: oolean arables,, C Copyrght ndrew W. Moore Slde 8 4

The Jont Dstrbuton Recpe for makng a ont dstrbuton of M arables:. Make a truth table lstng all combnatons of alues of your arables f there are M oolean arables then the table wll hae M rows. Example: oolean arables,, C C Copyrght ndrew W. Moore Slde 9 The Jont Dstrbuton Recpe for makng a ont dstrbuton of M arables:. Make a truth table lstng all combnatons of alues of your arables f there are M oolean arables then the table wll hae M rows.. For each combnaton of alues, say how probable t s. Example: oolean arables,, C C rob.3.5..5.5..5. Copyrght ndrew W. Moore Slde 3 5

The Jont Dstrbuton Recpe for makng a ont dstrbuton of M arables:. Make a truth table lstng all combnatons of alues of your arables f there are M oolean arables then the table wll hae M rows.. For each combnaton of alues, say how probable t s. 3. If you subscrbe to the axoms of probablty, those numbers must sum to. Example: oolean arables,, C.5 C..5.5 rob.3.5..5.5..5...5 C.3. Copyrght ndrew W. Moore Slde 3 Usng the Jont One you hae the JD you can ask for the probablty of any logcal expresson nolng your attrbute E rows matchng E row Copyrght ndrew W. Moore Slde 3 6

Usng the Jont oor Male.4654 E rows matchng E row Copyrght ndrew W. Moore Slde 33 Inference wth the Jont E E E E E rows matchng E and E rows matchng E row row Copyrght ndrew W. Moore Slde 34 7

Inference wth the Jont E E E E E rows matchng E and E rows matchng E row row Male oor.4654 /.764.6 Copyrght ndrew W. Moore Slde 35 So we hae learned that Jont dstrbuton s extremely useful! we can do all knds of cool nference I e got a sore neck: how lkely am I to hae menngts? Many ndustres grow around eyesan Inference: examples nclude medcne, pharma, Engne dagnoss etc. ut, HOW do we get them? We can learn from data Copyrght ndrew W. Moore Slde 36 8

Learnng a ont dstrbuton uld a JD table for your attrbutes n whch the probabltes are unspecfed C rob???????? Fracton of all records n whch and are True but C s False The fll n each row wth records matchng row ˆ row total number of records C rob.3.5..5.5..5. Copyrght ndrew W. Moore Slde 37 9