Discrete Random Variables

Probability

Discrete Random Variables We denote discrete random variables with capital letters. A boolean random variable may be either true or false: A = true or A = false. P(a), or P(A=true), denotes the probability that A is true; this is also called the unconditional or prior probability. P(¬a), or P(A=false), denotes the probability that A is not true.

Discrete Random Variables P(a): the fraction of possible worlds in which A is true. [Diagram: the set of all worlds, with the region where A = true shaded.] Example: P(a) = .2, P(¬a) = .8

More Notation We can apply boolean operators to propositions. Probability of a AND b: P(a ∧ b) or P(a, b). Probability of a OR b: P(a ∨ b). In general we write P(α) for any proposition α. [Venn diagram with regions labeled a, ¬a, b, ¬b.]

The Axioms of Probability 1. 0 ≤ P(α) for any proposition α. 2. P(τ) = 1 if τ is a tautology. 3. P(α ∨ β) = P(α) + P(β) if α and β are contradictory propositions.

A Simple Proof 1. 0 ≤ P(α) for any proposition α. 2. P(τ) = 1 if τ is a tautology. 3. P(α ∨ β) = P(α) + P(β) if α and β are contradictory propositions. Prove that P(¬α) = 1 − P(α): P(¬α ∨ α) = 1 (axiom 2, since ¬α ∨ α is a tautology). P(¬α) + P(α) = 1 (axiom 3, since ¬α and α are contradictory). P(¬α) = 1 − P(α) (rearranging terms).

Multi-valued Random Variables We can define a random variable that can take on more than two possible values, e.g. C is one of {v_1, v_2, ..., v_N}. Note that it must be the case that Σ_{i=1}^{N} P(C = v_i) = 1. Example: W may have the domain {sunny, cloudy, rainy}.
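
As a quick check of the normalization requirement, a distribution over a multi-valued variable can be stored as a mapping from values to probabilities that sums to 1. A minimal Python sketch; the numbers are the illustrative P(W) used a couple of slides below:

    # Distribution over the weather variable W, written as a mapping
    # from values to probabilities.
    P_W = {"sunny": 0.2, "cloudy": 0.7, "rainy": 0.1}

    # The probabilities of all values in the domain must sum to 1.
    assert abs(sum(P_W.values()) - 1.0) < 1e-9
    print(P_W)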

Conditional Probability P(a | b): the probability that A is true given that B is true. Example: P(sunny) = .1, P(sunny | warm) = .3. Definition: the fraction of worlds in which B is true that also have A true: P(a | b) = P(a ∧ b) / P(b). May also be written as the product rule: P(a ∧ b) = P(a | b) P(b)

Conditional Probability [Venn diagram with regions labeled a, ¬a, b, ¬b.] P(a) = .3, P(b) = .1, P(a ∧ b) = .05, P(a | b) = .5
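
Plugging the numbers from this slide into the definition gives the same answer; a small, purely illustrative sketch:

    # Numbers from the Venn-diagram example above.
    P_a = 0.3         # P(a)
    P_b = 0.1         # P(b)
    P_a_and_b = 0.05  # P(a AND b)

    # Definition of conditional probability: P(a | b) = P(a AND b) / P(b).
    P_a_given_b = P_a_and_b / P_b
    print(P_a_given_b)  # 0.5, matching the slide

    # Product rule recovers the joint: P(a AND b) = P(a | b) * P(b).
    assert abs(P_a_given_b * P_b - P_a_and_b) < 1e-9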

Probability Distributions A probability distribution is a complete description of the probability of all possible assignments to a random variable. Examples: For a boolean variable A: P(A=true) = .1, P(A=false) = .9. For a random variable W with domain {sunny, cloudy, rainy}: P(W) = <.2, .7, .1>

Joint Probability Distribution A complete description of the probability of all possible assignments to all variables (each such complete assignment is an atomic event).
Two boolean variables A and B:
A    B    Prob
T    T    .1
T    F    .2
F    T    .5
F    F    .2
Rooster Crows (C) and Weather (W):
C    W       Prob
T    sunny   .05
T    cloudy  .2
T    rainy   0
F    sunny   .05
F    cloudy  .4
F    rainy   .3

Inference Determining the probability of an event of interest, given everything that we know about the world. This is easy if we have the joint probability distribution. The probability of a proposition is equal to the sum of the probabilities of worlds in which it holds.
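
One concrete way to do this is to store the joint distribution as a table mapping each atomic event (world) to its probability, and sum the entries where the proposition of interest holds. A minimal sketch using the rooster/weather table above; the tuple-based representation and the helper name prob_of are my own choices, not from the slides:

    # Joint distribution over C (rooster crows) and W (weather), from the table above.
    # Each key is an atomic event: a complete assignment to all variables.
    joint = {
        (True,  "sunny"):  0.05,
        (True,  "cloudy"): 0.20,
        (True,  "rainy"):  0.00,
        (False, "sunny"):  0.05,
        (False, "cloudy"): 0.40,
        (False, "rainy"):  0.30,
    }

    def prob_of(joint, holds):
        """Probability of a proposition: sum the probabilities of the
        worlds (atomic events) in which the proposition holds."""
        return sum(p for event, p in joint.items() if holds(event))

    # P(C = true): the proposition "the rooster crows".
    print(prob_of(joint, lambda ev: ev[0]))                         # ~0.25
    # P(W = sunny OR W = rainy).
    print(prob_of(joint, lambda ev: ev[1] in ("sunny", "rainy")))   # ~0.4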

Inference Example
A    B    Prob
T    T    .1
T    F    .2
F    T    .5
F    F    .2
What is P(A = true)? P(a) = ??

Inference Example
A    B    Prob
T    T    .1
T    F    .2
F    T    .5
F    F    .2
What is P(A = true)? P(a) = .1 + .2 = .3. In general, P(Y) = Σ_z P(Y, z) (marginalization). Here Y and Z may be sets of variables, and the sum is over all possible assignments z to the variables Z.
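
A marginalization helper over such a table might look like the following sketch; the function name marginal and the tuple-keyed joint are my own illustrative choices:

    # Joint distribution over boolean variables A and B from the table above,
    # keyed by (a, b) assignments.
    joint_ab = {
        (True,  True):  0.1,
        (True,  False): 0.2,
        (False, True):  0.5,
        (False, False): 0.2,
    }

    def marginal(joint, keep_index):
        """P(Y) = sum over z of P(Y, z): sum out every variable except
        the one at position keep_index of the assignment tuple."""
        result = {}
        for assignment, p in joint.items():
            y = assignment[keep_index]
            result[y] = result.get(y, 0.0) + p
        return result

    # P(a) ~ 0.3, P(not a) ~ 0.7 (up to floating-point rounding)
    print(marginal(joint_ab, 0))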

Conditional Inference
C    W       Prob
T    sunny   .05
T    cloudy  .2
T    rainy   0
F    sunny   .05
F    cloudy  .4
F    rainy   .3
What is P(C=true | W=sunny)? Remember that P(a | b) = P(a ∧ b) / P(b). P(C=true | W=sunny) = ??

Conditional Inference
C    W       Prob
T    sunny   .05
T    cloudy  .2
T    rainy   0
F    sunny   .05
F    cloudy  .4
F    rainy   .3
What is P(C=true | W=sunny)? Remember that P(a | b) = P(a ∧ b) / P(b). P(C=true | W=sunny) = .05 / (.05 + .05) = .5
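
The same table lookup answers conditional queries by dividing the matching joint entries by the probability of the evidence; a small sketch (conditional_prob is an illustrative name, not from the slides):

    # Joint distribution over C (rooster crows) and W (weather), as above.
    joint = {
        (True,  "sunny"):  0.05,
        (True,  "cloudy"): 0.20,
        (True,  "rainy"):  0.00,
        (False, "sunny"):  0.05,
        (False, "cloudy"): 0.40,
        (False, "rainy"):  0.30,
    }

    def conditional_prob(joint, query, evidence):
        """P(query | evidence) = P(query AND evidence) / P(evidence)."""
        p_evidence = sum(p for ev, p in joint.items() if evidence(ev))
        p_both = sum(p for ev, p in joint.items() if evidence(ev) and query(ev))
        return p_both / p_evidence

    # P(C = true | W = sunny) = 0.05 / (0.05 + 0.05) = 0.5
    print(conditional_prob(joint,
                           query=lambda ev: ev[0],
                           evidence=lambda ev: ev[1] == "sunny"))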

Learning a Joint Probability Distribution Where does the joint probability distribution come from? Maybe we (or an expert) make it up. Or we can learn it from data: P(row) = (# instances that match row) / (# total instances).
C    W       #days   Prob
T    sunny   12      12/38 = .32
T    cloudy   3       3/38 = .08
T    rainy    0       0/38 = .0
F    sunny    8       8/38 = .21
F    cloudy  10      10/38 = .26
F    rainy    5       5/38 = .13
total: 38 days. ANY PROBLEMS?
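
Estimating the table from data is just normalized counting; a sketch using the day counts above:

    # Observed counts of (rooster crows, weather) over 38 days, from the table above.
    counts = {
        (True,  "sunny"):  12,
        (True,  "cloudy"):  3,
        (True,  "rainy"):   0,
        (False, "sunny"):   8,
        (False, "cloudy"): 10,
        (False, "rainy"):   5,
    }

    total = sum(counts.values())  # 38

    # P(row) = (# instances that match the row) / (# total instances)
    joint_estimate = {row: n / total for row, n in counts.items()}

    for row, p in joint_estimate.items():
        print(row, round(p, 2))   # matches the Prob column above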

Problems with Learning a PD This will quickly break down if we have more than a few variables: the table needs one row for every joint assignment, so its size, and the amount of data needed to estimate it, grows exponentially with the number of variables.

Independence Variables A and B are independent if P(a | b) = P(a). We can also write: P(a ∧ b) = P(a) P(b). (Remember the product rule: P(a ∧ b) = P(a | b) P(b).) Independence is a big deal for probabilistic reasoning. Specifying the full joint PD requires exponential storage, and learning it requires an exponentially growing amount of data. These both become linear if all variables are independent. This is called factoring the joint distribution.
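
To see the saving concretely: with n independent boolean variables we store n numbers instead of 2^n rows, and any joint entry is just a product of marginals. A minimal sketch; the variable names and probabilities are made up for illustration:

    # Marginal probabilities of three boolean variables assumed independent:
    # 3 numbers instead of 2**3 = 8 joint-table rows.
    P = {"rain": 0.1, "traffic": 0.3, "late_bus": 0.2}

    def joint_if_independent(assignment):
        """P(x1 AND x2 AND ...) = product of the marginals P(xi)
        when all variables are independent."""
        p = 1.0
        for var, value in assignment.items():
            p *= P[var] if value else (1.0 - P[var])
        return p

    # P(rain AND not traffic AND late_bus) = 0.1 * 0.7 * 0.2 ~ 0.014
    print(joint_if_independent({"rain": True, "traffic": False, "late_bus": True}))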

Bayes' Rule The most useful identity in AI: P(h | e) = P(e | h) P(h) / P(e). Think of h as the hypothesis and e as the evidence. P(e | h) is called the likelihood. posterior = (likelihood × prior) / evidence. Why would we know P(e | h) and not P(h | e)?

Diagnosis I have a cough, and I want to know the probability that I have pneumonia. P(h | e) = P(e | h) P(h) / P(e). P(cough) = .1, P(pneumonia) = .001, P(cough | pneumonia) = .5. P(pneumonia | cough) = (.5 × .001) / .1 = .005 = .5%
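
The arithmetic of the diagnosis example, as a quick sketch using only the numbers given above:

    # Numbers from the diagnosis example above.
    p_cough = 0.1                  # P(cough), the evidence probability
    p_pneumonia = 0.001            # P(pneumonia), the prior
    p_cough_given_pneumonia = 0.5  # P(cough | pneumonia), the likelihood

    # Bayes' rule: P(h | e) = P(e | h) * P(h) / P(e)
    p_pneumonia_given_cough = p_cough_given_pneumonia * p_pneumonia / p_cough
    print(p_pneumonia_given_cough)   # ~0.005, i.e. 0.5%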

P(e)? That P(e) term seems a little odd. How do we know the prior probability of the evidence? We don't need to: maybe we don't care (if we only want to know which hypothesis is most likely). Otherwise: P(e) = Σ_{h ∈ H} P(e | h) P(h)
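
When we want the posterior over a whole set of mutually exclusive, exhaustive hypotheses, P(e) is just the normalizing constant obtained by summing P(e | h) P(h) over all of them. A sketch under that assumption; apart from pneumonia's prior and likelihood from the previous slide, the hypotheses and numbers are made up for illustration:

    # Illustrative prior over mutually exclusive, exhaustive hypotheses,
    # and the likelihood P(cough | h) of the evidence under each.
    prior = {"pneumonia": 0.001, "cold": 0.05, "healthy": 0.949}
    likelihood = {"pneumonia": 0.5, "cold": 0.8, "healthy": 0.06}

    # P(e) = sum over h of P(e | h) * P(h)
    p_e = sum(likelihood[h] * prior[h] for h in prior)

    # Posterior for each hypothesis via Bayes' rule; they sum to 1.
    posterior = {h: likelihood[h] * prior[h] / p_e for h in prior}
    print(round(p_e, 5))
    print({h: round(p, 3) for h, p in posterior.items()})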