STAT 598L Probabilistic Graphical Models. Instructor: Sergey Kirshner. Probability Review


Some slides are taken (or modified) from Carlos Guestrin's 10-708 Probabilistic Graphical Models, Fall 2008, at CMU.

Graphical Models
- How do we specify the model?
  - What are the variables of interest?
  - What are their values?
  - How likely are their combinations?
- We need to specify a joint probability distribution,
  - but in a compact way,
  - by exploiting local structure in the domain.

Sample Space and Events
- $\Omega$: the sample space, the set of possible results of an experiment.
  - If you toss a coin twice, $\Omega = \{HH, HT, TH, TT\}$.
- Event: a subset of $\Omega$.
  - "First toss is a head" $= \{HH, HT\}$.
- $S$: the event space, a set of events (a $\sigma$-algebra) that
  - is closed under finite and countably infinite unions and under complements,
  - is therefore also closed under other set operations: intersection, set difference, etc.,
  - contains the empty event $\emptyset$ and $\Omega$.
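The two-toss example can be written out directly with Python sets. This is a minimal sketch; the variable names are illustrative and not part of the slides.

```python
from itertools import product

# Sample space for two coin tosses: every ordered pair of H/T.
omega = {"".join(t) for t in product("HT", repeat=2)}

# An event is any subset of the sample space.
first_toss_head = {w for w in omega if w[0] == "H"}
second_toss_head = {w for w in omega if w[1] == "H"}

# Operations under which an event space is closed.
complement = omega - first_toss_head                 # first toss is a tail
union = first_toss_head | second_toss_head           # at least one head
intersection = first_toss_head & second_toss_head    # both tosses are heads

print(sorted(omega))
print(sorted(first_toss_head))
```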

Probability Measure
- Defined over $(\Omega, S)$ such that
  - $P(\alpha) \ge 0$ for all $\alpha \in S$,
  - $P(\Omega) = 1$,
  - if $\alpha, \beta$ are disjoint, then $P(\alpha \cup \beta) = P(\alpha) + P(\beta)$.
- We can deduce other properties from the above axioms:
  - $P(\emptyset) = 0$
  - $P(\alpha \cup \beta) = P(\alpha) + P(\beta) - P(\alpha \cap \beta)$ (inclusion-exclusion principle)
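As a quick check of the axioms and of inclusion-exclusion, the sketch below assumes a fair coin, so every outcome of the two-toss sample space is equally likely. The uniform measure and the function name `prob` are assumptions made for illustration.

```python
from fractions import Fraction

omega = {"HH", "HT", "TH", "TT"}

def prob(event):
    """Uniform measure (assumes a fair coin): |event| / |omega|."""
    return Fraction(len(event & omega), len(omega))

alpha = {"HH", "HT"}   # first toss is a head
beta = {"HH", "TH"}    # second toss is a head

# Inclusion-exclusion: P(a U b) = P(a) + P(b) - P(a n b).
lhs = prob(alpha | beta)
rhs = prob(alpha) + prob(beta) - prob(alpha & beta)
print(lhs, rhs)
```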

Visualization
[Figure omitted in transcript]
- We can go on and define conditional probability using the above visualization.

Conditional Probability
- $P(F \mid H)$ = fraction of worlds in which $H$ is true that also have $F$ true (provided $P(H) \neq 0$):
  $P(F \mid H) = \frac{P(F \cap H)}{P(H)}$
[Figure: overlapping events $F$ and $H$]
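The definition translates into a few lines of code. The sketch assumes the fair two-toss sample space used earlier in the deck; `cond_prob` is a hypothetical helper name.

```python
from fractions import Fraction

omega = {"HH", "HT", "TH", "TT"}

def prob(event):
    # Uniform measure over the four outcomes (fair-coin assumption).
    return Fraction(len(event), len(omega))

def cond_prob(f, h):
    """P(F | H) = P(F n H) / P(H), defined only when P(H) != 0."""
    if prob(h) == 0:
        raise ValueError("P(H) must be non-zero")
    return prob(f & h) / prob(h)

both_heads = {"HH"}
first_head = {"HH", "HT"}
p = cond_prob(both_heads, first_head)
print(p)
```

Among the worlds where the first toss is a head, half also have a head on the second toss.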

Law of Total Probability
[Figure: $\Omega$ partitioned into blocks $B_1, \ldots, B_7$, with event $A$ overlapping several of them]
- Given a partition $\{B_1, B_2, \ldots\}$ of $\Omega$:
  $P(A) = \sum_i P(B_i)\, P(A \mid B_i)$
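A small numerical check of the law, assuming the fair two-toss sample space and partitioning it by the number of heads. The partition choice is an illustration, not taken from the slides.

```python
from fractions import Fraction

omega = {"HH", "HT", "TH", "TT"}

def prob(event):
    # Uniform measure over the four outcomes (fair-coin assumption).
    return Fraction(len(event), len(omega))

def cond_prob(a, b):
    return prob(a & b) / prob(b)

# Partition of omega by number of heads: {TT}, {HT, TH}, {HH}.
partition = [{w for w in omega if w.count("H") == k} for k in range(3)]

a = {"HH", "HT"}  # first toss is a head
total = sum(prob(b) * cond_prob(a, b) for b in partition)
print(total, prob(a))
```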

Chain Rule
- There is an equivalent chain for every ordering of the events.
- It is the foundation of the framework for Bayesian networks.
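The equation on this slide did not survive transcription. For reference, the standard chain rule for events $\alpha_1, \ldots, \alpha_k$ (the symbols are chosen here to match the earlier slides) can be written as follows, together with the two orderings available for a pair of events:

```latex
\begin{align*}
P(\alpha_1 \cap \cdots \cap \alpha_k)
  &= P(\alpha_1)\, P(\alpha_2 \mid \alpha_1) \cdots
     P(\alpha_k \mid \alpha_1 \cap \cdots \cap \alpha_{k-1}) \\
P(\alpha \cap \beta)
  &= P(\alpha)\, P(\beta \mid \alpha)
   = P(\beta)\, P(\alpha \mid \beta)
\end{align*}
```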

Bayes Rule
- Obtained from the chain rule; the denominator can be expanded with the rule of total probability.
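The slide's equations were lost in transcription. A standard statement of Bayes rule, written under the assumption that $\{\alpha_i\}$ is a partition of $\Omega$ and $P(\beta) > 0$, is:

```latex
\begin{align*}
\text{Chain rule:} \quad
  P(\alpha \mid \beta)
  &= \frac{P(\alpha \cap \beta)}{P(\beta)}
   = \frac{P(\beta \mid \alpha)\, P(\alpha)}{P(\beta)} \\
\text{Rule of total probability:} \quad
  P(\alpha_j \mid \beta)
  &= \frac{P(\beta \mid \alpha_j)\, P(\alpha_j)}
          {\sum_i P(\beta \mid \alpha_i)\, P(\alpha_i)}
\end{align*}
```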

From Events to Random Variables
- For almost all of the semester we will be dealing with random variables (r.v.s).
- They are a concise way of specifying attributes of outcomes.
- Modeling students (Grade and Intelligence):
  - $\Omega$ = all possible students
  - What are the events?
    - Grade_A = all students with grade A
    - Grade_B = all students with grade B
    - Intelligence_High = all students with high intelligence
- We need functions that map from $\Omega$ to an attribute space.

Random Variables
[Figure: $\Omega$ mapped by $I$ (Intelligence) to {high, low} and by $G$ (Grade) to {A, B, C}]
- $\mathrm{Val}(I) = \{\text{high}, \text{low}\}$
- $\mathrm{Val}(G) = \{A, B, C\}$

Random Variables
[Figure: $\Omega$ mapped by $I$ (Intelligence) to {high, low} and by $G$ (Grade) to {A, B, C}]
- $P(I = \text{high}) = P(\{\text{all students whose intelligence is high}\})$
- $\{I^{-1}(a) : a \in \mathrm{Val}(I)\}$ is a partition of $\Omega$.
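A random variable is just a function on $\Omega$, and its preimages split $\Omega$ into disjoint blocks. The sketch below uses a made-up population of four students (names and attributes are hypothetical) and assumes each student is equally likely.

```python
from fractions import Fraction

# Hypothetical sample space: student -> (intelligence, grade).
students = {
    "ann": ("high", "A"),
    "bob": ("low", "C"),
    "cat": ("low", "B"),
    "dan": ("high", "B"),
}

def intelligence(s):
    """Random variable I: maps an outcome to a value in Val(I)."""
    return students[s][0]

def preimage(rv, value):
    """The event {s in omega : rv(s) = value}."""
    return {s for s in students if rv(s) == value}

blocks = [preimage(intelligence, v) for v in ("high", "low")]

# P(I = high) under the uniform-measure assumption.
p_high = Fraction(len(preimage(intelligence, "high")), len(students))
print(blocks, p_high)
```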

Joint Probability Distribution
- Random variables encode attributes.
- Not all possible combinations of attributes are equally likely.
  - Joint probability distributions quantify this.
- $P(X = x, Y = y) = P(x, y)$
  - How probable is it to observe these two attributes together?

Joint Probability Distribution
- $P(X = x, Y = y) = P(x, y)$
  - How probable is it to observe these two attributes together?

P(I, G)     G = A   G = B   G = C
I = low     0.1     0.2     0.4
I = high    0.15    0.1     0.05

- Generalizes to $n$ random variables.
- How can we manipulate joint probability distributions?

Conditional Probability
- Defined as for events:
  $P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)}$
- In shorthand:
  $P(x \mid y) = \frac{P(x, y)}{P(y)}$
- $P(y)$ can be computed by marginalization.
[Figure: overlapping events $F$ and $H$]
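To make the formula concrete, the following sketch conditions the deck's joint table $P(I, G)$ on $I = \text{high}$. The dictionary layout and the function name are illustrative choices; exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction as F

# Joint table P(I, G) from the slides.
joint = {
    ("low", "A"): F("0.1"), ("low", "B"): F("0.2"), ("low", "C"): F("0.4"),
    ("high", "A"): F("0.15"), ("high", "B"): F("0.1"), ("high", "C"): F("0.05"),
}

def grade_given_intelligence(i):
    """P(G | I = i) = P(I = i, G) / P(I = i)."""
    p_i = sum(p for (ii, _), p in joint.items() if ii == i)  # marginalize G
    if p_i == 0:
        raise ValueError("cannot condition on a zero-probability value")
    return {g: p / p_i for (ii, g), p in joint.items() if ii == i}

p_g_given_high = grade_given_intelligence("high")
print(p_g_given_high)
```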

Marginalization
- We know $P(X, Y)$; what is $P(X = x)$?
- We can use the Law of Total Probability:
  $P(x) = \sum_y P(x, y) = \sum_y P(y)\, P(x \mid y)$
[Figure: $\Omega$ partitioned into blocks $B_1, \ldots, B_7$, with event $A$ overlapping several of them]

Marginalization
- We know $P(X, Y)$; what is $P(X = x)$?
  $P(x) = \sum_y P(x, y)$

P(I, G)     G = A   G = B   G = C   P(I)
I = low     0.1     0.2     0.4     0.7
I = high    0.15    0.1     0.05    0.3
P(G)        0.25    0.3     0.45

- Marginals are so named because they are written in the margins.
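The row and column sums in the table can be reproduced with a short loop. This is a sketch using the table values from the slide; the dictionary representation is an illustrative choice.

```python
from fractions import Fraction as F

# Joint table P(I, G) from the slides.
joint = {
    ("low", "A"): F("0.1"), ("low", "B"): F("0.2"), ("low", "C"): F("0.4"),
    ("high", "A"): F("0.15"), ("high", "B"): F("0.1"), ("high", "C"): F("0.05"),
}

p_i, p_g = {}, {}
for (i, g), p in joint.items():
    p_i[i] = p_i.get(i, 0) + p   # sum over G for each value of I
    p_g[g] = p_g.get(g, 0) + p   # sum over I for each value of G

print({k: float(v) for k, v in p_i.items()})
print({k: float(v) for k, v in p_g.items()})
```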

Marginalization
- More variables?
  $P(x) = \sum_{y,z} P(x, y, z) = \sum_{y,z} P(y, z)\, P(x \mid y, z)$
- Marginalizing (naively) over $n-1$ variables costs $O(\exp(n-1))$.
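The exponential cost is easy to see by counting the terms in the naive sum. The sketch below assumes $n = 4$ binary variables and builds a joint table from made-up weights; the weights are purely illustrative.

```python
from itertools import product
from fractions import Fraction

n = 4
assignments = list(product((0, 1), repeat=n))

# Hypothetical unnormalized weights, chosen only for illustration.
weights = {a: 1 + sum(a) for a in assignments}
total = sum(weights.values())
joint = {a: Fraction(w, total) for a, w in weights.items()}

def marginal_first(x):
    """P(X1 = x) by summing over every assignment of the other n-1 variables."""
    p, terms = Fraction(0), 0
    for rest in product((0, 1), repeat=n - 1):
        p += joint[(x,) + rest]
        terms += 1
    return p, terms

p0, terms = marginal_first(0)
p1, _ = marginal_first(1)
print(p0, p1, terms)
```

Each marginal value needs $2^{n-1}$ additions, so the loop doubles in length with every extra binary variable.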

Chain Rule
- Same as in the event case.
- Conditional probabilities can be used to specify the joint distribution (instead of joint probability tables).
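As an illustration of specifying a joint through conditionals, the sketch below factors the deck's table as $P(I)\,P(G \mid I)$ and multiplies the pieces back together. The variable names are illustrative.

```python
from fractions import Fraction as F

# Joint table P(I, G) from the slides.
joint = {
    ("low", "A"): F("0.1"), ("low", "B"): F("0.2"), ("low", "C"): F("0.4"),
    ("high", "A"): F("0.15"), ("high", "B"): F("0.1"), ("high", "C"): F("0.05"),
}

# Factor the joint: P(I) by marginalization, then P(G | I) = P(I, G) / P(I).
p_i = {}
for (i, _), p in joint.items():
    p_i[i] = p_i.get(i, 0) + p
p_g_given_i = {(i, g): p / p_i[i] for (i, g), p in joint.items()}

# Chain rule: P(i, g) = P(i) * P(g | i) recovers the original table.
rebuilt = {(i, g): p_i[i] * p_g_given_i[(i, g)] for (i, g) in joint}
print(rebuilt == joint)
```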

Bayes Rule
- We know that $P(I = \text{high}) = 0.3$.
- If we also know that the student's math grade is A, how does this affect our belief about the student's intelligence?
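The slide's worked equation is missing from the transcript. Assuming the joint table $P(I, G)$ shown earlier in the deck describes the math grade, the update can be worked out as follows, using $P(G = A \mid I = \text{high}) = 0.15 / 0.3 = 0.5$ and $P(G = A) = 0.25$:

```latex
\begin{align*}
P(I = \text{high} \mid G = A)
  &= \frac{P(G = A \mid I = \text{high})\, P(I = \text{high})}{P(G = A)} \\
  &= \frac{0.5 \times 0.3}{0.25} = 0.6
\end{align*}
```

Under that assumption, observing the grade A raises the belief in high intelligence from 0.3 to 0.6.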

Bayes Rule
- What if another grade for the same student (in English) becomes available?
- The updates can be done in any order.
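The equation for this slide was not preserved. One standard way to write the sequential update, with $g_1$ and $g_2$ as hypothetical symbols for the math and English grades, is:

```latex
\begin{align*}
P(i \mid g_1, g_2)
  &= \frac{P(g_2 \mid i, g_1)\, P(i \mid g_1)}{P(g_2 \mid g_1)}
   = \frac{P(g_1 \mid i, g_2)\, P(i \mid g_2)}{P(g_1 \mid g_2)}
\end{align*}
```

Both expressions equal $P(i, g_1, g_2) / P(g_1, g_2)$, which is why the order in which the evidence arrives does not matter.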

Joint Probability
- Assume each variable is binary.
- We need $2^n - 1$ free parameters for $n$ variables.
- How do we interpret, store, and estimate them for even moderate $n$?
- We need to reduce the number of free parameters.
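A short tabulation shows how quickly the count grows. The function name is illustrative.

```python
def free_parameters(n):
    """Free parameters of a full joint table over n binary variables.

    There are 2**n entries, and one is fixed by the sum-to-one constraint.
    """
    return 2 ** n - 1

for n in (1, 2, 10, 20, 30):
    print(n, free_parameters(n))
```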

Independence
- $X$ is independent of $Y$ means that knowing $Y$ does not change our belief about $X$:
  - $P(x \mid y) = P(x)$ (for $P(y) > 0$)
  - Alternatively, $P(x, y) = P(x)\, P(y)$.
  - The above should hold for all $x, y$.
- Independence is symmetric and is written $X \perp Y$.
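The product condition can be tested entry by entry. The sketch below applies it to the deck's table $P(I, G)$ and to a table built as the product of its marginals; the function name `is_independent` is illustrative.

```python
from fractions import Fraction as F

def is_independent(joint):
    """Check P(x, y) == P(x) * P(y) for every entry of a two-variable table."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
    p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}
    return all(joint[(x, y)] == p_x[x] * p_y[y] for x in xs for y in ys)

# Joint table P(I, G) from the slides.
joint = {
    ("low", "A"): F("0.1"), ("low", "B"): F("0.2"), ("low", "C"): F("0.4"),
    ("high", "A"): F("0.15"), ("high", "B"): F("0.1"), ("high", "C"): F("0.05"),
}

# A table with the same marginals in which I and G are independent.
p_i = {"low": F("0.7"), "high": F("0.3")}
p_g = {"A": F("0.25"), "B": F("0.3"), "C": F("0.45")}
product_joint = {(i, g): p_i[i] * p_g[g] for i in p_i for g in p_g}

print(is_independent(joint), is_independent(product_joint))
```

In the slide's table, for example, $P(\text{low}, A) = 0.1$ while $P(\text{low})\,P(A) = 0.175$, so Intelligence and Grade are not independent there.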

CI: Conditional Independence
- R.v.s are rarely independent, but we can still leverage local structural properties like CI.
- $X \perp Y \mid Z$ if, once $Z$ is observed, knowing the value of $Y$ does not change our belief about $X$.
- The following should hold for all values of $X$, $Y$, and $Z$:
  - $P(x \mid z, y) = P(x \mid z)$
  - $P(Y = y \mid Z = z, X = x) = P(Y = y \mid Z = z)$
  - $P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z)\, P(Y = y \mid Z = z)$ (the conditional joint factors)
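The sketch below builds a three-variable joint as $P(z)\,P(x \mid z)\,P(y \mid z)$ from made-up numbers, then checks that $X \perp Y \mid Z$ holds while $X \perp Y$ does not. All probabilities here are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical conditional tables for binary X, Y, Z.
p_z = {0: F(1, 2), 1: F(1, 2)}
p_x_given_z = {0: {0: F(9, 10), 1: F(1, 10)}, 1: {0: F(1, 5), 1: F(4, 5)}}
p_y_given_z = {0: {0: F(7, 10), 1: F(3, 10)}, 1: {0: F(3, 10), 1: F(7, 10)}}

joint = {
    (x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
    for x, y, z in product((0, 1), repeat=3)
}

def p(x=None, y=None, z=None):
    """Marginal probability of the specified values (None = summed out)."""
    return sum(
        q for (a, b, c), q in joint.items()
        if x in (None, a) and y in (None, b) and z in (None, c)
    )

# Conditional independence: P(x, y | z) == P(x | z) * P(y | z) for all values.
cond_indep = all(
    p(x, y, z) / p(z=z) == (p(x=x, z=z) / p(z=z)) * (p(y=y, z=z) / p(z=z))
    for x, y, z in product((0, 1), repeat=3)
)

# Marginal independence: P(x, y) == P(x) * P(y) for all values.
marg_indep = all(
    p(x=x, y=y) == p(x=x) * p(y=y) for x, y in product((0, 1), repeat=2)
)
print(cond_indep, marg_indep)
```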

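Characterization of Conditional Independence
- Symmetry: $(X \perp Y \mid Z) \Rightarrow (Y \perp X \mid Z)$
- Decomposition: $(X \perp Y, W \mid Z) \Rightarrow (X \perp Y \mid Z) \;\&\; (X \perp W \mid Z)$
- Weak union: $(X \perp Y, W \mid Z) \Rightarrow (X \perp Y \mid Z, W)$
- Contraction: $(X \perp W \mid Z, Y) \;\&\; (X \perp Y \mid Z) \Rightarrow (X \perp Y, W \mid Z)$
- Intersection: $(X \perp Y \mid Z, W) \;\&\; (X \perp W \mid Z, Y) \Rightarrow (X \perp Y, W \mid Z)$ (holds for strictly positive distributions)
Adapted from [Pearl 88]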