CS626 Data Analysis and Simulation
Instructor: Peter Kemper, R 104A, phone 221-3462, email: kemper@cs.wm.edu
Today: Probability Primer
Quick Reference: Sheldon Ross: Introduction to Probability Models, 9th Edition, AP, Ch. 1; Berthold, Hand: Intelligent Data Analysis, Springer 99, Chapter 2 by Feelders, Statistical Concepts. 1
Today
Sample space and events
Probabilities defined on events
Kolmogorov's Axioms
Conditional probabilities
Independent events
Excursion on Reliability of Series-Parallel Systems
Bayes' Formula 2
Today's topics: what they are good for
Probabilities are introduced in an axiomatic manner. This:
Helps to achieve a sound theory
Helps to clarify what assumptions are necessary for the theory to apply, and what needs to be determined to obtain results
Clarification of terminology is necessary to:
Be precise
Avoid misunderstandings based on the ambiguity of our language
Conditional probability and Bayes' formula are fundamental for many applications and are the basis of statistical methods (Bayesian procedures). 3
Experiment and Sample Space
Definition: (Random) Experiment
A procedure that has a number of possible outcomes, where it is not certain which one will occur.
Definition: Sample Space
The set of all possible outcomes of an experiment is called the sample space (denoted by S).
Definition: Event
A subset E ⊆ S is called an event.
Set operations on events: union, intersection. 4
Algebra of Events
The algebra of events is defined by 5 laws, where A, B, C are arbitrary events (subsets of S):
Commutative laws: A ∪ B = B ∪ A, A ∩ B = B ∩ A
Associative laws: (A ∪ B) ∪ C = A ∪ (B ∪ C), (A ∩ B) ∩ C = A ∩ (B ∩ C)
Distributive laws: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Identity laws: A ∪ ∅ = A, A ∩ S = A
Complementation laws: A ∪ A^c = S, A ∩ A^c = ∅ 5
Some useful relations based on those laws
Idempotent laws: A ∪ A = A, A ∩ A = A
Domination laws: A ∪ S = S, A ∩ ∅ = ∅
Absorption laws: A ∪ (A ∩ B) = A, A ∩ (A ∪ B) = A
De Morgan's laws: (A ∪ B)^c = A^c ∩ B^c, (A ∩ B)^c = A^c ∪ B^c 6
Graphics for Events
Venn diagrams: sample space S with events A, B drawn as overlapping regions.
Tree diagrams for sequential sample spaces, e.g., throwing a coin twice: branches H/T at each stage lead to the outcomes (H,H), (H,T), (T,H), (T,T). 7
Frequency Definition of Probability
If our experiment is repeated over and over again, the proportion of time that event E occurs will be P(E).
Frequency definition of probability: P(E) = lim_{m→∞} m(E)/m, where m is the number of trials and m(E) is the number of times event E occurs in those trials.
Assumes the random experiment can be repeated under identical conditions; if repeated indefinitely, the relative frequency of occurrence of an event converges to a constant.
The law of large numbers states that this limit does exist. For small m, the relative frequency m(E)/m can show strong fluctuations. 8
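The convergence of the relative frequency to P(E) can be sketched with a short simulation. This is my illustration, not part of the slides: Python is an arbitrary choice, and the event (a fair die showing 6, with P(E) = 1/6) and trial counts are made up for the example.

```python
import random

# Frequency definition of probability: estimate P(E) for the event
# E = "a fair die shows 6" as the relative frequency m(E)/m.
# The true value is 1/6; the estimate converges as m grows.
random.seed(0)

def relative_frequency(m):
    hits = sum(1 for _ in range(m) if random.randint(1, 6) == 6)
    return hits / m

for m in (100, 10_000, 1_000_000):
    print(m, relative_frequency(m))
```

For small m the printed frequencies fluctuate noticeably; for large m they settle near 1/6, as the law of large numbers promises.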
Axiomatic Definition of Probability
Definition: For each event E of the sample space S, we assume that a number P(E) is defined that satisfies Kolmogorov's axioms:
1. 0 ≤ P(E) ≤ 1
2. P(S) = 1
3. For any sequence of mutually exclusive events E_1, E_2, ... (i.e., E_i E_j = ∅ for i ≠ j): P(∪_i E_i) = Σ_i P(E_i) 9
Some useful relations derived from the axioms
What is the probability that E does NOT occur? P(E^c) = 1 - P(E)
What is the probability of the impossible event? P(∅) = 0 10
More relations
What is the probability of a UNION of events? P(E ∪ F) = P(E) + P(F) - P(EF)
What is the probability of a union of a set of events? The inclusion/exclusion formula: P(E_1 ∪ ... ∪ E_n) = Σ_i P(E_i) - Σ_{i<j} P(E_i E_j) + Σ_{i<j<k} P(E_i E_j E_k) - ... + (-1)^{n+1} P(E_1 E_2 ... E_n)
Is there a better way to calculate this? The sum of disjoint products (SDP) formula. 11
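The two-event union formula can be verified by brute-force enumeration of a small sample space. A sketch (Python, my choice; the two dice events here are illustrative, not from the slides):

```python
from fractions import Fraction
from itertools import product

# Enumerate the sample space of two fair dice (36 equally likely outcomes)
# and verify P(E ∪ F) = P(E) + P(F) - P(EF).
S = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for s in S if event(s)), len(S))

E = lambda s: s[0] % 2 == 0      # first die is even
F = lambda s: s[0] + s[1] > 9    # sum exceeds nine

union = prob(lambda s: E(s) or F(s))
assert union == prob(E) + prob(F) - prob(lambda s: E(s) and F(s))
print(union)  # → 5/9
```

Using exact Fractions avoids floating-point noise, so the identity holds with equality rather than approximately.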
Probability space, probability system
So far this is fine for a discrete S, but in general we need to be more careful about which events E we can assign probabilities to.
A probability space is a triple (S, F, P):
with sample space S,
with a σ-field F of subsets of S from which events are selected,
with P a probability measure defined on F that satisfies Kolmogorov's axioms.
F is a collection of subsets of S that is closed under countable unions and complementation. Elements of F are called measurable. 12
Outline on Problem Solving (Goodman & Hedetniemi 77)
1. Identify the sample space S. All elements must be mutually exclusive and collectively exhaustive; all possible outcomes of the experiment should be listed separately. (Root of tricky problems: often ambiguity or an inexact formulation of the model of a physical situation.)
2. Assign probabilities to all elements of S, consistent with Kolmogorov's axioms. (In practice: estimates based on experience, analysis, or common assumptions.)
3. Identify events of interest. Recast statements as subsets of S; use the laws (algebra of events) for simplification and visualizations for clarification.
4. Compute the desired probabilities, using the axioms and laws. Often helpful: express the event of interest as a union of mutually exclusive events and sum up their probabilities. 13
Conditional Probabilities
Definition: The conditional probability of E given F is P(E | F) = P(EF) / P(F) if P(F) > 0, and it is undefined otherwise.
Interpretation: given that F has happened, only the outcomes in EF are still possible for E, so the original probability P(EF) is rescaled by 1/P(F).
Multiplication rule: P(EF) = P(E | F) P(F). 14
Two examples
1. A family has two children. What is the probability that both children are boys, given that at least one of them is a boy? (Assume a sample space S where all outcomes are equally likely.)
2. Bev can take a computer science course and get an A with probability 1/2, or a chemistry course and get an A with probability 1/3. If she flips a fair coin to decide which course to take, what is the probability that Bev will take chemistry and get an A? Let C be the event that Bev takes chemistry and A the event that she receives an A in whatever she takes. 15
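Both examples can be checked mechanically: the first by enumerating the equally likely sample space and applying the definition of conditional probability, the second by the multiplication rule. A sketch (Python is my choice; the slides give only the problem statements):

```python
from fractions import Fraction
from itertools import product

# 1) Two children: sample space {BB, BG, GB, GG}, all equally likely.
#    P(both boys | at least one boy) by counting outcomes.
children = list(product("BG", repeat=2))
at_least_one_boy = [c for c in children if "B" in c]
both_boys = [c for c in at_least_one_boy if c == ("B", "B")]
p_both_given_boy = Fraction(len(both_boys), len(at_least_one_boy))

# 2) Bev: multiplication rule P(CA) = P(A | C) P(C).
p_chem = Fraction(1, 2)           # fair coin picks the course
p_a_given_chem = Fraction(1, 3)
p_chem_and_a = p_a_given_chem * p_chem

print(p_both_given_boy, p_chem_and_a)  # → 1/3 1/6
```

The first answer, 1/3 rather than 1/2, is the classic surprise: conditioning on "at least one boy" leaves three equally likely outcomes, not two.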
Using conditional probabilities is trivial?
A variation of a classic example: a professional gambler invites you to a game for $50. He has 3 little cups and one little ball; the ball goes under one of the cups and he mixes the cups. You pick a cup. Whether you are right or wrong, the gambler will reveal one of the other cups that does not hide the little ball (choosing equally likely between the two if you picked the right cup and he has a choice). It is your choice:
To stick with your first guess, or
To change your mind and switch to the other remaining cup.
Then: if you guess the right cup you win $50; if you fail you lose $50 to him.
Two questions:
Which alternative is better according to probability theory?
Why do you lose in practice while your neighbor has more luck? 16
Gambling with professionals
S = {A, B, C}. Initial probabilities, all equal: P(A) = P(B) = P(C) = 1/3.
Assume you pick A, and C is subsequently lifted and shown to be empty.
P(A) = 1/3, P(A^c) = 2/3. So the chances for "B or C" are 2/3, and after C is revealed empty you can get all of that by switching to B!
Conditional probabilities seem to tell a different story. Conditioning on "the ball is not under C":
P(A | C^c) = P(A C^c) / P(C^c) = P(A) / P(C^c) = (1/3) / (2/3) = 1/2
P(A^c | C^c) = P(A^c C^c) / P(C^c) = P(B) / P(C^c) = (1/3) / (2/3) = 1/2
What is right? 17
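Before resolving the apparent contradiction, the question can be settled empirically. A minimal simulation sketch of the stick vs. switch strategies (Python and the trial count are my choices, not from the slides):

```python
import random

# Simulate the cup game: the gambler always reveals an empty cup you
# did not pick, choosing uniformly when your first pick was correct
# and he has two empty cups to choose from.
random.seed(1)

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        ball = random.choice("ABC")
        pick = "A"  # by symmetry the first pick can be fixed
        empty_others = [c for c in "ABC" if c != pick and c != ball]
        revealed = random.choice(empty_others)
        if switch:
            pick = next(c for c in "ABC" if c not in (pick, revealed))
        wins += (pick == ball)
    return wins / trials

print("stick:", play(False), "switch:", play(True))
```

The simulated win rates come out near 1/3 for sticking and 2/3 for switching, matching the "P(B or C) = 2/3" argument rather than the naive 1/2 vs. 1/2 conditioning.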
Independent events
Definition: Two events E and F are independent if P(EF) = P(E) P(F).
This also means P(E | F) = P(E) (for P(F) > 0).
In English: E and F are independent if knowledge that F has occurred does not affect the probability that E occurs.
Notes:
If E, F are independent, then so are E, F^c and E^c, F and E^c, F^c.
Generalizes from 2 to n events; e.g., for n = 3, every subset of the events must be independent.
Mutually exclusive is not the same as independent. 18
Example
Toss two fair dice. Let E_1 be the event that the sum of the dice is six and F the event that the first die is a four.
P(E_1 F) = P({(4,2)}) = 1/36, but P(E_1) P(F) = (5/36)(1/6) = 5/216. Thus E_1 and F are not independent.
Same experiment, except let E_2 be the event that the sum of the dice is seven.
P(E_2 F) = P({(4,3)}) = 1/36 = (6/36)(1/6) = P(E_2) P(F). Thus E_2 and F are independent. 19
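Both independence claims can be confirmed by enumerating the 36 outcomes. A sketch (Python is my choice; the events are exactly those of the example):

```python
from fractions import Fraction
from itertools import product

# Check independence of the dice events by exhaustive enumeration.
S = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for s in S if event(s)), len(S))

F  = lambda s: s[0] == 4          # first die is four
E1 = lambda s: s[0] + s[1] == 6   # sum is six
E2 = lambda s: s[0] + s[1] == 7   # sum is seven

# E1 and F are dependent; E2 and F are independent.
assert prob(lambda s: E1(s) and F(s)) != prob(E1) * prob(F)
assert prob(lambda s: E2(s) and F(s)) == prob(E2) * prob(F)
```

The intuition: "sum is seven" is the only sum value that is achievable no matter what the first die shows, which is why it alone is independent of the first die.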
Joint and pairwise independence
A ball is drawn from an urn containing four balls numbered 1, 2, 3, 4, each equally likely. Let E = {1,2}, F = {1,3}, G = {1,4}. Then we have:
P(EF) = P(EG) = P(FG) = 1/4, and P(E) P(F) = P(E) P(G) = P(F) P(G) = 1/4,
but P(EFG) = 1/4 ≠ 1/8 = P(E) P(F) P(G).
The events are pairwise independent, but not jointly independent.
A sequence of experiments results in either a success or a failure, where E_i, i ≥ 1, denotes success in the i-th experiment. If for all i_1, i_2, ..., i_n: P(E_{i_1} E_{i_2} ... E_{i_n}) = Π_j P(E_{i_j}), we say the sequence of experiments consists of independent trials. 20
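The urn example can be checked directly. A sketch (Python is my choice, and I assume the standard choice of events E = {1,2}, F = {1,3}, G = {1,4} for this classic example):

```python
from fractions import Fraction

# Urn with balls 1..4, one drawn uniformly at random.
S = {1, 2, 3, 4}
E, F, G = {1, 2}, {1, 3}, {1, 4}

def prob(A):
    return Fraction(len(A & S), len(S))

# Pairwise independent: each pair intersects only in {1}.
assert prob(E & F) == prob(E) * prob(F)
assert prob(E & G) == prob(E) * prob(G)
assert prob(F & G) == prob(F) * prob(G)
# ... but not jointly independent: P(EFG) = 1/4, not 1/8.
assert prob(E & F & G) != prob(E) * prob(F) * prob(G)
```

Ball 1 lies in all three events, which inflates the triple intersection to 1/4 even though every pair behaves independently.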
Excursion: Reliability Analysis with Reliability Block Diagrams
Reliability of series-parallel systems. Motivation:
Illustrate how probabilities can be applied
Illustrate how powerful the independence assumption is
We consider a set of components with index i = 1, 2, ...
Event A_i = "component i is functioning properly"; the reliability R_i of component i is the probability P(A_i).
Series system: the entire system fails if any of its components fails.
Parallel system: the entire system fails only if all of its components fail.
Key assumption: failures of components are independent.
For now, R is a probability; later R will be a function of time t. 21
Reliability Analysis (if component failures are independent)
Reliability of a series system: R_s = Π_{i=1}^{n} R_i (product law of reliabilities), based on the assumption of a series connection. Note how quickly R_s degrades as n grows.
Reliability of a parallel system: let F_i = 1 - R_i be the unreliability of a component and F_p = 1 - R_p that of the parallel system. Then F_p = Π_{i=1}^{n} F_i, i.e., R_p = 1 - Π_{i=1}^{n} (1 - R_i) (product law of unreliabilities). Note: also a law of diminishing returns (the rate of increase in reliability decreases rapidly as n increases).
Reliability of a series-parallel system: of n serial stages, stage i has n_i identical components in parallel, so R = Π_{i=1}^{n} (1 - (1 - R_i)^{n_i}). 22
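The three product laws translate directly into code. A minimal sketch, assuming independent component failures as on the slide (Python and the example reliabilities 0.9 / 0.8 are my illustrative choices):

```python
from math import prod

# Product laws for systems of independent components.

def r_series(rs):
    # System works only if every component works.
    return prod(rs)

def r_parallel(rs):
    # System fails only if every component fails.
    return 1 - prod(1 - r for r in rs)

def r_series_parallel(stages):
    # stages: list of (r_i, n_i) pairs, n_i identical parallel
    # components at serial stage i.
    return prod(1 - (1 - r) ** n for r, n in stages)

print(r_series([0.9] * 3))                      # ≈ 0.729, degrades with n
print(r_parallel([0.9] * 3))                    # ≈ 0.999, diminishing returns
print(r_series_parallel([(0.9, 2), (0.8, 3)]))  # ≈ 0.982
```

Three components of reliability 0.9 already drop a series system below 0.73, while the same three in parallel push reliability to 0.999: the two product laws pull in opposite directions.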
Reliability Block Diagrams
Series-parallel RBD of a network: serial stages of blocks R1, R2, a parallel stage of three R3 blocks, a parallel stage of two R4 blocks, and R5.
Other representations: fault trees.
Limits: more general dependencies. Techniques:
Structure function
Inclusion/exclusion formula (or SDP)
Approach with binary decision diagrams (BDD), Zang 99 (in Trivedi Ch. 1)
Factoring/conditioning
More techniques exist for more general settings. 23
Bayes' Formula
Let E and F be events. We may express E as: E = EF ∪ EF^c.
Because EF and EF^c are mutually exclusive, we can say: P(E) = P(EF) + P(EF^c) = P(E | F) P(F) + P(E | F^c) (1 - P(F)).
In English: the probability of event E is a weighted average of the conditional probability of E given that F has occurred and the conditional probability of E given that F has not occurred. 24
Example: A student takes a multiple choice test. Let:
p: probability that he/she knows the answer,
1 - p: probability that he/she guesses.
Assume guessing has success probability 1/m, where m is the number of multiple choice alternatives.
What is the conditional probability that a student knew the answer to a question, given that he/she answered it correctly?
Let C be the event that the student answers correctly and K the event that the student actually knew the answer.
Known: P(K) = p, P(K^c) = 1 - p, P(C | K^c) = 1/m, P(C | K) = 1. Then:
P(K | C) = P(C | K) P(K) / (P(C | K) P(K) + P(C | K^c) P(K^c)) = p / (p + (1 - p)/m) = mp / (1 + (m - 1)p) 25
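The Bayes computation for this example is short enough to script. A sketch (Python is my choice; the function name and the sample values p = 1/2, m = 4 are illustrative):

```python
from fractions import Fraction

def p_knew_given_correct(p, m):
    # Bayes: P(K|C) = P(C|K)P(K) / (P(C|K)P(K) + P(C|K^c)P(K^c))
    # with P(C|K) = 1 and P(C|K^c) = 1/m.
    p = Fraction(p)
    return p / (p + Fraction(1, m) * (1 - p))

# With p = 1/2 and m = 4 alternatives, a correct answer raises the
# probability that the student actually knew it from 1/2 to 4/5.
print(p_knew_given_correct(Fraction(1, 2), 4))  # → 4/5
```

A correct answer is evidence of knowledge, so P(K | C) > P(K) whenever m > 1; the larger m, the less likely a lucky guess, and the stronger the update.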
Another example: Laboratory blood test
Test: 95% effective in detecting a certain disease when it is present; 1% error rate of saying that a healthy person has the disease.
Question: if 0.5% of the population has the disease, what is the probability that a person has the disease given that the test result is positive?
Let D be the event that the tested person has the disease and E the event that the test result is positive.
Known: P(E | D) = .95, P(E | D^c) = .01, P(D) = .005, P(D^c) = .995.
P(D | E) = P(E | D) P(D) / (P(E | D) P(D) + P(E | D^c) P(D^c)) = .00475 / (.00475 + .00995) ≈ .323 26
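Plugging the slide's numbers into Bayes' formula (a sketch in Python, my choice of language; the variable names are mine):

```python
# Blood test example: P(D | E) via Bayes' formula.
p_d, p_dc = 0.005, 0.995          # prior: disease / no disease
p_e_given_d, p_e_given_dc = 0.95, 0.01  # test sensitivity / false-positive rate

p_e = p_e_given_d * p_d + p_e_given_dc * p_dc  # total probability of a positive test
p_d_given_e = p_e_given_d * p_d / p_e
print(round(p_d_given_e, 3))  # → 0.323
```

Despite the test's 95% sensitivity, a positive result implies the disease with probability only about 32%: the disease is so rare that false positives from the healthy 99.5% of the population outnumber true positives roughly two to one.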
Bayes' Formula
Let F_1, F_2, ..., F_n be events of S, all mutually exclusive and collectively exhaustive.
Theorem of total probability (also rule of elimination): P(E) = Σ_{i=1}^{n} P(E | F_i) P(F_i)
Bayes' formula helps us determine which F_j happened, given that we observed E:
P(F_j | E) = P(E | F_j) P(F_j) / Σ_{i=1}^{n} P(E | F_i) P(F_i) 27
Gambling with professionals revisited
S = {A, B, C}. Initial probabilities, all equal: P(A) = P(B) = P(C) = 1/3.
Assume you pick A, and C is subsequently lifted and shown to be empty.
P(A) = 1/3, P(A^c) = 2/3. So the chances for "B or C" are 2/3, and after C is revealed empty you can get all of that by switching to B!
Conditional probabilities seem to tell a different story. Conditioning on "the ball is not under C":
P(A | C^c) = P(A C^c) / P(C^c) = P(A) / P(C^c) = (1/3) / (2/3) = 1/2
P(A^c | C^c) = P(A^c C^c) / P(C^c) = P(B) / P(C^c) = (1/3) / (2/3) = 1/2
What is right? 28
Gambling with professionals... Bayes' Theorem
Scenario: you pick cup A, the gambler opens cup C. Question: what is the success probability of switching, P(B | Gc)?
Events: A, B, C for "the ball is under A, B, or C"; Ga, Gb, Gc for "the gambler opens A, B, or C".
Probabilities: P(A) = P(B) = P(C) = 1/3.
P(Gc | A) = 1/2, so P(Gc | A) P(A) = (1/2)(1/3) = 1/6
P(Gc | B) = 1, so P(Gc | B) P(B) = 1 · (1/3) = 1/3
P(Gc | C) = 0, so P(Gc | C) P(C) = 0 · (1/3) = 0
Bayes' theorem applied: P(B | Gc) = P(Gc | B) P(B) / X, where X = P(Gc | A) P(A) + P(Gc | B) P(B) + P(Gc | C) P(C), such that
P(B | Gc) = (1 · 1/3) / (1/6 + 1/3 + 0) = (1/3) / (1/2) = 2/3 29
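The exact Bayes computation above can be reproduced with exact arithmetic. A sketch (Python is my choice; the event encoding mirrors the slide):

```python
from fractions import Fraction

# Exact Bayes computation for P(B | Gc): you picked A, the gambler
# opened C. P(Gc | ball position) encodes the gambler's behavior.
prior = Fraction(1, 3)
p_gc_given = {"A": Fraction(1, 2),  # gambler picks C or B at random
              "B": Fraction(1),     # C is his only empty option
              "C": Fraction(0)}     # he never reveals the ball

p_gc = sum(p_gc_given[x] * prior for x in "ABC")  # total probability = 1/2
p_b_given_gc = p_gc_given["B"] * prior / p_gc
print(p_b_given_gc)  # → 2/3
```

The key difference from the naive 1/2 answer on the previous slide: conditioning on the gambler's action Gc (which depends on where the ball is) rather than on the mere fact C^c that the ball is not under C.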
Summary
Sample space and events
Probabilities defined on events
Kolmogorov's Axioms
Conditional probabilities
Independent events
Bayes' Formula 30