Standard & Conditional Probability


Biostatistics 050: Standard & Conditional Probability

Probability as a Concept:

"The probability of an event is the likelihood of that event expressed either by the relative frequency observed from a large number of data or by knowledge of the system under study." Zar 2010 p. 60.

Statistics is typically based on a pair of quantities:

X    <- observed sample values
P(X) <- probability of the sampled values under some model of probability

In fact, associating these two quantities is not at all straightforward, and is often a point of controversy both as a theoretical and as a practical matter.

Probability by knowledge of a system:

Some systems are simple enough that concrete probabilities can be deduced directly by counting all possible outcomes. For instance, in flipping a "fair" coin there are only two possible outcomes, each with equal probability of occurring. In rolling a single die, there are only six possible outcomes, also with equal probabilities. Mendelian inheritance is another example, in which probabilities of specific genotypes can be calculated using well-known ratios such as 9:3:3:1 for two dominant/recessive alleles on different chromosomes.

Counting all possible outcomes is greatly facilitated by considering Permutations and Combinations.

Permutations:

"A permutation is an arrangement of objects in a specific sequence." Zar 2010 p. 51.

"The number of permutations of n things taken k at a time ... represents the number of ways of selecting k items out of n where the order of selection is important." Rosner 2006 Definition 4.8, p. 91.

nPk = n!/(n-k)!    < Zar Eq. 5.6: the number of permutations of n things taken k at a time

For example, let the n = 3 things be the letters A, B, C, taken k = 2 at a time. How many pairs of letters can we make where the order of letters is important?

AB  AC  BC
BA  CA  CB

nPk = 3!/(3-2)! = 6/1 = 6    < matches the listing above; fortunately the number n! is relatively small, six!

What happens if n = 20 and k = 7?

nPk = 20!/(20-7)! = 3.907 x 10^8

^ the number of permutations of 20 things taken 7 at a time. Here n! = 2.433 x 10^18, k! = 5.04 x 10^3, and (n-k)! = 6.227 x 10^9. Fortunately we have this formula, because listing all of the possibilities and counting them up would take a lot of time...
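The counts above are easy to verify numerically. A minimal Python sketch (the standard-library names `factorial`, `perm`, and `permutations` are Python's counterparts to the worksheet's MathCad expressions, not part of the original notes):

```python
from math import factorial, perm
from itertools import permutations

# Small example: ordered pairs from A, B, C (n = 3, k = 2)
letters = ["A", "B", "C"]
pairs = list(permutations(letters, 2))
print(len(pairs))   # 6 ordered pairs: AB, AC, BA, BC, CA, CB

# Larger example: n = 20, k = 7
n, k = 20, 7
nPk = factorial(n) // factorial(n - k)   # Zar Eq. 5.6
print(nPk)                               # 390700800, i.e. 3.907 x 10^8
print(perm(n, k) == nPk)                 # math.perm gives the same count
```

Listing all 3.907 x 10^8 sequences, as `itertools.permutations` would for the small case, is exactly the brute-force counting the formula lets us avoid.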

Degenerate Permutations:

Zar 2010 gives additional formulas for permutations in which not all sequences are considered unique. These formulas are not of general importance, although you may find them useful for particular problems.

Circular arrangements (rotations of the same sequence are not counted as distinct):

CnPk = n!/[k (n-k)!]

With n = 20 and k = 7 as above: CnPk = 3.907 x 10^8 / 7 = 5.581 x 10^7.

Indistinguishable objects: suppose N = 20 objects fall into three classes, with 15 indistinguishable objects in class 1, 3 indistinguishable in class 2, and 2 indistinguishable in class 3. Then:

DnPk = N!/(n1! n2! n3!) = 20!/(15! 3! 2!) = 2.433 x 10^18 / 1.569 x 10^13 = 1.55 x 10^5

^ Note that in all degenerate cases the denominator is larger than in the non-degenerate case. Thus the number of permutations will always be smaller.

Combinations:

"If groups ... are important to us, but not the sequence of objects within groups, then we are speaking of combinations." Zar 2010 p. 55.

"The number of combinations of n things taken k at a time ... represents the number of ways of selecting k objects out of n where the order of selection does not matter." Rosner 2006 Definition 4.11, p. 93.

Combinations are degenerate permutations in the same sense as above: membership is important but order is not.

nCk = n!/[k! (n-k)!]    < the number of combinations of n things taken k at a time

For the same n & k as the first example above (n = 3, k = 2):

nCk = 3!/(2! 1!) = 3    < note that this is half the number of permutations with n = 3, k = 2.

What about the larger example above (n = 20, k = 7)?

nCk = 20!/(7! 13!) = 7.752 x 10^4

^ Here nPk = 5040 x nCk, a somewhat larger difference between Permutations and Combinations!
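The degenerate counts can be checked the same way. A Python sketch of the circular, indistinguishable-object, and combination formulas (the class sizes 15, 3, 2 and n = 20, k = 7 follow the worked examples; variable names are mine):

```python
from math import factorial, comb, perm

n, k = 20, 7

# Circular arrangement: each sequence of k objects has k equivalent rotations
circular = perm(n, k) // k
print(circular)   # 55814400, i.e. 5.581 x 10^7

# Indistinguishable objects: N = 20 objects in classes of 15, 3, and 2
N, classes = 20, [15, 3, 2]
denom = 1
for c in classes:
    denom *= factorial(c)           # the enlarged denominator 15! * 3! * 2!
multinomial = factorial(N) // denom
print(multinomial)   # 155040, i.e. 1.55 x 10^5

# Combinations: order within the group does not matter
print(comb(n, k))               # 77520, i.e. 7.752 x 10^4
print(perm(n, k) // comb(n, k))  # 5040 = k!, the number of orderings per group
```

The last line makes the "degenerate permutation" view concrete: every combination of 7 objects corresponds to 7! = 5040 distinct permutations.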

Built-in Functions:

Most software packages contain built-in functions for Permutations and Combinations. With n = 20 and k = 7:

permut(n, k) = 3.907 x 10^8
combin(n, k) = 7.752 x 10^4

< these match our calculations above, and so serve as the MathCad prototype...

Prototype in R:

    # PERMUTATIONS AND COMBINATIONS
    n <- 20
    k <- 7

    # PERMUTATIONS
    # No built-in function found, so I'll calculate it directly
    Permutations <- factorial(n) / factorial(n - k)
    Permutations

    # COMBINATIONS - calculated directly
    Combinations <- factorial(n) / (factorial(k) * factorial(n - k))
    Combinations

    # BUILT-IN FUNCTION FOR COMBINATIONS
    choose(n, k)

Important symmetry in the calculation of Combinations:

combin(n, k)     = 7.752 x 10^4        permut(n, k)     = 3.907 x 10^8
combin(n, n - k) = 7.752 x 10^4        permut(n, n - k) = 4.827 x 10^14

< k or (n-k) give the same result for combinations but NOT for permutations.

Probability by reference to large samples:

There are two important perspectives:
- Frequentist (or Standard) Statistical Methods - mostly what we will do in this course.
- Bayesian Inference - increasingly prominent in several biological & biomedical fields.

Frequentist Method:

"The probability of an event is the relative frequency of a set of outcomes over an indefinitely (or infinitely) large number of trials." Rosner 2006 p. 44, Definition 3.1.

Sometimes, for theoretical reasons, aspects of the probability distributions are known or are assumed. More commonly in practice, however, one takes a reasonably large empirical sample and compares it with known theoretical distributions, such as the Normal Distribution.
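For comparison with the R prototype, Python's standard library has built-in counterparts for both counts (`math.perm` and `math.comb`); a minimal sketch, including a check of the k versus n-k symmetry noted above:

```python
from math import perm, comb

n, k = 20, 7
print(perm(n, k))   # 390700800, the permutation count
print(comb(n, k))   # 77520, the combination count

# Symmetry: k and n-k give the same combination count...
print(comb(n, n - k) == comb(n, k))   # True
# ...but NOT the same permutation count: perm(20, 13) = 4.827 x 10^14
print(perm(n, n - k) == perm(n, k))   # False
```

The symmetry follows directly from the formula: swapping k and n-k leaves k!(n-k)! unchanged in the combination denominator, but changes (n-k)! in the permutation denominator.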

x <- rnorm(1000, 1)    < for example, 1000 data points drawn by this function give the following histogram, plotted with 30 bins...

[histogram of the 1000 sampled values, 30 bins]

^ From this limited sample, one might conclude that the population from which it was drawn has a Normal distribution...

x <- rnorm(50000, 1)   < But what if we draw a bigger sample, say 50,000 values, and plot it with 100 bins instead of 30?

[histogram of the 50,000 sampled values, 100 bins]

Conclusion: a bigger sample is usually better for guessing an underlying probability distribution... But other factors usually come into play, including the cost/time of conducting the study, and biases of one sort or another.
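The same comparison can be sketched without plots by looking at summary statistics of a small versus a large Normal sample. A Python illustration using only the standard library (`random.gauss` stands in for the worksheet's rnorm; the seed is arbitrary):

```python
import random
from statistics import mean, stdev

random.seed(1)   # arbitrary seed, for a reproducible sketch

# A small and a large sample from a standard Normal distribution
small = [random.gauss(0, 1) for _ in range(1000)]
large = [random.gauss(0, 1) for _ in range(50000)]

# With more data, the sample mean and SD settle toward the population values 0 and 1
print(round(mean(small), 3), round(stdev(small), 3))
print(round(mean(large), 3), round(stdev(large), 3))
```

The larger sample's statistics sit much closer to the true parameters, which is the frequentist point: probability is the limit of relative frequency as the number of trials grows.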

Bayesian Inference:

Here two kinds of probability are distinguished:

"The prior probability of an event is the best guess by the observer of an event's probability in the absence of data. This prior probability may be a single number, or it may be a range of likely values for the probability, perhaps with weights attached to each possible value." Rosner 2006 p. 63, Definition 3.16.

"The posterior probability of an event is the probability of an event after collecting some empirical data. It is obtained by integrating information from the prior probability with additional data related to the event in question." Rosner 2006 p. 64, Definition 3.17.

We'll look at aspects of Bayesian Inference shortly...

The general logic of probability:

Under either of the above views, probability (both as a concept and as a property) obeys fundamental logical (or mathematical) rules. These rules are very important to all aspects of statistical inference and to direct prediction of outcomes.

Terminology:

sample space = the set of all possible outcomes
an event = any specific outcome
P(X) = probability of event X, where 0 <= P(X) <= 1
complement of X = P(~X) = 1 - P(X). The complement is the probability of X not happening.

Mutually exclusive events:

Two or more events (A, B, C, ...) are mutually exclusive if they cannot happen simultaneously.

Intersection of events is the empty set:

P(A ∩ B) = 0                          < for two events (the smallest number where intersection has a meaning)
P(A1 ∩ A2 ∩ ... ∩ Ai) = 0             < for two or more events

Union of events - the Law of Addition of probabilities applies:

P(A ∪ B) = P(A) + P(B)    < the probability of either A or B happening is their separate probabilities added together...
P(A1 ∪ A2 ∪ A3 ∪ ... ∪ Ai) = P(A1) + P(A2) + P(A3) + ... + P(Ai)

^ for multiple exclusive events, add them all.

Examples:

Flipping a coin: H = event 'heads', T = event 'tails'. For a 'fair' coin, P(H) = 0.5 & P(T) = 0.5.
Therefore: P(H ∩ T) = 0 & P(H ∪ T) = P(H) + P(T) = 0.5 + 0.5 = 1.0

Rolling a die: here there are 6 mutually exclusive events: 1, 2, 3, 4, 5, 6. For a 'fair' die, P(i) = 1/6 for all 6 events.
Therefore: P(2 ∩ 6) = 0 & P(2 ∪ 6) = P(2) + P(6) = 1/6 + 1/6 = 1/3
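The die example can be checked by exact enumeration rather than simulation; a Python sketch using exact fractions (the dictionary layout is mine):

```python
from fractions import Fraction

# A fair die: each of the 6 mutually exclusive faces has probability 1/6
p = {face: Fraction(1, 6) for face in range(1, 7)}

# Law of Addition for exclusive events: P(2 or 6) = P(2) + P(6)
p_2_or_6 = p[2] + p[6]
print(p_2_or_6)          # 1/3

# Adding all six exclusive outcomes covers the whole sample space
print(sum(p.values()))   # 1
```

Using `Fraction` rather than floats keeps the arithmetic exact, so 1/6 + 1/6 really is 1/3 and the six outcomes sum to exactly 1.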

Potentially co-occurring events:

Two or more events (A, B, C, ...) may occur simultaneously. There are two kinds:

1. Independent events: the probabilities of events P(A), P(B), P(C), etc. have no bearing on each other.

Intersection of events - the Law of Multiplied Probabilities applies:

P(A ∩ B) = P(A) P(B)    < multiply the separate probabilities to find the probability of both events occurring simultaneously.
P(A1 ∩ A2 ∩ A3 ∩ ... ∩ Ai) = P(A1) P(A2) P(A3) ... P(Ai)

^ multiply all of them for multiple events.

Union of events - the Expanded Law of Addition applies:

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)    < the probability of A or B happening is the separate probabilities added together, minus the probability that both A & B occur together.

Alternate equivalent forms, for Independent events only:

P(A ∪ B) = P(A) + P(B) (1 - P(A))
P(A ∪ B) = P(B) + P(A) (1 - P(B))

Note: the complement here, (1 - P(A)) or (1 - P(B)), is P(~A) or P(~B). The probability of A or B happening is the probability of one, plus the probability of the other occurring without the first! Check the Venn diagrams in the text to puzzle this out! Note that the Union for more than two independent events is not given here...

Example: In a sample of 50 birds, by chance 20 are male and 30 are female. Of these, 7 males and 13 females test positive for a potentially dangerous virus. If a bird is chosen at random from this sample, what is the probability of obtaining a female or a bird testing positive for the virus, counting birds that are both only once?

P(female or pos) = P(female) + P(pos) - P(female and pos) = 30/50 + 20/50 - 13/50 = 37/50

[Venn diagram: female circle 30, positive circle 20, overlap 13]
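The bird example reduces to direct counting; a minimal Python check of the Expanded Law of Addition (the counts are from the example above, the variable names are mine):

```python
from fractions import Fraction

total = 50
female = 30                 # 20 of the 50 birds are male
positive = 7 + 13           # 7 positive males + 13 positive females = 20 positives
female_and_positive = 13    # the overlap: positive females

# Expanded Law of Addition: subtract the overlap so it is counted only once
p_union = (Fraction(female, total) + Fraction(positive, total)
           - Fraction(female_and_positive, total))
print(p_union)   # 37/50
```

Without the subtraction, the 13 positive females would be counted twice, once in each circle of the Venn diagram.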

2. Dependent events: the probabilities of two events are related, such that knowing the outcome of one event influences the probability of the other.

iris <- READPRN("c:/data/biostatistics/iris.txt")    < reading the famous iris data again...
SL <- iris[1]    SW <- iris[2]    PL <- iris[3]    PW <- iris[4]

[scatterplot of Petal Length (PL) versus Sepal Length (SL)]

< A plot of Sepal Length (SL) and Petal Length (PL) shows dependence. Measuring one variable gives important information about the more probable values of the other.

Intersection of events - the Law of Multiplied Probabilities fails:

P(A ∩ B) ≠ P(A) P(B)    < this is a more formal statement of what dependence actually means.

In practical terms, one often assesses the separate probabilities for A and B, and then compares their product with a separately estimated probability of both events occurring simultaneously, to see if they match.

Conditional Probability:

To proceed at this point, one needs a concept of conditional probability...

P(A ∩ B) = P(B|A) P(A)    < intersection in terms of conditional probability.
P(B ∩ A) = P(A|B) P(B)    < note that you can switch the roles of A & B...

...depending on which is the prior probability P(X), known prior to collecting data for the study, versus the conditional probability P(X|Y), where knowledge of Y influences the probability of X. *See below for more than two events!

The concept of conditional probability can be applied to both the independent and dependent cases of potentially simultaneous events above, so I'll give both here...

Definition of Conditional Probability:

Rearranging the Law of Multiplied Probabilities for two dependent events to solve for the conditional probability gives its definition:

P(B|A) = P(A ∩ B) / P(A)

^ This is the conditional probability of B given prior knowledge of A...
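Reusing the bird counts from the previous page, the definition of conditional probability can be checked numerically; a minimal Python sketch:

```python
from fractions import Fraction

# From the bird example: 50 birds, 30 female, 13 of the females test positive
p_female = Fraction(30, 50)
p_female_and_pos = Fraction(13, 50)

# Definition: P(pos | female) = P(female and pos) / P(female)
p_pos_given_female = p_female_and_pos / p_female
print(p_pos_given_female)   # 13/30

# Consistency check: multiplying back recovers the intersection
print(p_pos_given_female * p_female == p_female_and_pos)   # True
```

Conditioning on "female" shrinks the sample space from 50 birds to 30, which is why the answer is 13/30 rather than 13/50.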

Calculating Intersection with Conditional Probability:

1. Independent case:

P(B|A) = P(B) & P(A|B) = P(A)    < these equalities make the Law of Multiplied Probabilities a special case of the more general law below...
P(A ∩ B) = P(A) P(B)

2. More important, Dependent case:

P(A ∩ B) = P(B|A) P(A)                                      < for two events...
P(A1 ∩ A2 ∩ A3) = P(A1) P(A2|A1) P(A3|A2 ∩ A1)              < for three events...
P(A1 ∩ A2 ∩ ... ∩ Ai) = P(A1) P(A2|A1) P(A3|A2 ∩ A1) ... P(Ai|A(i-1) ∩ ... ∩ A1)

^ for more than three events...

Calculating Total Probability from Conditional Probability:

This formulation is common to both the Independent and Dependent cases:

P(A) = P(A|B) P(B) + P(A|~B) P(~B)    < for the two possibilities B & ~B... note that the roles of A & B are interchangeable:
P(B) = P(B|A) P(A) + P(B|~A) P(~A)

P(A) = Σi P(A|Bi) P(Bi)    < for A given multiple prior probabilities Bi

1. Independent case: the formula simply reduces to P(A), since each P(A|Bi) = P(A) and the P(Bi) sum to one.
2. Dependent case: the formula doesn't reduce, and the conditional probabilities must be used as stated above.

Bayes' Rule:

The point of this procedure for two events A & B is to estimate one conditional probability P(B|A) from the other conditional probability P(A|B) and one total probability P(B):

P(B|A) = P(A|B) P(B) / [P(A|B) P(B) + P(A|~B) P(~B)]    < of course, as above, the defined roles of A & B here can be reversed.

P(Bi|A) = P(A|Bi) P(Bi) / Σj P(A|Bj) P(Bj)

^ General form of Bayes' Rule, giving the conditional probabilities for the Bi's given knowledge of the conditional probabilities P(A|Bi) and the total probabilities P(Bi).
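Bayes' Rule can be sketched with a hypothetical diagnostic-test example. The prevalence, sensitivity, and false-positive rate below are made-up illustration values, not figures from the notes:

```python
from fractions import Fraction

# Hypothetical values: 1% disease prevalence, 90% sensitivity, 5% false-positive rate
p_disease = Fraction(1, 100)              # P(B), the prior / total probability
p_pos_given_disease = Fraction(90, 100)   # P(A|B)
p_pos_given_healthy = Fraction(5, 100)    # P(A|~B)

# Total probability of a positive test (the denominator of Bayes' Rule)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' Rule: P(disease | positive) = P(positive | disease) P(disease) / P(positive)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)          # 2/13
print(float(p_disease_given_pos))   # about 0.154
```

Even a fairly accurate test yields only about a 15% posterior probability of disease here, because the low prior (1%) means most positives come from the large healthy group; this is exactly the estimation of P(B|A) from P(A|B) and P(B) that the rule describes.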