Unit 3. Discrete Distributions

Size: px
Start display at page:

Download "Unit 3. Discrete Distributions"

Transcription

1 PubHlth Discrete Distributions Page 1 of 39 Unit 3. Discrete Distributions Topic 1. Proportions and Rates in Epidemiological Research Review - Bernoulli Distribution. 3. Review - Binomial Distribution Poisson Distribution Hypergeometric Distribution.. 6. Fisher s Eact Test.. 7. Discrete Distribution Themes Appendices A. Mean and Variance of Bernoulli(π).. B. The Binomial(N,π) is the Sum of N Independent Bernoulli(π)... C. The Poisson is an Etension of a Binomial

2 PubHlth Discrete Distributions Page 2 of Proportions and Rates in Epidemiological Research The concepts of proportion of and rate of describe different aspects of disease occurrence. This distinction is important to epidemiological research. A proportion is a relative frequency. Proportion = It is dimensionless. # events that actually occurred # events that could have occurred Valid range: 0 to 1 E.g. Toss a coin 10 times. If we observe 2 heads : Proportion heads = 2/10 = 20% # Events that could have occurred = 10 tosses # Events occurred = 2 heads Prevalence measures are eamples of proportions. A rate is a count of event occurrence per unit of time. As such, it is measured relative to an interval of time. It is not dimensionless. Rate = # events that actually occurred # time periods eperienced Valid range: 0 to E.g., persons are known to have smoked, collectively, for 1,000 pack years. If we observe 3 occurrences of lung cancer: Rate lung cancer = 3/1000 pack years # Time periods eperienced = 1000 pack years # Events occurred = 3 Incidence densities are eamples of rates.

3 PubHlth Discrete Distributions Page 3 of 39 Some Commonly Used Proportions and Rates Proportions describe either eisting disease or new disease within a time frame. flow of occurrence of new disease with time. Rates describe the force or Prevalence = Prevalence # persons with disease at a point in time # persons in the population at a point in time Type of Measure: Proportion Denominator = # events that could have occurred Numerator = # events that actually occurred # persons in the population at a point in time # persons having the disease at a point in time Valid range: 0 to 1 Intepretation Other names The proportion of the population with disease at a point in time Point prevalence Eample: In 1988, the New York State Breast and Cervical Cancer Screening Program registry included 16,529 women with a baseline mammogram. Upon review of their medical records, 528 were found to have a history of breast cancer. The prevalence of breast cancer in this 1988 selected cohort was 528 P = = = 3.19% 16,529 These 528 women were ecluded from the analyses to determine the factors associated with repeat screening mammogram.

4 PubHlth Discrete Distributions Page 4 of 39 Cumulative incidence = # disease onsets during interval of time # persons at risk in population at start of interval Cumulative Incidence Denominator = # events that could have happened Numerator = # events that actually occurred Type of Measure: Proportion # persons in the population at the start of the time interval who could possibly develop disease. Thus, all are disease free at the start of the interval. # persons developing the disease during the time interval. Valid range: 0 to 1 Interpretation Note!! The proportion of healthy persons who go on to develop disease over a specified time period. We assume that every person in the population at the start was followed for the entire interval of time. Eample: In this eample, the event of interest is the completion of a repeat screening mammogram. There were 9,485 women in the New York State Breast and Cervical Cancer Screening Program registry as of 1988 with a negative mammogram, who did not have a history of breast cancer, and provided complete data. 2,604 obtained a repeat screening mammogram during the 6 year period The 6-year cumulative incidence of repeat screening mammogram is therefore 2,604 CI 6year = = = 27% 9,485 Further analysis is focused on the identification of its correlates.

5 PubHlth Discrete Distributions Page 5 of 39 Incidence Density = Incidence Density # disease onsets during interval of time Sum of individual lengths of time actually at risk Type of Measure: Rate Denominator = # time periods eperienced event free. Sum of individual lengths of time during which there was opportunity for event occurrence during a specified interval of time. a This sum is called person time and is N i= 1 (time for i th person free of event) It is also called person years or risk time. Numerator = # new event occurrences Valid range: Intepretation Other names Note!! # disease onsets during a specified interval of time. 0 to The force of disease occurrence per unit of time Incidence rate We must assume that the risk of event remains constant over time. When it doesn t, stratified approaches are required. Be careful in the reporting of an incidence density. Don t forget the scale of measurement of time. I = 7 per person year = 7/52 = 0.13 per person week = 7/ = per person day

6 PubHlth Discrete Distributions Page 6 of 39 A prevalence estimate is useful when interest is in What to Use? who has disease now versus who does not (one time camera picture) planning services; e.g. - delivery of health care The concept of prevalence is not meaningfully applicable to etiologic studies. Susceptibililty and duration of disease contribute to prevalence Thus, prevalence = function(susceptibililty, incidence, survival) Etiologic studies of disease occurrence often use the cumulative incidence measure of frequency. Recall that we must assume complete follow-up of entire study cohort. The cumulative incidence measure of disease frequency is not helpful to us if persons migrate in and out of the study population. Individuals no longer have the same opportunity for event recognition. Etiologic studies of disease occurrence in dynamic populations will then use the incidence density measure of frequency. Be careful here, too! Does risk of event change with time? With age? If so, calculate person time separately in each of several blocks of time. This is a stratified analysis approach.

7 PubHlth Discrete Distributions Page 7 of Review - Bernoulli Distribution In order to analyze variations in proportions and rates, we ll need a few probability models. There are 4 and they are interrelated. Two of them were introduced in PubHlth 540, Introductory Biostatisics: Bernoulli; and Binomial. Two additional distributions that are often used in analyses of discrete data are the following: Poisson; and hypergeometric. Recall - A Bernoulli Trial is the Simplest Binomial Random Variable. A simple eample is the coin toss chance of heads can be re-cast as a random variable. Let Z = random variable representing outcome of one toss, with Z = 1 if heads 0 if tails π= Probability [ coin lands heads }. Thus, π = Probability [ Z = 1 ]

8 PubHlth Discrete Distributions Page 8 of 39 We have what we need to define a probability distribution. Enumeration of all possible outcomes - outcomes are mutually eclusive - outcomes ehaust all possibilities 1 0 Associated probabilities of each - each probability is between 0 and 1 - sum of probabilities totals 1 Outcome Pr[outcome] 1 Pr[Z=1] = π 0 Pr[Z=0] = (1-π) Total = 1 In epidemiology, the Bernoulli might be a model for the description of ONE individual (N=1): This person is in one of two states. He or she is either in a state of: 1) event with probability π; or 2) non event with probability (1-π) The description of the likelihood of being either in the event state or the non-event state is given by the Bernoulli distribution

9 PubHlth Discrete Distributions Page 9 of 39 Bernoulli Distribution Suppose Z can take on only two values, 1 or 0, and suppose: Probability [ Z = 1 ] = π Probability [ Z = 0 ] = (1-π) This gives us the following epression for the likelihood of Z=z. Tip - Recall from PubHlth 540, Unit 5, that the likelihood function is called a probability density function and is written with the notation f Z (z). When the random variable is discrete (as is the case for the Bernoulli), we can write the following: f Z (z) = Probability [ Z = z ] = π z (1-π) 1-z for z = 0 or 1. Epected value of Z is E[Z] = μ = π Variance of Z is Var[Z] = σ 2 = π (1-π) Eample: Z is the result of tossing a coin once. If it lands heads with probability =.5, then π =.5. Later we ll see that individual Bernoulli distributions are the basis of describing patterns of disease occurrence in a logistic regression analysis.

10 PubHlth Discrete Distributions Page 10 of Review - Binomial Distribution Recall from PubHlth 540 (Unit 4) that the Binomial distribution model is the appropriate model to use for the outcome that is the number of successes in a set of N independent success/failure Bernoulli trials. E.g. - What is the probability that 2 of 6 graduate students are female? - What is the probability that of 100 infected persons, 4 will die within a year? Binomial Distribution Among N trials, where The N trials are independent Each trial has two possible outcomes: 1 or 0 Pr [ outcome =1] = π for each trial; and therefore Pr [ outcome = 0] = (1-π) X = # events of outcome The probability density function f X () is now: f X () = Probability [ X = ] = F HG N I π KJ (1-π) N- for =0,, N where F HG Epected value is E[X] = μ = N π Variance is Var[X] = σ 2 = N π (1-π) I N = # ways to choose X from N = KJ N!! (N-)! and where N! = N(N-1)(N-2)(N-3) (4)(3)(2)(1) and is called the factorial.

11 PubHlth Discrete Distributions Page 11 of 39 Your Turn A roulette wheel lands on each of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 with probability =.10. Write down the epression for the calculation of the following. #1. The probability of 5 or 6 eactly 3 times in 20 spins. #2. The probability of digit greater than 6 at most 3 times in 20 spins.

12 PubHlth Discrete Distributions Page 12 of 39 #1. Solution: Note The k is your. The p is your π Event is outcome of either 5 or 6 Pr[event] = π =.20 N = 20 X is distributed Binomial(N=20, π=.20) Pr[X=3]= [.20] [ 1.20] 3 20 [.20] 3 [.80] 17 = 3 =.2054 #2. Solution: Event is outcome of either 7 or 8 or 9 Pr[event] = π =.30 N = 20 X is distributed Binomial(N=20, π=.30) Note The k is your. The p is your π Translation: At most 3 times is the same as less than or equal to 3 times Pr[X 3] = Pr[X=0] + Pr[X=1] + Pr[X=2] + Pr[X=3] = [.30] [.70] = = [.30] [.70] + [.30] [.70] = [ ] [ ] [ ] [ ] 17

13 PubHlth Discrete Distributions Page 13 of 39 Review of the Normal Approimation for the Calculation of Binomial Probabilities Calculations of eact binomial probabilities become quite tedious as the number of trials and number of events gets large. Fortunately, the central limit theorem allows us to replace these eact calculations with very good approimate calculations. The approimate calculations are actually normal probability calculations. The following is an eample where the calculation of the required binomial probability would be too much to do! Eample Calculate the chances of between 5 and 28 events (inclusive) in 180 trials with probability of event =.041. Idea of Solution Translate the required eact calculation into a very good approimate one using the z-score. X distributed Binomial (N=180, π=.041) says that μ BINOMIAL = nπ = (180)(.041) = 7.38 σ = n π (1-π) = (180)(.041)(.959) = 7.08 σ 2 BINOMIAL BINOMIAL = σ = BINOMIAL The approimate calculation using the z-score uses μ = μ BINOMIAL and σ = σbinomial 5-μ 28-μ Pr[5 X 28] Pr[ Normal(0,1) ] σ σ = Pr[ Normal(0,1) ] = Pr [ Normal(0,1) ] Pr [ Normal(0,1) ], because is in the etreme right tail. = Pr [ Normal (0,1) ], because of symmetry of the tails of the normal =.8146

14 PubHlth Discrete Distributions Page 14 of 39 Review of the Normal Approimation for the Calculation of Binomial Probabilities continued. More generally, we can use a z-score and the normal distribution for the following reasons. Binomial probabilities are likelihood calculations for a discrete random variable. Normal distribution probabilities are likelihood calculations for a continuous random variable. When substituting for the eact probabilities, we use the Normal distribution that has mean and variance parameter values equal to those of our Binomial distribution. μ = μ = nπ normal binomial σ = σ = n π (1-π) 2 2 normal binomial Desired Binomial Probability Calculation Normal Approimation w Correction Pr [ X=k] Pr [ (k-1/2) < X < (k+1/2) ] Pr [ X > k ] Pr [ X > (k-1/2) ] Pr [ X < k ] Pr [ X < (k+1/2) ]

15 PubHlth Discrete Distributions Page 15 of Poisson Distribution So far we have considered The description of 1 event occurrence for 1 person (Bernoulli) The description of X event occurrences in sample of N persons (Binomial) In stead of thinking in terms of persons, think instead about PERSON TIME: A familiar eample of the idea of person time is pack years smoking E.g. How shall we describe a small number of cancer deaths relative to a large accumulation of person time, such as 3 cancer deaths in 1000 pack years of smoking? Setting. It is no longer the analysis of events among N persons as a proportion. Instead, it is an analysis of events over person time as an incidence rate. T he Poisson Distribution can be appreciated as an etension of the Binomial Distribution. See Appendi C for details. The concept of N persons A large accumulation of person time The likelihood of an event eperienced by 1 person the likelihood of an event in 1 unit of person time. This will be quite small!

16 PubHlth Discrete Distributions Page 16 of 39 If X is distributed Poisson with mean μ, Poisson Distribution f X () = Probability [ X = ] = [ ] μ ep -μ! for = 0, 1, Epected value of X is E[X] = μ Variance of X is Var[X] =σ 2 = μ That s right mean and variance are the same. Where the mean μ is the parameter representing the product of the (incidence rate) (period of observation) The poisson distribution is an appropriate model for describing the frequency of occurrence of a rare event in a very large number of trials. Eample: Suppose lung cancer occurs at a rate of 2 per 1000 pack years. Using a Poisson distribution model, calculate the probability of eactly 3 cases of lung cancer in 3000 pack years Solution: Step 1 Solve for Poisson parameter μ μ = (incidence rate) (period of observation) = ( 2 per 1000 pack years) (3000 pack years) = (.002)*(3000) = 6 Step 2 Identify desired calculation Calculate the probability of eactly 3 cases of lung cancer is translated as follows For X distributed Poisson(μ=6), Probability [X=3] =???

17 PubHlth Discrete Distributions Page 17 of 39 Step 3 Solve, by hand or with use of calculator on the web. Probability [X=3] = 3 μ ep(-μ) 6 ep(-6) = =.0892! 3!

18 PubHlth Discrete Distributions Page 18 of 39 Eample continued. This same eample can be used to illustrate the correspondence between the Binomial and Poisson likelihoods. If lung cancer occurs at a rate of 2 per 1000 pack years, calculate the probability of eactly 3 cases of lung cancer in 3000 pack years using (a) a Binomial model, and (b) a Poisson model. (a) Binomial n = # trials π = Pr [ event ] (b) Poisson What happens as n and π 0? Epected # events nπ μ Pr [ X= events ] n π ( 1-π ) n- μ ep(-μ)! Where ep = numerical constant = e = Solution using the Binomial: Solution using the Poisson: # trials = n = 3000 Length of interval, T = 3000 π =.002 Rate per unit length, λ =.002 per 1 pack year μ = λt = (.002/pack year)(3000 pack years)=6 Pr[3 cases cancer] = F I Pr[3 cases cancer] ] [. 998] = HG 3 KJ[. 6 3 ep[ 6 ] 3! =.0891 =.0892

19 PubHlth Discrete Distributions Page 19 of 39 An eample of the Poisson sampling design is a cross-sectional study Disease Healthy Eposed a b a+b Not eposed c d c+d a+c b+d a+b+c+d a = # persons who are both eposed and with disease b = # persons who are both eposed and healthy c = # persons without eposure with disease d = # persons without eposure and healthy The counts a, b, c, and d are each free to vary and follow their own distribution. a, b, c, and d are 4 independent Poisson random variables. Let s denote the means of these μ 11, μ 12, μ 21, and μ 22. Likelihood[22 table] = L[a,b,c,d] = L NM a μ ep μ a! b b go Lμ ep μ QP NM b! c b go Lμ ep μ QP NM c! d b go Lμ epb μ QP NM d! go QP An association between eposure and disease means the following. 1) Pr[disease given eposure] Pr [disease given no eposure] μ11 μ 21 μ + μ μ + μ ) Pr[eposure given disease] Pr [eposure given no disease] μ11 μ12 μ + μ μ + μ

20 PubHlth Discrete Distributions Page 20 of 39 An eample of the Binomial sampling design is a cohort study Disease Healthy Eposed a b a+b fied by design Not eposed c d c+d fied by design a+c b+d a+b+c+d a = # eposed persons develop disease b = # eposed persons who do not develop disease c = # uneposed persons who develop disease d = # uneposed persons who do not develop disease The counts a and c are each free to vary and follow their own distribution. The counts b and d do not vary because of the fied row totals a and c are 2 independent Binomial random variables. Let s denote the means of these π 1 and π 2. Likelihood[22 table] = L[a,c given (a+b) and (c+d) are fied] = LF N MHG I KJ O Q P LF N MHG a+ b c+ d a b c d π ( 1 π ) π ( 1 π ) a c I KJ O Q P An association between eposure and disease means the following. Pr[disease among eposed] Pr [disease among non-eposed] π π 1 2

21 PubHlth Discrete Distributions Page 21 of 39 An eample of the Binomial sampling design is a case-control study Disease Healthy Eposed a B a+b Not eposed c D c+d a+c fied b+d fied a+b+c+d a = # persons with disease whose recall reveals eposure c = # persons with disease whose recall reveals no eposure b = # healthy persons whose recall reveals eposure d = # healthy persons whose recall reveals no eposure The counts a and b are each free to vary and follow their own distribution. The counts c and d do not vary because of the fied column totals a and b are 2 independent Binomial random variables. Let s denote the means of these θ 1 and θ 2. Likelihood[22 table] = L[a,b give (a+c) and (b+d) are fied] a + c b d θ (1 θ ) + θ (1 θ ) d a b a c b = An association between eposure and disease means the following. Pr[eposure among disease] Pr [eposure among healthy] θ θ 1 2

22 PubHlth Discrete Distributions Page 22 of Hypergeometric Distribution Sometimes, a 22 table analysis of an eposure-disease relationship has as its only focus the investigation of association. This will become more clear in units 4 and 5. In this setting, it turns out that the most appropriate probability model is the hypergeometric distribution model. The Hypergeometric Distribution is introduced in two ways: 1 - The Hypergeometric Distribution in Games of Poker 2 - The Hypergeometric Distribution in Analyses of 22 Tables 1 - The Hypergeometric Distribution in Games of Poker. Eample - In a five card hand, what is the probability of getting 2 queens? 2 queens is, in reality, 2 queens and 3 NON queens. 52 We know the total number of possible hands is 5 Because there are 4 queens in a full deck, the # selections of 2 queens from 4 is 4 2 Similarly, there are (52 4) = 48 non queens in a full deck. 48 Thus, the # selections of 3 non queens from 48 is 3 Putting these together, the number of hands that have 2 queens and 3 non queens is the product Pr[2 queens in five card hand] = (Number of five card hands that have 2 queens) (Total number of five card hands) = 52 5

23 PubHlth Discrete Distributions Page 23 of The Hypergeometric Distribution in Analyses of a 22 Table. Eample A biotech company has N=259 pregnant women in its employ. 23 of them work with video display terminals. Of the 259 pregnancies, 4 ended in spontaneous abortion. Assuming a central hypergeometric model, what is the probability that 2 of the 4 spontaneous abortions were among the 23 women who worked with video display terminals? Spontaneous Abortion Healthy Video Display Terminal Not In this eample, the number of women who worked with video display terminals (23) is fied. That is, the row totals are fied. Similarly, the total number of spontaneous abortions (4) is also fied. That is, the column totals are fied. And the total number of pregnancies (259) is fied. Thus, only one cell count can vary. Having chosen one, all others are known by subtraction. In this eample, we want to know if spontaneous abortion has occurred disproportionately often among the video display terminal workers? Is it reasonable that 2 of the 4 cases of spontaneous abortion occurred among the relatively small number (23) of women who worked with video display terminals? The chance model is the central hypergeometric model just introduced. Using the approach just learned is the following.

24 PubHlth Discrete Distributions Page 24 of 39 Eample At this biotech company, what is the probability that 2 of the 4 abortions occurred among the 23 who worked with video display terminals? 2 abortions among VDT workers is in, reality, 2 abortions among VDT workers and 2 abortions among NON-VDT workers. We know the total number of possible choices of 4 abortions from 259 pregnancies is As there are 23 VDT workers, the # selections of 2 from this group is 23 2 Similarly, as there are (259 23) = 236 non-vdt workers, the number of selections of from this group is 2 Thus, the probability of 2 of the 4 abortions occurring among VDT workers by chance is the hypergeometric probability = 259 4

25 PubHlth Discrete Distributions Page 25 of A Hypergeometric Probability Model for a 22 Table of Eposure-Disease Counts Now we consider the general setting of a 22 table analysis of eposure-disease count data. Case Control Eposed a b (a+b) fied Not eposed c d (c+d) fied (a+c) fied (b+d) fied n = a + b + c + d Here, the only interesting cell count is a = # persons with both eposure and disease Eample In this 22 table with fied total eposed, and fied total number of events, what is the probability of getting a events of (eposed and event)? a events of case among (a+b) eposed is in, reality, a events among (a+b) eposed AND c events among (c+d) NOT eposed. We know the total number of possible choices of (a+c) cases from (a+b+c+d) total is a+b+c+d a+c As there are (a+b) eposed, the # selections of a from this group is a+b a Similarly, as there are (c+d) NOT eposed, the number of selections of c from this c+d group is c Thus, the probability of a events of CASE occurring among (a+c) EXPOSED by chance is the hypergeometric probability a+ b c+ d a c = a+ b+ c+ d a+ c Note This is actually the central hypergeometric. More on this later.

26 PubHlth Discrete Distributions Page 26 of 39 Nice Result The central hypergeometric probability calculation for the 22 table is the same regardless of arrangement of rows and columns. Cohort Design Pr["a" cases among "(a+b)" eposed fied marginals] a+b c+d a c = a+b+c+d a+c Case-Control Design Pr["a" eposed among "(a+c)" cases fied marginals] a+c b+d a b = a+b+c+d a+b

27 PubHlth Discrete Distributions Page 27 of Fisher s Eact Test of Association in a 22 Table Now we can put this all together. We have a null hypothesis chance model for the probability of obtaining whatever count a we get in the upper left cell of the 22 table. It is the hypergeometric probability distribution model. This chance model (the central hypergeometric distribution) is the null hypothesis probability model that is used in the Fisher s eact test of no association in a 22 table. The odds ratio OR is a single parameter which describes the eposure disease association. Consider again the VDT eposure and occurrence of spontaneous abortion data: Disease Healthy Eposed Not eposed We re not interested in the row totals. Nor are we interested in the column totals. Thus, neither the Poisson nor the Bionomial likelihoods are appropriate models for our particular question. Rather, our interest is in the number of persons who have both traits (eposure=yes) and (disease=yes). Is the count of 2 with eposure and disease significantly larger than what might have been epected if there were NO association between eposure and disease? We re interested in this likelihood because it is the association, the odds ratio OR,that is of interest, not the row totals, nor the column totals. Thus, we will take advantage of the central hypergeometric model.

28 PubHlth Discrete Distributions Page 28 of 39 What is the null hypothesis likelihood of the 22 table if the row and column totals are fied? Recall the layout of our 22 table. With row and column totals fied, only one cell count can vary. Let this be the count a. Disease Healthy Eposed a b a+b Not eposed c d c+d a+c b+d a+b+c+d Null Hypothesis Conditional Probability of 22 Table (Conditional means: Row totals fied and column totals fied) When the OR =1, we have the central hypergeometric model just introduced. Probability [ a given row and column totals given OR=1 ] = F HG F HG IF KJ HG a+ c b+ d a b a+ b+ c+ d a+ b (optional for the interested reader): When the OR 1, the correct probability model for the count a is a different hypergeometric distribution; this latter is called a non-central hypergeometric distribution. Interestingly, the probability calculation involves the magnitude of the OR which is now a number different from 1. I KJ I KJ Probability [ a given row and column totals when OR 1] is the non-central hypergeometric probability = F HG min( a+ b, a+ c) u= ma( 0, a d) a+ c a a+ F HG u IFb KJ HG b cif KJ HG + d I KJ OR I KJ b+ d a+ b u OR Note You will not need to work with the non-central hypergeometric distribution in this course. a u

29 PubHlth Discrete Distributions Page 29 of 39 How to compute the p-value for this hypothesis test: The solution for the p-value will be the sum of the probabilities of each possible value of the count a that are as etreme or more etreme relative to a null hypothesis odds ratio OR = 1. Step 1: If the row and column totals are fied, what values of the count a are possible? Answer: Because the column total is 4, the only possible values of the count a are 0, 1, 2, 3, and 4 Step 2: What are the probabilities of the five possible 22 tables? Answer: We calculate the hypergeometric distribution likelihood for each of the 5 tables. While we re at it, we ll calculate the empirical odds ratio (OR) accompanying each possibility: a=0 Disease Healthy Eposed Pr(table)=.6875 Not eposed OR= a=1 Disease Healthy Eposed Pr(table)=.2715 Not eposed OR=

30 PubHlth Discrete Distributions Page 30 of 39 a=2 Note This is our observed Disea se Healthy Eposed Pr(table)=.0386 Not eposed OR= a=3 Disease Healthy Eposed Pr(table)=.0023 Not eposed OR= a=4 Disease Healthy Eposed Pr(table)=.0001 Not eposed OR=infinite Step 3: What is the p-value associated with the observed table? Answer: The observed table has a count of a =2. Recalling our understanding of the ideas of statistical hypothesis testing, the p-value associated with this table is the likelihood of this table or one that is more etreme. The more etreme tables are the ones with higher counts of a because these tables correspond to settings where the OR are as large or larger than the observed value of 11.1 Step 4: p-value = Pr [Table with a = 2] + Pr [Table with a = 3] + Pr [Table with a = 4] = =.0410 Interpretation of Fisher s Eact Test calculations. Under the null hypothesis of no association between eposure and disease, the chances of obtaining an OR=11.1 or greater, when the row and column totals are fied, are approimately 4 in 100.

31 PubHlth Discrete Distributions Page 31 of Discrete Distributions - Themes With Replacement Without Replacement Framework A proportion π are event A fied population of size =N contains a subset of size =M that are event Likelihood of outcome Sampling Sample of size=n with replacement Outcome # events = F HG n I K J π ( 1 π) n Eample Test of equality of proportion Sample of size=n without replacement # events = F HG M IF KJ HG F HG N n N n I KJ M I KJ Test of independence

32 PubHlth Discrete Distributions Page 32 of 39 Appendi A Mean (μ) and Variance (σ 2 ) of a Bernoulli Distribution Mean of Z = μ = π The mean of Z is represented as E[Z]. recall: E stands for epected value of E[Z] = π because the following is true: E[Z]= All possible z [z]probability[z=z] = [0] Pr [Z=0] + [1] Pr [Z=1] =[ 0] (1-π) + [1] (π) = π Variance of Z = σ 2 = (π)(1-π) The variance of Z is Var[Z] = E[ (Z (E[Z]) 2 ]. Var[Z] = π(1-π) because the following is true: Var[Z] = E [(Z-π) ] = 2 2 All possible z (z-π) Probability[Z=z] = 2 2 [(0- π) ]Pr[Z=0]+[(1- π) ]Pr[Z=1] 2 2 = [π ] (1-π) + [(1-π) ](π) = π (1-π) [ π + (1-π) ] = π (1-π)

33 PubHlth Discrete Distributions Page 33 of 39 Appendi B The Binomial(N,π) is the Sum of N Independent Bernoulli(π) The eample of tossing one coin one time is an eample of 1 Bernoulli trial. Suppose we call this random variable Z: Z is distributed Bernoulli (π) with Z=1 when the event occurs and Z=0 when it does not. Tip - The use of the 0 and 1 coding scheme, with 1 being the designation for event occurrence, is key, here. If we toss the same coin several times, say N times, we have N independent Bernoulli trials: Z 1, Z 2,., Z N are each distributed Bernoulli (π) and they re independent. Consider what happens if we add up the Z s. We re actually adding up 1 s and 0 s. The total of Z 1, Z 2,., Z N is thus the total number of 1 s. It is also the net number of events of success in N trials. Let s call this number of events of success a new random variable X. N Z i = X = # events in N trials i=1 This new random variable X is distributed Binomial. X represents the net number of successes in a set of independent Bernoulli trials. A simple eample is the outcome of several coin tosses (eg how many heads did I get? ). The word choice net is deliberate here, to remind ourselves that we re not interested in keeping track of the particular trials that yielded events of success, only the net number of trials that yielded event of success. E.g. - What is the probability that 2 of 6 graduate students are female? - What is the probability that of 100 infected persons, 4 will die within a year?

34 PubHlth Discrete Distributions Page 34 of 39 Steps in calculating the probability of N i=1 Z = X = i Step 1 Pick just one arrangements of events in N trials and calculate its probability The easiest is the ordered sequence consisting of () events followed by (N-) non events events N- non events Pr [ ordered sequence ] = Pr [( Z 1 =1), (Z 2 =1) (Z =1), (Z +1 =0), (Z +2 =0) (Z N =0) ] = π π π π (1- π) (1- π) (1- π) (1- π) = π (1- π) N- Note! Can you see that the result is the same product π the sequence the events occurred? How handy! (1- π) N- regardless of where in Step 2 Determine the number of qualifying ordered sequences that satisfy the requirement of having eactly events and (N-) non events. N Number of qualifying ordered sequences = = N!! (N-)! Step 3 The probability of getting X= events, without regard to sequencing, is thus the sum of the probabilities of each qualifying ordered sequence, all of which have the same probability that was obtained in step 1. Probability [N trials yields events] = (# qualifying sequence) (Pr[one sequence]) N N- N Pr [X = ] = Pr [ Z i = ] = π (1-π) i=1

35 PubHlth Discrete Distributions Page 35 of 39 Appendi C The Poisson Distribution is an Etension of Binomial The concept of N persons A large accumulation of person time The likelihood of an event eperienced by 1 person the likelihood of an event in 1 unit of person time. This will be quite small! The etension begins with a Binomial Distribution Situation. We begin by constructing a binomial likelihood situation. Let T = total accumulation of person time (e.g pack years) n = number of sub-intervals of T (eg 1000) T/n = length of 1 sub-interval of T (e.g.- 1 pack year) λ = event rate per unit length of person time What is our Binomial distribution probability parameter π? π = λ (T/n) because it is (rate)(length of 1 sub-interval) What is our Binomial distribution number of trials? n = number of sub-intervals of T We ll need 3 assumptions 1) The rate of events in each sub-interval is less than 1. Rate per sub-interval = Pr[1 event per sub-interval] 0 < π = (λ)(t/n) < 1 2) The chances of 2 or more events in a sub-interval is zero. 3) The subintervals are mutually independent.

36 PubHlth Discrete Distributions Page 36 of 39 Now we can describe event occurrence over the entire interval of length T with the Binomial. F HG I K J π Let X be the count of number of events. n Probability [ X = ] = (1-π) n- for =0,, n n T T λ λ n n = 1 n because π = ( λ)[ T/ n]. Some algebra (if you care to follow along) will get us to the poisson distribution probability formula. The algebra involves two things Letting n in the binomial distribution probability; and Recognizing that the epected number of events over the entire interval of length T is λt because λ is the per unit subinterval rate and T is the number of units. (analogy: for rate of heads =.50, number of coin tosses = 20, the epected number of head is [.50][20] = 10). This allows us, eventually, to make the substitution of λt = μ. n λt λt Pr[X=] = 1- n n N- n n λt λt λt = 1-1- n n n -

37 PubHlth Discrete Distributions Page 37 of 39 Work with each term on the right hand side one at a time: - 1 st term - n n! n(n-1)(n-2)...(n-+1)(n-)! n(n-1)(n-2)...(n-+1) = = =!(n-)!!(n-)!! As n infinity, the product of terms in the numerator (n)(n)(n) (n) = n. Thus, as n infinity n n! - 2 nd term - λt 1 1 = [λt] = [ μ] n n n Thus, λt 1 = [ μ] n n - 3 rd term - n λt μ 1- = 1- n n n What happens net is a bit of calculus. As n infinity Thus, as n infinity n λt μ 1- = 1- n n n e μ n μ 1- e n -μ where e = constant = 2.718

38 PubHlth Discrete Distributions Page 38 of 39-4 th term - Finally, as n the quotient (λt/n ) is increasingly like (0/n) so that - - λt [] 1 1 n n = Thus, as n infinity - λt 1- = 1 n Now put together the product of the 4 terms and what happens as n infinity n n λt λt λt Pr [X=] = 1-1- n n n - as n infinity n! [ ] 1 n μ { e μ } { 1 } μ e μ ep = =!! -μ -μ

39 PubHlth Discrete Distributions Page 39 of 39 If X is distributed Poisson, Poisson Distribution f X () = Probability [ X = ] = Epected value of X is E[X] = μ Variance of X is Var[X] =σ 2 = μ [ ] μ ep -μ! for = 0, 1, Thus, the poisson distribution is an appropriate model for describing the frequency of occurrence of a rare event in a very large number of trials.

Unit 3. Discrete Distributions

Unit 3. Discrete Distributions BIOSTATS 640 - Spring 2016 3. Discrete Distributions Page 1 of 52 Unit 3. Discrete Distributions Chance favors only those who know how to court her - Charles Nicolle In many research settings, the outcome

More information

Unit 3. Discrete Distributions

Unit 3. Discrete Distributions BIOSTATS 640 - Spring 2017 3. Discrete Distributions Page 1 of 57 Unit 3. Discrete Distributions Chance favors only those who know how to court her - Charles Nicolle In many research settings, the outcome

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information

Unit 2 Introduction to Probability

Unit 2 Introduction to Probability PubHlth 540 Fall 2013 2. Introduction to Probability Page 1 of 55 Unit 2 Introduction to Probability Chance favours only those who know how to court her - Charles Nicolle A weather report statement such

More information

Class 26: review for final exam 18.05, Spring 2014

Class 26: review for final exam 18.05, Spring 2014 Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event

More information

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability? Probability: Why do we care? Lecture 2: Probability and Distributions Sandy Eckel seckel@jhsph.edu 22 April 2008 Probability helps us by: Allowing us to translate scientific questions into mathematical

More information

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time

More information

1. When applied to an affected person, the test comes up positive in 90% of cases, and negative in 10% (these are called false negatives ).

1. When applied to an affected person, the test comes up positive in 90% of cases, and negative in 10% (these are called false negatives ). CS 70 Discrete Mathematics for CS Spring 2006 Vazirani Lecture 8 Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According to clinical trials,

More information

Lecture 2: Probability and Distributions

Lecture 2: Probability and Distributions Lecture 2: Probability and Distributions Ani Manichaikul amanicha@jhsph.edu 17 April 2007 1 / 65 Probability: Why do we care? Probability helps us by: Allowing us to translate scientific questions info

More information

Part 3: Parametric Models

Part 3: Parametric Models Part 3: Parametric Models Matthew Sperrin and Juhyun Park August 19, 2008 1 Introduction There are three main objectives to this section: 1. To introduce the concepts of probability and random variables.

More information

Random Variable. Discrete Random Variable. Continuous Random Variable. Discrete Random Variable. Discrete Probability Distribution

Random Variable. Discrete Random Variable. Continuous Random Variable. Discrete Random Variable. Discrete Probability Distribution Random Variable Theoretical Probability Distribution Random Variable Discrete Probability Distributions A variable that assumes a numerical description for the outcome of a random eperiment (by chance).

More information

1 Normal Distribution.

1 Normal Distribution. Normal Distribution.. Introduction A Bernoulli trial is simple random experiment that ends in success or failure. A Bernoulli trial can be used to make a new random experiment by repeating the Bernoulli

More information

MATH 10 INTRODUCTORY STATISTICS

MATH 10 INTRODUCTORY STATISTICS MATH 10 INTRODUCTORY STATISTICS Ramesh Yapalparvi Week 2 Chapter 4 Bivariate Data Data with two/paired variables, Pearson correlation coefficient and its properties, general variance sum law Chapter 6

More information

CS 361: Probability & Statistics

CS 361: Probability & Statistics February 19, 2018 CS 361: Probability & Statistics Random variables Markov s inequality This theorem says that for any random variable X and any value a, we have A random variable is unlikely to have an

More information

Plotting data is one method for selecting a probability distribution. The following

Plotting data is one method for selecting a probability distribution. The following Advanced Analytical Models: Over 800 Models and 300 Applications from the Basel II Accord to Wall Street and Beyond By Johnathan Mun Copyright 008 by Johnathan Mun APPENDIX C Understanding and Choosing

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics 5. Models of Random Behavior Math 40 Introductory Statistics Professor Silvia Fernández Chapter 5 Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. Outcome: Result or answer

More information

Math 140 Introductory Statistics

Math 140 Introductory Statistics Math 140 Introductory Statistics Professor Silvia Fernández Lecture 8 Based on the book Statistics in Action by A. Watkins, R. Scheaffer, and G. Cobb. 5.1 Models of Random Behavior Outcome: Result or answer

More information

Today we ll discuss ways to learn how to think about events that are influenced by chance.

Today we ll discuss ways to learn how to think about events that are influenced by chance. Overview Today we ll discuss ways to learn how to think about events that are influenced by chance. Basic probability: cards, coins and dice Definitions and rules: mutually exclusive events and independent

More information

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom Central Limit Theorem and the Law of Large Numbers Class 6, 8.5 Jeremy Orloff and Jonathan Bloom Learning Goals. Understand the statement of the law of large numbers. 2. Understand the statement of the

More information

Unit 3 Probability Basics

Unit 3 Probability Basics BIOSTATS 540 Fall 2016 2. Probability Basics Page 1 of 40 Unit 3 Probability Basics Chance favours only those who know how to court her - Charles Nicolle A weather report statement such as the probability

More information

Topic 3 Populations and Samples

Topic 3 Populations and Samples BioEpi540W Populations and Samples Page 1 of 33 Topic 3 Populations and Samples Topics 1. A Feeling for Populations v Samples 2 2. Target Populations, Sampled Populations, Sampling Frames 5 3. On Making

More information

Econ 113. Lecture Module 2

Econ 113. Lecture Module 2 Econ 113 Lecture Module 2 Contents 1. Experiments and definitions 2. Events and probabilities 3. Assigning probabilities 4. Probability of complements 5. Conditional probability 6. Statistical independence

More information

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) B.H. Robbins Scholars Series June 23, 2010 1 / 29 Outline Z-test χ 2 -test Confidence Interval Sample size and power Relative effect

More information

UNIT 4 MATHEMATICAL METHODS SAMPLE REFERENCE MATERIALS

UNIT 4 MATHEMATICAL METHODS SAMPLE REFERENCE MATERIALS UNIT 4 MATHEMATICAL METHODS SAMPLE REFERENCE MATERIALS EXTRACTS FROM THE ESSENTIALS EXAM REVISION LECTURES NOTES THAT ARE ISSUED TO STUDENTS Students attending our mathematics Essentials Year & Eam Revision

More information

Probability. Lecture Notes. Adolfo J. Rumbos

Probability. Lecture Notes. Adolfo J. Rumbos Probability Lecture Notes Adolfo J. Rumbos October 20, 204 2 Contents Introduction 5. An example from statistical inference................ 5 2 Probability Spaces 9 2. Sample Spaces and σ fields.....................

More information

Part 3: Parametric Models

Part 3: Parametric Models Part 3: Parametric Models Matthew Sperrin and Juhyun Park April 3, 2009 1 Introduction Is the coin fair or not? In part one of the course we introduced the idea of separating sampling variation from a

More information

Probability. Chapter 1 Probability. A Simple Example. Sample Space and Probability. Sample Space and Event. Sample Space (Two Dice) Probability

Probability. Chapter 1 Probability. A Simple Example. Sample Space and Probability. Sample Space and Event. Sample Space (Two Dice) Probability Probability Chapter 1 Probability 1.1 asic Concepts researcher claims that 10% of a large population have disease H. random sample of 100 people is taken from this population and examined. If 20 people

More information

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 CS 70 Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 Introduction One of the key properties of coin flips is independence: if you flip a fair coin ten times and get ten

More information

Practice Exam 1: Long List 18.05, Spring 2014

Practice Exam 1: Long List 18.05, Spring 2014 Practice Eam : Long List 8.05, Spring 204 Counting and Probability. A full house in poker is a hand where three cards share one rank and two cards share another rank. How many ways are there to get a full-house?

More information

MAS275 Probability Modelling Exercises

MAS275 Probability Modelling Exercises MAS75 Probability Modelling Exercises Note: these questions are intended to be of variable difficulty. In particular: Questions or part questions labelled (*) are intended to be a bit more challenging.

More information

Introduction to Probability Theory for Graduate Economics Fall 2008

Introduction to Probability Theory for Graduate Economics Fall 2008 Introduction to Probability Theory for Graduate Economics Fall 008 Yiğit Sağlam October 10, 008 CHAPTER - RANDOM VARIABLES AND EXPECTATION 1 1 Random Variables A random variable (RV) is a real-valued function

More information

PubHlth 540 Estimation Page 1 of 69. Unit 6 Estimation

PubHlth 540 Estimation Page 1 of 69. Unit 6 Estimation PubHlth 540 Estimation Page 1 of 69 Unit 6 Estimation Topic 1. Introduction........ a. Goals of Estimation. b. Notation and Definitions. c. How to Interpret a Confidence Interval. Preliminaries: Some Useful

More information

Lecture 8: Probability

Lecture 8: Probability Lecture 8: Probability The idea of probability is well-known The flipping of a balanced coin can produce one of two outcomes: T (tail) and H (head) and the symmetry between the two outcomes means, of course,

More information

3.2 Probability Rules

3.2 Probability Rules 3.2 Probability Rules The idea of probability rests on the fact that chance behavior is predictable in the long run. In the last section, we used simulation to imitate chance behavior. Do we always need

More information

Unit 3 Probability Basics

Unit 3 Probability Basics BIOSTATS 540 Fall 2018 3. Probability Basics Page 1 of 43 Unit 3 Probability Basics Chance favours only those who know how to court her - Charles Nicolle A weather report statement such as the probability

More information

X = X X n, + X 2

X = X X n, + X 2 CS 70 Discrete Mathematics for CS Fall 2003 Wagner Lecture 22 Variance Question: At each time step, I flip a fair coin. If it comes up Heads, I walk one step to the right; if it comes up Tails, I walk

More information

Steve Smith Tuition: Maths Notes

Steve Smith Tuition: Maths Notes Maths Notes : Discrete Random Variables Version. Steve Smith Tuition: Maths Notes e iπ + = 0 a + b = c z n+ = z n + c V E + F = Discrete Random Variables Contents Intro The Distribution of Probabilities

More information

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto. Introduction to Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca September 18, 2014 38-1 : a review 38-2 Evidence Ideal: to advance the knowledge-base of clinical medicine,

More information

Introduction. So, why did I even bother to write this?

Introduction. So, why did I even bother to write this? Introduction This review was originally written for my Calculus I class, but it should be accessible to anyone needing a review in some basic algebra and trig topics. The review contains the occasional

More information

Answers to All Exercises

Answers to All Exercises CAPER 10 CAPER 10 CAPER10 CAPER REFRESING YOUR SKILLS FOR CAPER 10 1a. 5 1 0.5 10 1b. 6 3 0.6 10 5 1c. 0. 10 5 a. 10 36 5 1 0. 7 b. 7 is most likely; probability of 7 is 6 36 1 6 0.1 6. c. 1 1 0.5 36 3a.

More information

If S = {O 1, O 2,, O n }, where O i is the i th elementary outcome, and p i is the probability of the i th elementary outcome, then

If S = {O 1, O 2,, O n }, where O i is the i th elementary outcome, and p i is the probability of the i th elementary outcome, then 1.1 Probabilities Def n: A random experiment is a process that, when performed, results in one and only one of many observations (or outcomes). The sample space S is the set of all elementary outcomes

More information

Chapter 2: The Random Variable

Chapter 2: The Random Variable Chapter : The Random Variable The outcome of a random eperiment need not be a number, for eample tossing a coin or selecting a color ball from a bo. However we are usually interested not in the outcome

More information

Ling 289 Contingency Table Statistics

Ling 289 Contingency Table Statistics Ling 289 Contingency Table Statistics Roger Levy and Christopher Manning This is a summary of the material that we ve covered on contingency tables. Contingency tables: introduction Odds ratios Counting,

More information

MATH MW Elementary Probability Course Notes Part I: Models and Counting

MATH MW Elementary Probability Course Notes Part I: Models and Counting MATH 2030 3.00MW Elementary Probability Course Notes Part I: Models and Counting Tom Salisbury salt@yorku.ca York University Winter 2010 Introduction [Jan 5] Probability: the mathematics used for Statistics

More information

Relative Risks (RR) and Odds Ratios (OR) 20

Relative Risks (RR) and Odds Ratios (OR) 20 BSTT523: Pagano & Gavreau, Chapter 6 1 Chapter 6: Probability slide: Definitions (6.1 in P&G) 2 Experiments; trials; probabilities Event operations 4 Intersection; Union; Complement Venn diagrams Conditional

More information

Discrete Distributions

Discrete Distributions A simplest example of random experiment is a coin-tossing, formally called Bernoulli trial. It happens to be the case that many useful distributions are built upon this simplest form of experiment, whose

More information

Regression techniques provide statistical analysis of relationships. Research designs may be classified as experimental or observational; regression

Regression techniques provide statistical analysis of relationships. Research designs may be classified as experimental or observational; regression LOGISTIC REGRESSION Regression techniques provide statistical analysis of relationships. Research designs may be classified as eperimental or observational; regression analyses are applicable to both types.

More information

Lecture 13. Poisson Distribution. Text: A Course in Probability by Weiss 5.5. STAT 225 Introduction to Probability Models February 16, 2014

Lecture 13. Poisson Distribution. Text: A Course in Probability by Weiss 5.5. STAT 225 Introduction to Probability Models February 16, 2014 Lecture 13 Text: A Course in Probability by Weiss 5.5 STAT 225 Introduction to Probability Models February 16, 2014 Whitney Huang Purdue University 13.1 Agenda 1 2 3 13.2 Review So far, we have seen discrete

More information

Discrete Mathematics and Probability Theory Fall 2011 Rao Midterm 2 Solutions

Discrete Mathematics and Probability Theory Fall 2011 Rao Midterm 2 Solutions CS 70 Discrete Mathematics and Probability Theory Fall 20 Rao Midterm 2 Solutions True/False. [24 pts] Circle one of the provided answers please! No negative points will be assigned for incorrect answers.

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 20

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 20 CS 70 Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 20 Today we shall discuss a measure of how close a random variable tends to be to its expectation. But first we need to see how to compute

More information

Chapter. Probability

Chapter. Probability Chapter 3 Probability Section 3.1 Basic Concepts of Probability Section 3.1 Objectives Identify the sample space of a probability experiment Identify simple events Use the Fundamental Counting Principle

More information

Probability and Discrete Distributions

Probability and Discrete Distributions AMS 7L LAB #3 Fall, 2007 Objectives: Probability and Discrete Distributions 1. To explore relative frequency and the Law of Large Numbers 2. To practice the basic rules of probability 3. To work with the

More information

Discrete Probability

Discrete Probability Discrete Probability Mark Muldoon School of Mathematics, University of Manchester M05: Mathematical Methods, January 30, 2007 Discrete Probability - p. 1/38 Overview Mutually exclusive Independent More

More information

Epidemiology Wonders of Biostatistics Chapter 11 (continued) - probability in a single population. John Koval

Epidemiology Wonders of Biostatistics Chapter 11 (continued) - probability in a single population. John Koval Epidemiology 9509 Wonders of Biostatistics Chapter 11 (continued) - probability in a single population John Koval Department of Epidemiology and Biostatistics University of Western Ontario What is being

More information

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007 MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

CS 237: Probability in Computing

CS 237: Probability in Computing CS 237: Probability in Computing Wayne Snyder Computer Science Department Boston University Lecture 11: Geometric Distribution Poisson Process Poisson Distribution Geometric Distribution The Geometric

More information

Examples of frequentist probability include games of chance, sample surveys, and randomized experiments. We will focus on frequentist probability sinc

Examples of frequentist probability include games of chance, sample surveys, and randomized experiments. We will focus on frequentist probability sinc FPPA-Chapters 13,14 and parts of 16,17, and 18 STATISTICS 50 Richard A. Berk Spring, 1997 May 30, 1997 1 Thinking about Chance People talk about \chance" and \probability" all the time. There are many

More information

Review of Basic Probability Theory

Review of Basic Probability Theory Review of Basic Probability Theory James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 35 Review of Basic Probability Theory

More information

God doesn t play dice. - Albert Einstein

God doesn t play dice. - Albert Einstein ECE 450 Lecture 1 God doesn t play dice. - Albert Einstein As far as the laws of mathematics refer to reality, they are not certain; as far as they are certain, they do not refer to reality. Lecture Overview

More information

Notes 12 Autumn 2005

Notes 12 Autumn 2005 MAS 08 Probability I Notes Autumn 005 Conditional random variables Remember that the conditional probability of event A given event B is P(A B) P(A B)/P(B). Suppose that X is a discrete random variable.

More information

Week 6, 9/24/12-9/28/12, Notes: Bernoulli, Binomial, Hypergeometric, and Poisson Random Variables

Week 6, 9/24/12-9/28/12, Notes: Bernoulli, Binomial, Hypergeometric, and Poisson Random Variables Week 6, 9/24/12-9/28/12, Notes: Bernoulli, Binomial, Hypergeometric, and Poisson Random Variables 1 Monday 9/24/12 on Bernoulli and Binomial R.V.s We are now discussing discrete random variables that have

More information

Statistics 1L03 - Midterm #2 Review

Statistics 1L03 - Midterm #2 Review Statistics 1L03 - Midterm # Review Atinder Bharaj Made with L A TEX October, 01 Introduction As many of you will soon find out, I will not be holding the next midterm review. To make it a bit easier on

More information

1. Discrete Distributions

1. Discrete Distributions Virtual Laboratories > 2. Distributions > 1 2 3 4 5 6 7 8 1. Discrete Distributions Basic Theory As usual, we start with a random experiment with probability measure P on an underlying sample space Ω.

More information

Advanced Herd Management Probabilities and distributions

Advanced Herd Management Probabilities and distributions Advanced Herd Management Probabilities and distributions Anders Ringgaard Kristensen Slide 1 Outline Probabilities Conditional probabilities Bayes theorem Distributions Discrete Continuous Distribution

More information

Testing Independence

Testing Independence Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1

More information

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER / Probability

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER / Probability ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER 2 2017/2018 DR. ANTHONY BROWN 5.1. Introduction to Probability. 5. Probability You are probably familiar with the elementary

More information

Basic Concepts of Probability. Section 3.1 Basic Concepts of Probability. Probability Experiments. Chapter 3 Probability

Basic Concepts of Probability. Section 3.1 Basic Concepts of Probability. Probability Experiments. Chapter 3 Probability Chapter 3 Probability 3.1 Basic Concepts of Probability 3.2 Conditional Probability and the Multiplication Rule 3.3 The Addition Rule 3.4 Additional Topics in Probability and Counting Section 3.1 Basic

More information

Binomial and Poisson Probability Distributions

Binomial and Poisson Probability Distributions Binomial and Poisson Probability Distributions Esra Akdeniz March 3, 2016 Bernoulli Random Variable Any random variable whose only possible values are 0 or 1 is called a Bernoulli random variable. What

More information

STAT 4385 Topic 01: Introduction & Review

STAT 4385 Topic 01: Introduction & Review STAT 4385 Topic 01: Introduction & Review Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 Outline Welcome What is Regression Analysis? Basics

More information

3 PROBABILITY TOPICS

3 PROBABILITY TOPICS Chapter 3 Probability Topics 135 3 PROBABILITY TOPICS Figure 3.1 Meteor showers are rare, but the probability of them occurring can be calculated. (credit: Navicore/flickr) Introduction It is often necessary

More information

STAC51: Categorical data Analysis

STAC51: Categorical data Analysis STAC51: Categorical data Analysis Mahinda Samarakoon January 26, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 32 Table of contents Contingency Tables 1 Contingency Tables Mahinda Samarakoon

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 4 Basic Probability And Discrete Probability Distributions

Statistics for Managers Using Microsoft Excel/SPSS Chapter 4 Basic Probability And Discrete Probability Distributions Statistics for Managers Using Microsoft Excel/SPSS Chapter 4 Basic Probability And Discrete Probability Distributions 1999 Prentice-Hall, Inc. Chap. 4-1 Chapter Topics Basic Probability Concepts: Sample

More information

3.2 Intoduction to probability 3.3 Probability rules. Sections 3.2 and 3.3. Elementary Statistics for the Biological and Life Sciences (Stat 205)

3.2 Intoduction to probability 3.3 Probability rules. Sections 3.2 and 3.3. Elementary Statistics for the Biological and Life Sciences (Stat 205) 3.2 Intoduction to probability Sections 3.2 and 3.3 Elementary Statistics for the Biological and Life Sciences (Stat 205) 1 / 47 Probability 3.2 Intoduction to probability The probability of an event E

More information

success and failure independent from one trial to the next?

success and failure independent from one trial to the next? , section 8.4 The Binomial Distribution Notes by Tim Pilachowski Definition of Bernoulli trials which make up a binomial experiment: The number of trials in an experiment is fixed. There are exactly two

More information

Discrete Probability. Chemistry & Physics. Medicine

Discrete Probability. Chemistry & Physics. Medicine Discrete Probability The existence of gambling for many centuries is evidence of long-running interest in probability. But a good understanding of probability transcends mere gambling. The mathematics

More information

Probability (Devore Chapter Two)

Probability (Devore Chapter Two) Probability (Devore Chapter Two) 1016-345-01: Probability and Statistics for Engineers Fall 2012 Contents 0 Administrata 2 0.1 Outline....................................... 3 1 Axiomatic Probability 3

More information

Conditional Probability (cont'd)

Conditional Probability (cont'd) Conditional Probability (cont'd) April 26, 2006 Conditional Probability (cont'd) Midterm Problems In a ten-question true-false exam, nd the probability that a student get a grade of 70 percent or better

More information

Contingency Tables Part One 1

Contingency Tables Part One 1 Contingency Tables Part One 1 STA 312: Fall 2012 1 See last slide for copyright information. 1 / 32 Suggested Reading: Chapter 2 Read Sections 2.1-2.4 You are not responsible for Section 2.5 2 / 32 Overview

More information

Lecture 1: Probability Fundamentals

Lecture 1: Probability Fundamentals Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability

More information

9/6/2016. Section 5.1 Probability. Equally Likely Model. The Division Rule: P(A)=#(A)/#(S) Some Popular Randomizers.

9/6/2016. Section 5.1 Probability. Equally Likely Model. The Division Rule: P(A)=#(A)/#(S) Some Popular Randomizers. Chapter 5: Probability and Discrete Probability Distribution Learn. Probability Binomial Distribution Poisson Distribution Some Popular Randomizers Rolling dice Spinning a wheel Flipping a coin Drawing

More information

1 The Basic Counting Principles

1 The Basic Counting Principles 1 The Basic Counting Principles The Multiplication Rule If an operation consists of k steps and the first step can be performed in n 1 ways, the second step can be performed in n ways [regardless of how

More information

P (E) = P (A 1 )P (A 2 )... P (A n ).

P (E) = P (A 1 )P (A 2 )... P (A n ). Lecture 9: Conditional probability II: breaking complex events into smaller events, methods to solve probability problems, Bayes rule, law of total probability, Bayes theorem Discrete Structures II (Summer

More information

Probability Experiments, Trials, Outcomes, Sample Spaces Example 1 Example 2

Probability Experiments, Trials, Outcomes, Sample Spaces Example 1 Example 2 Probability Probability is the study of uncertain events or outcomes. Games of chance that involve rolling dice or dealing cards are one obvious area of application. However, probability models underlie

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

Discrete Random Variables

Discrete Random Variables Discrete Random Variables An Undergraduate Introduction to Financial Mathematics J. Robert Buchanan Introduction The markets can be thought of as a complex interaction of a large number of random processes,

More information

MTH302 Quiz # 4. Solved By When a coin is tossed once, the probability of getting head is. Select correct option:

MTH302 Quiz # 4. Solved By When a coin is tossed once, the probability of getting head is. Select correct option: MTH302 Quiz # 4 Solved By konenuchiha@gmail.com When a coin is tossed once, the probability of getting head is. 0.55 0.52 0.50 (1/2) 0.51 Suppose the slope of regression line is 20 and the intercept is

More information

Refresher on Discrete Probability

Refresher on Discrete Probability Refresher on Discrete Probability STAT 27725/CMSC 25400: Machine Learning Shubhendu Trivedi University of Chicago October 2015 Background Things you should have seen before Events, Event Spaces Probability

More information

Chapter 3 Single Random Variables and Probability Distributions (Part 1)

Chapter 3 Single Random Variables and Probability Distributions (Part 1) Chapter 3 Single Random Variables and Probability Distributions (Part 1) Contents What is a Random Variable? Probability Distribution Functions Cumulative Distribution Function Probability Density Function

More information

Lecture Notes 2 Random Variables. Random Variable

Lecture Notes 2 Random Variables. Random Variable Lecture Notes 2 Random Variables Definition Discrete Random Variables: Probability mass function (pmf) Continuous Random Variables: Probability density function (pdf) Mean and Variance Cumulative Distribution

More information

Discrete Mathematics for CS Spring 2006 Vazirani Lecture 22

Discrete Mathematics for CS Spring 2006 Vazirani Lecture 22 CS 70 Discrete Mathematics for CS Spring 2006 Vazirani Lecture 22 Random Variables and Expectation Question: The homeworks of 20 students are collected in, randomly shuffled and returned to the students.

More information

ELEG 3143 Probability & Stochastic Process Ch. 1 Probability

ELEG 3143 Probability & Stochastic Process Ch. 1 Probability Department of Electrical Engineering University of Arkansas ELEG 3143 Probability & Stochastic Process Ch. 1 Probability Dr. Jingxian Wu wuj@uark.edu OUTLINE 2 Applications Elementary Set Theory Random

More information

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Note 15. Random Variables: Distributions, Independence, and Expectations EECS 70 Discrete Mathematics and Probability Theory Fall 204 Anant Sahai Note 5 Random Variables: Distributions, Independence, and Expectations In the last note, we saw how useful it is to have a way of

More information

In Chapter 17, I delve into probability in a semiformal way, and introduce distributions

In Chapter 17, I delve into probability in a semiformal way, and introduce distributions IN THIS CHAPTER»» Understanding the beta version»» Pursuing Poisson»» Grappling with gamma»» Speaking eponentially Appendi A More About Probability In Chapter 7, I delve into probability in a semiformal

More information

Module 8 Probability

Module 8 Probability Module 8 Probability Probability is an important part of modern mathematics and modern life, since so many things involve randomness. The ClassWiz is helpful for calculating probabilities, especially those

More information

1 of 6 7/16/2009 6:31 AM Virtual Laboratories > 11. Bernoulli Trials > 1 2 3 4 5 6 1. Introduction Basic Theory The Bernoulli trials process, named after James Bernoulli, is one of the simplest yet most

More information

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions Introduction to Statistical Data Analysis Lecture 3: Probability Distributions James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis

More information

Unit 3 Populations and Samples

Unit 3 Populations and Samples BIOSTATS 540 Fall 2015 3. Populations and s Page 1 of 37 Unit 3 Populations and s To all the ladies present and some of those absent - Jerzy Neyman The collection of all individuals with HIV infection

More information

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015 Fall 2015 Population versus Sample Population: data for every possible relevant case Sample: a subset of cases that is drawn from an underlying population Inference Parameters and Statistics A parameter

More information

Section 4.2 Basic Concepts of Probability

Section 4.2 Basic Concepts of Probability Section 4.2 Basic Concepts of Probability 2012 Pearson Education, Inc. All rights reserved. 1 of 88 Section 4.2 Objectives Identify the sample space of a probability experiment Identify simple events Use

More information

Applied Statistics I

Applied Statistics I Applied Statistics I (IMT224β/AMT224β) Department of Mathematics University of Ruhuna A.W.L. Pubudu Thilan Department of Mathematics University of Ruhuna Applied Statistics I(IMT224β/AMT224β) 1/158 Chapter

More information