Chapter 4 Part 3 Sections 4.10-4.12 Poisson Distribution October 2, 2008

Goal: To develop an understanding of discrete distributions by considering the binomial (last lecture) and the Poisson distributions.

Skills: To be able to distinguish which problems are appropriately solved using the binomial distribution and which are better approached with the Poisson.

Stata command: pprob (located via findit)

The following notes are primarily taken from Rosner.

Poisson Distribution: The Poisson distribution is a discrete distribution in the sense that it takes only a countable number of values, so that the sum over all x of Pr(X = x) equals 1. The binomial distribution is also a discrete distribution, but there the number of x values is finite. The set of all positive integers is an example of a countable set; the set of integers from 1 to 25 is an example of a finite set.

We tend to think of the Poisson distribution if we are trying to model rare events. Poisson processes involve observing discrete events in a continuous interval of time, length or space. We use the word interval in describing the general Poisson process with the understanding that we may not be dealing with an interval in the usual mathematical sense. For example, we might observe the number of times radioactive gases are emitted from a nuclear power plant during a three month period. The discrete event of concern is the emission of radioactive gases; the continuous interval consists of a period of three months. Or we could observe the number of emergency calls received by a rescue squad per hour. Here the discrete event is the arrival of a call, and the continuous interval is the one hour observation period.

Suppose we consider a large quantity of some medium such as sea, air or water in which are found a number of discrete small entities, such as plankton or bacteria. The material has to be mixed thoroughly to overcome clumping or social instincts. The mixing must guarantee a uniform density of entities throughout the medium (it turns out that we need independence of the entities to use the Poisson distribution). Then, when a small quantity of the medium is examined, the probability that k of the entities will be found in the examined portion is found using the Poisson distribution. Notice that these problems have a kind of independence.
The Poisson distribution cannot be expected to fit counts of entities that tend to aggregate or clump, and it cannot be expected to fit arrivals in a waiting line when the length of the line tends to stimulate new arrivals (perhaps in search of a bargain) or, alternatively, to repel them because of the possibility of a long wait. Similarly, the number of cases of streptococcal infection in households would not be expected to follow the Poisson distribution because of the effect of contagion.

Properties of the Poisson Distribution

Property 1: Basic assumptions. I will use the incidence of some disease as an example to illustrate 3 basic underlying assumptions. We are going to consider small subintervals of a given time period. We denote the length of these subintervals by Δt; note that Δt > 0. Assume that:

1) The probability of observing one death in the interval Δt is directly proportional to the length of the time interval. That is, Pr(1 death) = λΔt for some constant λ. The parameter λ represents the expected number of events per unit time.

2) The probability of observing more than 1 death over this time interval Δt is essentially zero.

3) The probability of observing 0 deaths over Δt is approximately 1 − λΔt.

Since the probability of observing 1 death in the interval is λΔt and the probability of observing more than 1 death in the interval is essentially zero, the sum of the probabilities for 1 or more deaths is essentially λΔt.

Property 2: Stationarity: Assume that the number of events per unit time (or space) is the same throughout the entire time interval (or area) of interest. The bigger the area or the longer the interval, the less likely this assumption will hold. If you are considering some type of bacteria in the water that is sensitive to temperature, then as long as the water under consideration is approximately the same temperature you might expect a similar number of bacteria per small unit. But if the temperature of the water changes, the number of bacteria per small unit may change. So beware of areas that are large or time intervals that are long, because the larger the interval the less likely it is that stationarity holds.
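Assumptions 1-3 can be checked by brute force: split a period into many small subintervals and let each hold an event independently with probability λΔt. A minimal simulation sketch (the rate λ = 2 and period t = 1 are illustrative choices of mine, not from the notes):

```python
import random

random.seed(1)

lam = 2.0     # illustrative rate: 2 events per unit time (not from the notes)
t = 1.0       # length of the observation period
n_sub = 1000  # number of tiny subintervals, so dt = t / n_sub
dt = t / n_sub

def one_count():
    # Each subinterval independently contains one event with probability lam*dt
    return sum(1 for _ in range(n_sub) if random.random() < lam * dt)

counts = [one_count() for _ in range(1000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(round(mean, 2), round(var, 2))  # both should be close to lam * t = 2
```

Across many replicates the sample mean and sample variance both land near λt = 2, which is exactly what a Poisson count should do.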

Property 3: Independence: If an event occurs within one time subinterval, it has no bearing on the probability of an event in the next subinterval.

The probability of k events occurring in a time interval t for a Poisson random variable with parameter λ (the expected number of events per unit of time) is

f(k) = Pr(X = k) = e^(−μ) μ^k / k!   for k = 0, 1, 2, ...

where μ = λt and e is approximately 2.71828.

So the Poisson distribution depends on a single parameter μ. Note that the parameter μ represents the expected number of events over the time period t (which could also be thought of as space rather than time); the period t is made up of the small subintervals Δt. μ is also called the shape parameter. It turns out that μ is both the mean and the variance of the Poisson distribution (i.e. the Poisson distribution is a one parameter distribution).

The mean and variance of the Poisson distribution are both equal to μ.

Example: Consider a bank manager who must decide how many tellers should be available during the Friday afternoon rush, when customers arrive at a rate of 5 customers per minute in a pattern described by the Poisson process. Such a manager might wish to know the probability distribution for the number of customer arrivals within a 30 minute period. The manager is trying to avoid long lines.

So we have λ = 5/minute and t = 30 minutes, which implies μ = λt = 5 × 30 = 150 customers per half hour.
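The pmf above is straightforward to compute directly, and the claim that the mean and variance both equal μ can be checked numerically; a short sketch (the function name is mine), using μ = 4:

```python
import math

def poisson_pmf(k, mu):
    """Pr(X = k) = e^(-mu) * mu^k / k! for a Poisson random variable with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

mu = 4.0
# Truncating the infinite sums at k = 100 is harmless here:
# for mu = 4 the tail beyond k = 100 is astronomically small.
mean = sum(k * poisson_pmf(k, mu) for k in range(100))
var = sum((k - mu) ** 2 * poisson_pmf(k, mu) for k in range(100))
print(round(mean, 6), round(var, 6))  # 4.0 4.0
```

For a large mean such as the bank example's μ = 150, the factorial and power terms overflow ordinary floating point, so in practice the pmf is computed on the log scale (illustrated later in these notes).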

We need to be careful not to assume that the value of λ holds over an extended duration. The Poisson process rate frequently varies with the time of day, the day of the week, the season and more. Customers surely arrive at the bank at a different rate during Friday afternoons than on Tuesday mornings. The same would hold for a store manager trying to staff his/her store for the Friday after Thanksgiving rush: the Poisson process rate would be expected to be very different that Friday than on other Fridays. So you have to pick a length of time over which the stationarity property holds.

This distribution, like the binomial distribution, is discrete, but the probability mass is located at a countably infinite number of points rather than at only a finite number of points. The Poisson distribution is also a family of distributions, but it has a single parameter μ as opposed to the two parameters (n and p) of the binomial distribution.

Note that μ cannot be negative because 1) f(k) is a probability and therefore f(k) ≥ 0, and 2) if μ were negative and k an odd number (say 3), then μ^k would be negative, forcing f(k) to be negative.

The Poisson distribution is named for its discoverer, Siméon-Denis Poisson, a French mathematician from the late 18th and early 19th centuries. It is claimed that he once said that life is only good for two things - to do mathematics and to teach it.

The Poisson distribution is particularly useful when the events you are interested in occur infrequently. As a result, it is frequently called the distribution of rare events. It has been applied in epidemiological studies over time of many forms of cancer and other rare diseases.
It has also been applied to the study of the number of entities found in a small space when a large number of these entities are spread at random over a much larger space, as for example in the study of bacterial colonies on an agar plate. There the small spaces play the role that small time intervals played above: λ is the expected number of entities per unit of space, and t is the area of the larger space.

Even though the Poisson and binomial distributions both are used with counts, the situations for their applications differ. The binomial is used when a fixed sample of size n is selected and the numbers of events and non-events are determined from these n independent trials. The Poisson is used when events occur at random in time or space and the number of these events is noted (i.e. you count the events as they happen rather than fixing the number of trials in advance).

Example: Let us assume we have a fishing company operating on the coast of New England. It is operating a search plane to find schools of salmon that are randomly located in the North Atlantic, there being on the average 1 school per 100,000 square miles of sea. On a given day, the plane can fly 1000 miles, effectively searching a lateral distance of 5 miles on either side of its path. What is the probability of finding at least one school of salmon during 3 days of searching? How many days of searching are needed before the probability of finding at least 1 school reaches 0.95?

We know that the plane can search a total of (1000)(10)(3) = 30,000 square miles in 3 days (i.e. t = 30,000 square miles) and that λ = 0.00001 schools per square mile (i.e. 1 school/100,000 square miles). Thus the expected number of schools to be found in 3 days is μ = λt = (0.00001)(30,000) = 0.3.

Pr(X ≥ 1) = 1 − Pr(X = 0) = 1 − e^(−0.3) (0.3)^0 / 0! = 1 − e^(−0.3) = 1 − 0.741 = 0.259

. di exp(-0.3)
.74081822

Now the question is, how many days do we have to search to get the probability of finding a school up from 0.259 to 0.95? The information we already have is the value for λ (0.00001), and we have set the probability we want to achieve (0.95). We are going to first find μ. As with the problem above, once we know μ we can solve the problem. So we start out the same way as we did before:

0.95 = Pr(X ≥ 1) = 1 − Pr(X = 0) = 1 − e^(−μ) μ^0 / 0! = 1 − e^(−μ)

So 0.95 = 1 − e^(−μ), or e^(−μ) = 1 − 0.95 = 0.05.

The next question now is how do we solve for μ? The answer is that we use the relationship between the natural log (ln) function and the exponential function. We'll use the fact that ln(e^(−μ)) = −μ.

e^(−μ) = 0.05 implies that ln(e^(−μ)) = ln(0.05) [just taking the natural log of both sides]

But ln(e^(−μ)) = −μ and ln(0.05) = −3 (. di ln(0.05) gives -2.9957323). So −μ = −3, which gives us (finally) μ = 3.

Then 3 = μ = λt = (0.00001)t, which implies that t = 3/0.00001 = 300,000. To achieve the desired result 300,000 square miles of ocean must be searched, and this search, at a rate of 10,000 square miles per day, would take 30 days.

Aside about the use of log and ln: Most mathematical texts use ln to denote the natural logarithm (i.e. the logarithm to base e) and log to denote the logarithm to base 10. Stata, however, uses both ln and log to denote the natural logarithm, and log10 to denote the logarithm to base 10.
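Both parts of the salmon calculation can be verified in a few lines (the variable names are mine):

```python
import math

lam = 0.00001            # schools per square mile (1 per 100,000 sq. miles)
area_3_days = 30_000     # square miles searched in 3 days
mu = lam * area_3_days   # expected schools found in 3 days = 0.3

p_at_least_one = 1 - math.exp(-mu)
print(round(p_at_least_one, 3))  # 0.259

# Part 2: choose t so that Pr(X >= 1) = 0.95, i.e. e^(-mu) = 0.05
mu_needed = -math.log(0.05)      # about 3
t_needed = mu_needed / lam       # square miles of ocean
days = t_needed / 10_000         # at 10,000 square miles per day
print(round(mu_needed, 4), round(days, 1))  # 2.9957 30.0
```

Note that without the rounding of ln(0.05) to −3, the answer is 29.96 days, which is why the notes report 30 days.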

For the Poisson distribution there is a recursive relationship between Pr(X = k) and Pr(X = k + 1). Note that if f is the density function for the Poisson distribution, then

f(k) = Pr(X = k) = e^(−μ) μ^k / k!   for k = 0, 1, 2, ...

f(k + 1) = Pr(X = k + 1) = e^(−μ) μ^(k+1) / (k + 1)!   for k = 0, 1, 2, ...

so that

f(k + 1) = [μ / (k + 1)] f(k)

Let us look at how this recursive relationship works for a Poisson distribution with μ = 4:

f(k) = e^(−4) 4^k / k!

f(0) = e^(−4) 4^0 / 0! = e^(−4)

f(1) = [4 / (0 + 1)] f(0) = 4e^(−4)

f(2) = [4 / (1 + 1)] f(1) = (4/2)(4e^(−4)) = 8e^(−4)

We can use Excel and the recursive formula above to obtain the Poisson distribution. The recursive formula allows us to avoid calculating k!
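The recursion is just as convenient in code as in Excel; a sketch (the function name is mine) that reproduces f(0), f(1), f(2) for μ = 4:

```python
import math

def poisson_pmf_table(mu, n):
    """Return [f(0), ..., f(n)] using f(k+1) = mu * f(k) / (k + 1), avoiding k!."""
    f = [math.exp(-mu)]               # f(0) = e^(-mu)
    for k in range(n):
        f.append(f[-1] * mu / (k + 1))
    return f

table = poisson_pmf_table(4, 2)
print([round(p, 6) for p in table])
# [0.018316, 0.073263, 0.146525], i.e. e^-4, 4e^-4 and 8e^-4 as computed above
```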

[Figure: Poisson distributions with various means. f(k) is plotted against k (0 to 35) for means 0.9, 5, 10 and 20; the vertical axis runs from 0 to about 0.4.]

The Poisson distribution is always skewed to the right, but as the mean increases it begins to look more like the normal distribution. The Poisson distribution is, however, a discrete rather than continuous distribution, so probability mass is located at points rather than areas. This is why the points in the above graph have not been connected.

How to install the Poisson command (named pprob): On the command line type:

findit pprob

You'll get the following window:

[Screenshot: findit results window]

Single left click on the line that starts with nbvargr.

Now single left click on the line: (click here to install)

[Screenshot: installation confirmation screen]

This screen indicates you've correctly installed the program. Note that pprob is installed as a part of the negative binomial program.

Now on the command line type:

. help pprob

help for pprob
--------------

Poisson probabilities

    pprob, mean(value) [ n(integer) graph saving(filename) ]

Description

    pprob generates poisson probabilities.

Options

    mean     the mean of the negative binomial distribution. (This is a
             typo in the help file; it should say Poisson distribution.)
    n        the number of discrete levels to be included. The default
             value of n is 10.
    graph    displays a graph of the probabilities.
    saving   creates a data file with the probabilities.

Examples

    . pprob, mean(3)
    . pprob, mean(2.3) n(12) graph

Author

    Philip B. Ender
    Statistical Computing and Consulting

    UCLA, Office of Academic Computing
    ender@ucla.edu

Example with μ = 4 and k = 0, 1, 2, ..., 10. Notice below that Stata uses λ for the mean whereas Rosner uses μ.

. pprob, m(4) n(10)

Poisson Probabilities for lambda = 4

     k      pprob        pcum
     0   0.01831564  0.01831564
     1   0.07326256  0.09157820
     2   0.14652511  0.23810332
     3   0.19536681  0.43347013
     4   0.19536681  0.62883693
     5   0.15629345  0.78513038
     6   0.10419563  0.88932604
     7   0.05954036  0.94886637
     8   0.02977018  0.97863656
     9   0.01323119  0.99186778
    10   0.00529248  0.99716026

The variable pprob = f(k). The variable pcum = F(k) = f(0) + f(1) + ... + f(k).

Note that because the number of possible events is not fixed like it is for the binomial distribution, using k = 10 (or 5) doesn't give us the whole density, and hence the cumulative distribution doesn't necessarily add to 1.

The above just prints out the values. If you want to create a data file to use later, you need to use the command

. pprob, m(4) n(10) sav(poisson4)

where poisson4 is a name I made up.
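For readers without Stata, the same table can be reproduced with the recursive formula f(k + 1) = μ f(k)/(k + 1); this is only a cross-check of pprob, not part of the notes:

```python
import math

mu, n = 4, 10
f = math.exp(-mu)        # f(0)
cum = 0.0
rows = []
for k in range(n + 1):
    cum += f
    rows.append((k, f, cum))
    f = f * mu / (k + 1)  # f(k+1) = mu * f(k) / (k + 1)

for k, p, c in rows:
    print(f"{k:2d}  {p:.8f}  {c:.8f}")
# First row: " 0  0.01831564  0.01831564", matching the Stata output
```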

The mean doesn't have to be an integer.

. pprob, m(6.3) n(15)

Poisson Probabilities for lambda = 6.3

     k      pprob        pcum
     0   0.00183630  0.00183630
     1   0.01156872  0.01340503
     2   0.03644147  0.04984649
     3   0.07652708  0.12637357
     4   0.12053016  0.24690373
     5   0.15186800  0.39877173
     6   0.15946139  0.55823314
     7   0.14351526  0.70174837
     8   0.11301827  0.81476665
     9   0.07911278  0.89387941
    10   0.04984105  0.94372052
    11   0.02854533  0.97226584
    12   0.01498630  0.98725212
    13   0.00726259  0.99451470
    14   0.00326817  0.99778289
    15   0.00137263  0.99915552

help for nbvargr
----------------

Graph variable, poisson & negative binomial probabilities

    nbvargr varname [if] [in] [, n(integer) ]

Description

    nbvargr graphs the observed proportions along with the poisson and
    negative binomial probabilities for a count type variable. The
    poisson probabilities are computed using an estimate of the poisson
    mean. The negative binomial probabilities use the same mean and an
    estimate of the overdispersion parameter.

Options

    n        the number of discrete levels to be included. The default
             value of n is 10.

Note

    nbvargr requires that nbreg and pprob be installed.

Examples

    . nbvargr days
    . nbvargr, n(15)

Author

    Philip B. Ender

Now we'll use the pprob command to get the Poisson distribution with mean = 20. Note that the output below uses lambda (λ) for what we have called mu (μ). As you read through various texts you will find both names used for the mean, so be careful that you are clear on how the terms are being used. Below I have used a very large n because I wanted to be able to show you the long tails of the distribution.

. pprob, mean(20) n(300) sav(poisson20)
(130 missing values generated)

Poisson Probabilities for lambda = 20

     k      pprob        pcum
     0   0.00000000  0.00000000
     1   0.00000004  0.00000004
     2   0.00000041  0.00000046
     3   0.00000275  0.00000320
     4   0.00001374  0.00001694
     5   0.00005496  0.00007191
     6   0.00018321  0.00025512
     7   0.00052347  0.00077859
     8   0.00130867  0.00208726
     9   0.00290815  0.00499541
    10   0.00581631  0.01081172
    11   0.01057510  0.02138682
    12   0.01762517  0.03901199
    13   0.02711565  0.06612764
    14   0.03873664  0.10486428
    15   0.05164886  0.15651314
    16   0.06456107  0.22107421
    17   0.07595420  0.29702839
    18   0.08439355  0.38142195
    19   0.08883531  0.47025728
    20   0.08883531  0.55909258
    21   0.08460506  0.64369762
    22   0.07691369  0.72061133
    23   0.06688147  0.78749281
    24   0.05573456  0.84322739
    25   0.04458765  0.88781500
    26   0.03429819  0.92211324
    27   0.02540607  0.94751930
    28   0.01814719  0.96566647
    29   0.01251530  0.97818178
    30   0.00834354  0.98652530
    31   0.00538293  0.99190825
    32   0.00336433  0.99527258
    33   0.00203899  0.99731153
    34   0.00119940  0.99851096

[k = 35-99 have been edited out]

   100   0.00000000  1.00e+00
   101   0.00000000  1.00e+00
   102   0.00000000  1.00e+00
   103   0.00e+00    1.00e+00
   104   0.00e+00    1.00e+00
   105   0.00e+00    1.00e+00
   106   0.00e+00    1.00e+00
   107   0.00e+00    1.00e+00
   108   0.00e+00    1.00e+00
   109   0.00e+00    1.00e+00

[k = 110-164 have also been edited out due to repetition]

   165   0.00e+00    1.00e+00
   166   0.00e+00    1.00e+00
   167   0.00e+00    1.00e+00
   168   0.00e+00    1.00e+00
   169   0.00e+00    1.00e+00
   170   0.00e+00    1.00e+00
   171   .           1.00e+00
   172   .           1.00e+00
   173   .           1.00e+00
   174   .           1.00e+00

[k = 175-300 have been edited out due to repetition]

Note that 171 observations have non-missing values for pprob and 130 have missing values (see the note below about missing values), for a total of 301 observations. If n = 300, we'll have 301 observations because we start at 0.

Why does the tail on the right stop? It is because at k = 171 the number becomes so small that, although it is actually positive, Stata registers it as missing. So if Stata actually kept going, there would be a very long tail to the right.
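These "missing" values are a floating-point underflow issue. A standard workaround, sketched here in Python (this is my illustration, not something pprob does), is to compute log f(k) = k ln μ − μ − ln k! and only exponentiate at the end:

```python
import math

def poisson_log_pmf(k, mu):
    # log f(k) = k*log(mu) - mu - log(k!), with log(k!) = lgamma(k + 1)
    return k * math.log(mu) - mu - math.lgamma(k + 1)

# Naive evaluation of e^-20 * 20^171 / 171! fails outright in double
# precision (171! alone overflows a float), but the log scale is tame:
lp = poisson_log_pmf(171, 20)
print(round(lp, 1))        # roughly -219.4, i.e. f(171) is near 1e-95
print(math.exp(lp) > 0)    # True: positive, just extremely small
```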

[Figure: Poisson with mean = 20; k runs from 0 to 300; Poisson probability on the vertical axis (0 to about 8.0e-02).]

Below we see what happens if we let the mean = 100. Now Stata starts filling in as missing all values with k ≥ 154. So the graph should have a long tail to the right if Stata didn't set the very small numbers to missing. It does have a long tail to the left now, so it is more symmetric looking, and between about 50 and 150 it starts to look pretty normal.

[Figure: Poisson with mean = 100; k runs from 0 to 300.]

Negative Binomial

Since the term negative binomial has been introduced, I give the definition below. The negative binomial distribution is used when the number of successes is fixed and we're interested in the number of failures before reaching the fixed number of successes. An experiment which follows a negative binomial distribution will satisfy the following requirements:

1. The experiment consists of a sequence of independent trials.
2. Each trial has two possible outcomes, S (success) or F (failure).
3. The probability of success is constant from one trial to the next.
4. The experiment continues until a total of r successes are observed, where r is fixed in advance.

So you fix not the number of trials but the number of successes. If p is the probability of a success on any given trial and X ~ NB(r, p), then

Pr(X = x) = C(x + r − 1, r − 1) p^r (1 − p)^x   for x = 0, 1, 2, ...

where C(n, k) denotes the binomial coefficient (the number of ways to choose k items from n).
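This pmf can be evaluated directly with a binomial coefficient; a sketch (the function and variable names are mine):

```python
import math

def neg_binom_pmf(x, r, p):
    """Pr(X = x): probability of x failures before the r-th success, success prob p."""
    return math.comb(x + r - 1, r - 1) * p ** r * (1 - p) ** x

# e.g. the probability of exactly 2 failures before the 3rd success when p = 0.5:
print(neg_binom_pmf(2, 3, 0.5))   # 0.1875, i.e. C(4,2) * 0.5^3 * 0.5^2

# The probabilities over all x sum to 1 (truncated here at x = 199):
print(round(sum(neg_binom_pmf(x, 3, 0.5) for x in range(200)), 6))  # 1.0
```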

Primer on Scientific Notation

Do you know this number, 300,000,000 m/sec? It's the speed of light! Do you recognize this number, 0.000 000 000 753 kg? This is the mass of a dust particle! Scientists have developed a shorter method to express very large numbers. This method is called scientific notation. Scientific notation is based on powers of the base number 10.

The number 123,000,000,000 in scientific notation is written as 1.23 x 10^11. The first number, 1.23, is called the coefficient. It must be greater than or equal to 1 and less than 10. The second number is called the base. It must always be 10 in scientific notation. The base number 10 is always written in exponent form. In the number 1.23 x 10^11, the number 11 is referred to as the exponent or power of ten.

To write a number in scientific notation:

Put the decimal after the first digit and drop the zeroes. So in an intermediate step 123,000,000,000 becomes 1.23 (dropping the 9 zeros). So for the number 123,000,000,000 the coefficient will be 1.23.

To find the exponent, count the number of places from the new decimal point to the end of the number before the zeros are removed. In 123,000,000,000 there are 11 places. Therefore we write 123,000,000,000 as 1.23 x 10^11.

Exponents are often expressed using other notations. The number 123,000,000,000 can also be written as 1.23E+11 or as 1.23 X 10^11.

For small numbers we use a similar approach. Numbers smaller than 1 will have a negative exponent. A millionth of a second is:

0.000001 sec = 1.0 x 10^-6, also written 1.0E-6 or 1.0 X 10^-6
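Most programming languages read and write this E-notation directly; for example, in Python:

```python
# Formatting large and small numbers in scientific (E) notation:
print(f"{123_000_000_000:.2e}")   # 1.23e+11
print(f"{0.000001:.1e}")          # 1.0e-06

# Parsing E-notation back into an ordinary number:
print(float("1.23E+11"))          # 123000000000.0
```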