Probability Distributions - Lecture 5

1 Introduction

There are a number of mathematical models of probability density functions that represent the behavior of physical systems. In this lecture we explore a few of these functions and their applications. All of the distributions, whether discrete or continuous, must be properly normalized, so that the area under the distribution function (summed over the individual components if discrete) equals 1.

2 Stirling's approximation

An extremely useful approximation to N!, developed last semester using asymptotic series, is Stirling's approximation:

lim_{N→∞} √(2π) N^(N+1/2) e^(−N) / N! = 1

Although the difference between N! and the approximation does not converge, the ratio converges rapidly. This is shown in Table 1.

Table 1: The convergence of Stirling's approximation to N!

N     N!          Approximation   Fractional error
1     1           0.9221          0.08
2     2           1.919           0.04
5     120         118.02          0.02
10    3,628,800   3,598,600       0.008

Stirling's approximation can be improved by adding additional terms. Thus the bounds below can be used:

√(2π) N^(N+1/2) e^(−N + 1/(12N+1)) < N! < √(2π) N^(N+1/2) e^(−N + 1/(12N))
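As a quick numerical check of the convergence shown in Table 1, the ratio of N! to the approximation can be evaluated directly; a minimal Python sketch (function and variable names are illustrative, not part of the lecture):

    import math

    def stirling(n):
        # basic Stirling approximation: sqrt(2*pi) * n^(n+1/2) * exp(-n)
        return math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n)

    for n in (1, 2, 5, 10, 20):
        exact = math.factorial(n)
        approx = stirling(n)
        # fractional error, the last column of Table 1
        print(n, exact, round(approx, 3), round(exact / approx - 1, 4))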

3 Probability distributions

3.1 Bernoulli trials

Repeated independent trials that have 2 outcomes, success (probability p) or failure (probability q), are called Bernoulli trials. From unitarity, p + q = 1. A particular string of n independent trials having k successes might look like

Prob = p p q p q ··· q p q q = p^k q^(n−k)

In most cases we are interested in the total number of successes or failures and not the order in which they occur. Suppose we want the number of successes out of n trials without caring about the ordering. There are

Possibilities = C(n, k) = n! / [k!(n−k)!]

such orderings, each with probability p^k q^(n−k). Thus the probability we seek is

P_B(k, n) = C(n, k) p^k q^(n−k)

This is the binomial distribution, giving the probability of k successes out of n trials. The distribution is obviously discrete. The probabilities p, q cannot change during the trials, so either the initial population is very large or sampling is done with replacement. Examples of binomial distributions are shown in Figure 1.

The distribution can easily be seen to take the form of a binomial coefficient by considering the probability, P_x, that an event occurs x times. Write

(p + q)^z = p^z + (z/1!) p^(z−1) q + [z(z−1)/2!] p^(z−2) q^2 + ··· + q^z

(p + q)^z = P_z + P_(z−1) + ··· + P_0 = 1

An individual term P_x represents the probability that out of z trials, x successes and z − x failures occur. Thus

P_x = [z!/(x!(z−x)!)] p^x q^(z−x)

The expectation value of the binomial distribution is E(k) = np with variance σ^2 = np(1−p). It has skewness µ_3 = np(1−p)(1−2p) = σ^2(1−2p) and kurtosis µ_4 = np(1−p)(1−6p+6p^2) = σ^2(1−6p+6p^2).
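A minimal sketch of the binomial probability P_B(k, n) and a check of its mean and variance, using only the Python standard library (names and values are illustrative):

    from math import comb

    def binom_pmf(k, n, p):
        # P_B(k, n) = C(n, k) p^k (1-p)^(n-k)
        return comb(n, k) * p**k * (1 - p)**(n - k)

    n, p = 12, 1/3
    probs = [binom_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * P for k, P in enumerate(probs))              # n*p = 4
    var = sum((k - mean)**2 * P for k, P in enumerate(probs))   # n*p*(1-p) = 8/3
    print(sum(probs), mean, var)                                # normalization, mean, variance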

Figure 1: Examples of binomial distributions for various parameters

3.1.1 Example

Toss a coin 12 times with probability of success 1/2. What is the probability of having 6 successes?

P_B(6, 12) = [12!/(6! 6!)] (1/2)^6 (1/2)^6 = 0.23

Only if the number of coin tosses is large will the observed frequency of successes approach 1/2. However, p and q do not have to be 1/2 and may take any values 0 ≤ p, q ≤ 1 with p + q = 1, provided p and q remain constant. The Bernoulli model can be applied to any stochastic process where one is interested in either success or failure. Of course the model is only as good as its representation of empirical data. One might believe that after a long string of coin tosses resulting in heads, the next toss is more likely to be tails; but if this were true, the process would contain a memory of its past and so it would not be random. The number of successes S_n follows a binomial distribution. In the case of p = 1/2 this does not mean that success will occur half the time in any particular sequence: there is no tendency for the lengths of strings of successes (or failures) to equalize, only that the frequency of the lead is approximately even; see Figure 2.
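The behavior sketched in Figure 2 can be mimicked by simulating a single sequence of fair coin tosses and tracking the running difference between successes and failures; a minimal sketch (the seed and length are arbitrary):

    import random

    random.seed(1)
    lead = 0
    path = []
    for _ in range(10_000):
        lead += 1 if random.random() < 0.5 else -1
        path.append(lead)
    # the running difference typically wanders far from zero, even though
    # the fractional difference lead/N tends to zero
    print(path[-1], max(path), min(path), path[-1] / 10_000)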

Figure 2: A possible example of a sequence of coin tosses with p = 1/2 (success−failure difference vs. number of throws, averaging to 0)

3.1.2 Example

Suppose one throws 12 dice and counts 1 or 2 as a success. Then p = 1/6 + 1/6 = 1/3. Table 2 gives the number of successes observed in 26,306 throws of the 12 dice. The result looks reasonable, but it is statistically bad: a fit would suggest this result occurring only once in 10,000 tries. A biased probability of p = 0.3377 fits the data. Note that the mean is µ = pn = (1/3)(12) = 4 and the variance is σ^2 = np(1−p) = 8/3 = 2.667.

As another example, we look at the number of photomultiplier tubes, k, which have a signal within a given time, if the probability of any one tube having a signal is p. The total number of tubes is N. This is represented by the binomial distribution

P(k, N) = C(N, k) p^k q^(N−k)

For a practical example, choose N = 342 and p = 0.1. The distribution for various values of k is shown in Figure 3. Note that the figure shows the distribution as continuous, but it is defined only for discrete values of k. Calculation of the distribution is not as easy as one might suppose, due to the large and small numbers involved. A calculator will overflow or underflow when calculating N! and/or p^k, so the calculation must be handled carefully.
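One robust way to evaluate the distribution for large N without overflow is to work entirely with logarithms, e.g. via the log-gamma function; this sketch illustrates the idea (it is not necessarily the method used to produce Figure 3):

    from math import lgamma, log, exp

    def log_binom_pmf(k, N, p):
        # ln P(k, N) = ln C(N, k) + k ln p + (N - k) ln(1 - p)
        log_comb = lgamma(N + 1) - lgamma(k + 1) - lgamma(N - k + 1)
        return log_comb + k * log(p) + (N - k) * log(1 - p)

    N, p = 342, 0.1
    for k in (10, 20, 34, 50, 80):
        print(k, exp(log_binom_pmf(k, N, p)))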

Table 2: Probabilities for 26,306 throws of 12 dice, compared with the binomial distribution for p = 1/3 and for the fitted value p = 0.3377

Event Number (k)   P(k; 12, 1/3)   Observed   P(k; 12, 0.3377)
0                  0.007707        0.007033   0.007123
1                  0.046244        0.043678   0.043584
2                  0.127171        0.124116   0.122225
3                  0.211952        0.208127   0.207736
4                  0.238446        0.232418   0.238342
5                  0.190757        0.197445   0.194429
6                  0.111275        0.116589   0.115660
7                  0.047689        0.050597   0.050549
8                  0.014903        0.015322   0.016108
9                  0.003312        0.003991   0.003650
10                 0.000497        0.000532   0.000558
11                 0.000045        0.000152   0.000052
12                 0.000002        0.000000   0.000002

One way to proceed for very large factorials is to use Stirling's approximation and write the distribution in terms of logarithms. Thus

N! ≈ √(2π) N^(N+1/2) e^(−N)

Substitute this into the binomial distribution, write the remaining terms as logarithms, and collect terms. The result is

P(k, N) ≈ [1/√(2π k(N−k)/N)] e^φ,   φ = k ln(Np/k) + (N−k) ln(Nq/(N−k))

Note that the average value (i.e. the peak of the distribution) is µ = pN ≈ 34 and the standard deviation is σ = √(Np(1−p)) ≈ 5.5. The figure on the right looks like a normal distribution; this will be discussed below.

Figure 3: The figure on the left is the binomial distribution of k successes out of 342 samples, plotted on a log scale. The figure on the right is the same but plotted on a linear scale.

3.2 Poisson distribution

Consider a set of Bernoulli trials in which n → ∞ while p is small, with λ = np held constant as n grows. Taking the appropriate limits we obtain

P_P(k, λ) = (λ^k / k!) e^(−λ)

This is the Poisson distribution, a limiting form of the binomial distribution. To see this, consider the probability of zero successes,

P(0; n, p) = C(n, 0) p^0 q^n = (1 − p)^n

ln[P(0; n, p)] = n ln(1 − p) = n ln(1 − λ/n)

Then apply a Taylor expansion,

ln[P] = −n(λ/n + λ^2/(2n^2) + ···)

Thus as n → ∞, ln[P] → −λ, so

P(0; n, p) → e^(−λ)

Now consider the ratio

P(k; n, p) / P(k−1; n, p) = [λ − (k−1)p] / [k(1 − p)]

Since p = λ/n, as n → ∞ we have p → 0, and

P(k; n, p) / P(k−1; n, p) → λ/k

Begin with k = 1:

P(1; n, p) → λ P(0; n, p) = λ e^(−λ)

For k = 2:

P(2; n, p) → (λ/2) P(1; n, p) = (λ^2/2) e^(−λ)

By induction,

P(k; n, p) → (λ^k / k!) e^(−λ)
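The limiting procedure can be checked numerically by holding λ = np fixed while n grows; a brief sketch with illustrative values λ = 2, k = 3:

    from math import comb, exp, factorial

    lam, k = 2.0, 3
    poisson = lam**k * exp(-lam) / factorial(k)
    for n in (10, 100, 1000, 10000):
        p = lam / n
        binom = comb(n, k) * p**k * (1 - p)**(n - k)
        # the binomial value approaches the Poisson value as n grows
        print(n, binom, poisson)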

In another development, suppose we perform Bernoulli trials with success probability λ/n, where the number of trials is the integer nearest t/(1/n) = nt. This represents sub-division of an interval of length t into n divisions. The expected number of successes is then (nt)(λ/n) = λt, so in the limit n → ∞ the previous result applies with λ replaced by λt, and we find that

P(k; λt) = e^(−λt) (λt)^k / k!

is the probability of finding exactly k samples in a fixed interval of length t. The probability of no sample is P(0; λt) = e^(−λt). The parameter λ determines the density of samples along the t axis. However, the Poisson distribution could also be derived directly, without using an approximation to the binomial distribution; this will be shown later. Examples of the Poisson distribution are shown in Figure 4.

Figure 4: Examples of the Poisson distribution for various parameters

To show the normalization, note that

Σ_{k=0}^∞ P_k = e^(−λ) Σ_{k=0}^∞ λ^k/k! = e^(−λ) e^λ = 1

The Poisson distribution has a single parameter, λ, obtained from the binomial distribution in the limit p → 0 with np = λ fixed. Again Stirling's approximation will be useful. Considering the ratio of successive binomial terms,

P(k; n, p) / P(k−1; n, p) = 1 + [(n+1)p − k] / (kq)

Table 3: Probability of events for an average rate of 3 events/run

Event Number (k)   Probability for k   Probability for ≤ k   Probability for > k
0                  0.0498              0.0498                0.9502
1                  0.1494              0.1992                0.8008
2                  0.2240              0.4232                0.5768
3                  0.2240              0.6472                0.3528
4                  0.1680              0.8152                0.1848
5                  0.1008              0.9160                0.0840
6                  0.0504              0.9664                0.0336
7                  0.0216              0.9880                0.0120
8                  0.0081              0.9961                0.0039

Table 4: The Poisson probability as a function of a cut on detected photo-electrons in 16 µs for λ = 0.0528

Event Number (k)   Probability for k   Probability for ≤ k   Probability for > k
0                  0.948               0.948                 0.052
1                  0.0501              0.998                 1.9 × 10^−3
2                  1.25 × 10^−3        0.9994                6.5 × 10^−4

The ratio is greater than 1 as long as k < (n+1)p, so the most probable number of successes is approximately (n+1)p. We expect that if S_n is the number of successes in n trials, the average value S_n/n should be near p. Thus as n → ∞ the fraction of successes deviates from p only by a small amount; this is the law of large numbers.

For the Poisson distribution, defined by

P(k; λ) = (λ^k / k!) e^(−λ)

the expectation value is λ, the variance is λ, the skewness is 1/√λ, and the kurtosis is 1/λ.

As an example, suppose we propose to observe a nuclear state. We have data from 3 preliminary runs yielding 1, 5, and 3 events, and we need to estimate how many runs are required to observe 15 events. The average number of events per run is λ = 3, from which Table 3 is filled.

As another example, we take the average dark-current rate in a PMT to be 3300 Hz, so that in 16 µs the expected number of counts is λ = (3300 Hz)(16 µs) = 0.0528. Applying Poisson statistics gives the probability of a background count in 16 µs; the result is shown in Table 4.
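Tables 3 and 4 can be regenerated directly from the Poisson formula; a short sketch covering both the λ = 3 events/run case and the λ = 0.0528 dark-count case:

    from math import exp, factorial

    def poisson_pmf(k, lam):
        return lam**k * exp(-lam) / factorial(k)

    for lam in (3.0, 0.0528):
        cumulative = 0.0
        for k in range(0, 9):
            P = poisson_pmf(k, lam)
            cumulative += P
            # probability for k, for <= k, and for > k
            print(lam, k, round(P, 4), round(cumulative, 4), round(1 - cumulative, 4))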

Table 5: A comparison between the binomial and Poisson distributions for the example in the text

Event Number (k)   P(k; 100, 1/100)   P(k; 1)    N_k   N_k/100
0                  0.366032           0.367879   41    0.41
1                  0.369730           0.367879   34    0.34
2                  0.184865           0.183940   16    0.16
3                  0.060999           0.061313   8     0.08
4                  0.014942           0.015328   0     0.00
5                  0.002898           0.003066   1     0.01
6                  0.000463           0.000511   0     0.00
7                  0.000063           0.000073   0     0.00
8                  0.000007           0.000009   0     0.00
9                  0.000001           0.000001   0     0.00

Table 5 compares the binomial and Poisson probability distributions with the observed frequencies N_k, the number of sets, out of 100 sets of 100 random-number pairs, in which the pair (7,7) appears k times.

3.2.1 Radioactive decay

Suppose λ is the average decay rate of a radioactive nuclide, so that the average number of decays in a time t is λt. If the interval dt is small, then λ dt is the probability of a decay in dt, and the probability of no decay in dt is (1 − λ dt). For x decays to have occurred by time t + dt,

P_x(t + dt) = P_1(dt) P_(x−1)(t) + [1 − P_1(dt)] P_x(t)

The first term on the right is the probability of one decay in dt multiplied by the probability of (x−1) decays in t. The second term is the probability of no decay in dt and x decays in t. Thus

P_x(t + dt) = λ dt P_(x−1)(t) + (1 − λ dt) P_x(t)

In the limit as dt → 0,

dP_x/dt = λ[P_(x−1)(t) − P_x(t)]

The above has the solution

P_x = [(λt)^x / x!] e^(−λt)
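The recursion dP_x/dt = λ[P_(x−1)(t) − P_x(t)] can be integrated numerically and compared with the closed-form solution; a rough sketch using simple Euler steps and illustrative parameters (λ = 2, t = 1):

    from math import exp, factorial

    lam, t_max, dt, x_max = 2.0, 1.0, 1e-4, 10
    P = [1.0] + [0.0] * x_max            # P_0(0) = 1, higher states empty
    for _ in range(int(t_max / dt)):
        new = P[:]
        for x in range(x_max + 1):
            below = P[x - 1] if x > 0 else 0.0
            new[x] = P[x] + lam * dt * (below - P[x])
        P = new
    for x in range(5):
        exact = (lam * t_max)**x * exp(-lam * t_max) / factorial(x)
        print(x, round(P[x], 5), round(exact, 5))   # numerical vs Poisson solution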

3.3 Normal distribution

The normal distribution is an analytic approximation to the binomial distribution as n → ∞, with mean λ = np. Starting from the binomial distribution, let n → ∞ keeping p fixed, with δ/n → 0, where

δ = x − np   (deviation from the mean),   n − x = nq − δ

Use Stirling's approximation, N! ≈ √(2π) N^(N+1/2) e^(−N), substitute into the binomial distribution, and re-arrange terms. The result is

dP_N(x)/dx = [1/(√(2π) σ)] e^(−(x−λ)^2/(2σ^2))

An example of the normal distribution is shown in Figure 5. Here λ is the mean value (expectation value), σ^2 is the variance, and both the skewness and kurtosis are zero. Integration over −∞ < x < ∞ gives P_N = 1, and the probability of lying within one standard deviation of the mean is

[1/(√(2π) σ)] ∫_{λ−σ}^{λ+σ} dx e^(−(x−λ)^2/(2σ^2)) = erf(1/√2) = 0.682

In the above, erf(x) is the error function evaluated at x. An example of the error function is shown in Figure 6.

The normal distribution is the most important distribution for statistical analysis and is usually written as N(µ, σ^2), with mean (expectation value) E(x) = µ and variance σ^2; the skewness and kurtosis vanish. Note that σ is not the half-width at half-height, which is 1.176σ. The distribution extends equally above and below the mean, so if the mean is not large there is appreciable probability assigned to negative values of the variable, which is of course non-physical and means that the normal approximation to the binomial or Poisson distributions is invalid there.

The probability content of various intervals of the variable z = (x − µ)/σ is

P(−1.64 < z < 1.64) = 0.90
P(−1.96 < z < 1.96) = 0.95
P(−2.58 < z < 2.58) = 0.99
P(−3.29 < z < 3.29) = 0.999
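The quoted interval probabilities follow directly from the error function; a minimal check using math.erf:

    from math import erf, sqrt

    def prob_within(z):
        # P(-z < (x - mu)/sigma < z) for a normal distribution
        return erf(z / sqrt(2))

    for z in (1.0, 1.64, 1.96, 2.58, 3.29):
        print(z, round(prob_within(z), 4))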

Figure 5: An example of a normal distribution showing the defining parameters

Figure 6: Graphs of the error integral for various parameters

As an example, suppose a sample of 100 events with p = 0.3. This is a relatively small sample, but it is used here to note deviations from the binomial distribution. This is illustrated in Table 6, where S_n represents the number of successes as one moves away from the expected value of 30.

4 Error integral

erf(x) = (2/√π) ∫_0^x dt e^(−t^2)

erfc(x) = 1 − erf(x)

In terms of the error function, the probability that a number y drawn from a normal (Gaussian) distribution is less than x is

Prob(y < x | Gaussian) = (1/2)[1 + erf((x − µ)/(√2 σ))]

5 Relationship between the distributions

The binomial distribution (with p the probability of success and n the number of events) goes over to the Poisson distribution as p → 0 and n → ∞ with the average np → λ, and to the Gaussian as n → ∞; the Poisson distribution in turn approaches the Gaussian as λ → ∞.

Figure 7: The connection between the binomial, Poisson, and normal distributions
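Table 6 below can be reproduced by summing binomial terms over each range of S_n and comparing with the normal integral over the same range (µ = np = 30, σ^2 = npq = 21). The sketch below assumes a continuity-corrected normal integral, which matches the quoted values closely:

    from math import comb, erf, sqrt

    n, p = 100, 0.3
    mu, sigma = n * p, sqrt(n * p * (1 - p))

    def normal_cdf(x):
        return 0.5 * (1 + erf((x - mu) / (sqrt(2) * sigma)))

    ranges = [(9, 11), (12, 14), (15, 17), (18, 20), (21, 23), (24, 26), (27, 29),
              (31, 33), (34, 36), (37, 39), (40, 42), (43, 45), (46, 48), (49, 51)]
    for lo, hi in ranges:
        binom = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))
        normal = normal_cdf(hi + 0.5) - normal_cdf(lo - 0.5)   # continuity correction
        print(f"{lo}-{hi}", round(binom, 5), round(normal, 5))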

Table 6: A comparison between the binomial and normal distributions for N = 100 and p = 0.3

Success range      Binomial   Normal    Percent error
 9 ≤ S_n ≤ 11      0.00000    0.00003   +400
12 ≤ S_n ≤ 14      0.00015    0.00033   +100
15 ≤ S_n ≤ 17      0.00201    0.00283   +40
18 ≤ S_n ≤ 20      0.01430    0.01599   +12
21 ≤ S_n ≤ 23      0.05907    0.05895   0
24 ≤ S_n ≤ 26      0.14887    0.14447   −3
27 ≤ S_n ≤ 29      0.23794    0.23405   −2
31 ≤ S_n ≤ 33      0.23013    0.23405   +2
34 ≤ S_n ≤ 36      0.14086    0.14447   +3
37 ≤ S_n ≤ 39      0.05889    0.05895   0
40 ≤ S_n ≤ 42      0.01702    0.01599   −6
43 ≤ S_n ≤ 45      0.00343    0.00283   −18
46 ≤ S_n ≤ 48      0.00049    0.00033   −33
49 ≤ S_n ≤ 51      0.00005    0.00003   −40

The relationship of the various distributions to the binomial distribution is shown in Figure 7. The error made in using the normal distribution is small if npq is large. If n is large and p is small, the Poisson distribution is more appropriate; if λ is large, either the Poisson or the normal distribution is appropriate.

The mean value is the arithmetic average of a data set and is identified with the expectation value,

m = E(X) = (1/N) Σ_{i=1}^{N} X_i

The breadth of the statistical fluctuation about the mean is measured by the standard deviation,

σ^2 = (1/N) Σ_{i=1}^{N} (X_i − m)^2

For the normal approximation to the binomial, σ^2 = npq; for the Poisson distribution, σ^2 = λ = m. Of course neither m nor λ can be determined exactly from a finite data set. The distribution of the mean values is more nearly normal than the parent distribution.

As another example, consider a comparison of the Poisson distribution to the normal distribution. Choose N = 10^8 and p = 10^−6, so that np ≈ npq = 100. The Poisson distribution then agrees with the binomial distribution P_B(k; 10^8, 10^−6). This is compared to the normal distribution in Table 7.
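Table 7 below can be checked in the same way, summing Poisson terms with λ = 100 and comparing with a continuity-corrected normal integral with µ = 100, σ = 10 (the binning is assumed to be inclusive):

    from math import exp, factorial, erf, sqrt

    mu = 100.0
    sigma = sqrt(mu)

    def poisson_sum(lo, hi):
        return sum(mu**k * exp(-mu) / factorial(k) for k in range(lo, hi + 1))

    def normal_cdf(x):
        return 0.5 * (1 + erf((x - mu) / (sqrt(2) * sigma)))

    for lo, hi in [(85, 90), (90, 95), (95, 105), (90, 110), (110, 115), (115, 120)]:
        poisson = poisson_sum(lo, hi)
        normal = normal_cdf(hi + 0.5) - normal_cdf(lo - 0.5)
        print(f"P({lo},{hi})", round(poisson, 5), round(normal, 5))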

Table 7: A comparison between the Poisson distribution P_P(k; 100) and the normal distribution P_N(µ, σ)

Success range   Poisson    Normal
P(85, 90)       0.11384    0.11049
P(90, 95)       0.18485    0.17950
P(95, 105)      0.41763    0.41680
P(90, 110)      0.70652    0.70628
P(110, 115)     0.10738    0.11049
P(115, 120)     0.05323    0.05335

As a final example, suppose a telephone exchange serves 2000 phones and must connect them to another exchange. Obviously one does not run 2000 trunk lines; instead the number of lines, N, is chosen so that with high probability a call will find a free line. Suppose we allow only 1% of calls to be lost, and each call lasts 2 minutes, so that the probability a given phone requires a trunk line at any moment is set at p = 1/30. Assuming the lines are requested at random, there are then 2000 Bernoulli trials with p = 1/30, and the number of lines N is chosen such that the probability of more than N simultaneous calls is less than 0.01. For a Poisson approximation take λ = 2000/30 = 66.67; for N = 87 the probability of more than N calls is 0.0097. For the normal approximation, requiring (N − 1/2 − np)/√(npq) ≈ 2.33 (the 1% point of the normal distribution) gives N ≈ 85.8.

6 Statistical distributions

In statistical mechanics we divide phase space into a large number of cells, N, and place k particles in the cells, with N > k. In this way the state of a system is defined by a random distribution of the k particles over the N cells. Assume all N^k arrangements have equal probability; this results in the Maxwell-Boltzmann distribution. The particles can be distributed so that the N cells contain L_1, L_2, …, L_N particles, where Σ_{i=1}^{N} L_i = k. There are k! ways to order the k particles and L_i! ways to reorder the particles within the i-th cell, so the number of distinct arrangements is

N_P = k! / (L_1! L_2! ··· L_N!)

For equal probability of all arrangements, the probability that the N cells contain L_1, L_2, …, L_N indistinguishable particles is

P_M = N_P / N^k
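The probability P_M of a specific occupancy (L_1, …, L_N) can be evaluated directly from the formula above; a small sketch with an illustrative occupancy (two particles in one cell and one in another, for N = 5 cells and k = 3 particles):

    from math import factorial, prod

    def maxwell_boltzmann_prob(occupancy):
        # occupancy[i] = number of particles in cell i
        N = len(occupancy)
        k = sum(occupancy)
        ways = factorial(k) // prod(factorial(L) for L in occupancy)
        return ways / N**k

    print(maxwell_boltzmann_prob([2, 1, 0, 0, 0]))   # 3!/(2!1!) / 5^3 = 3/125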

7 Example

Consider 6 particles among which 9 units of energy are distributed, as illustrated in Figure 8. There are 26 possible distributions.

Figure 8: Examples of the distribution of 6 particles over 10 energy states

The 3 distributions shown in the figure have different numbers of statistical possibilities as calculated by Maxwell-Boltzmann statistics, but in each case the energies of all the particles add up to the same total energy. The number of possible statistical arrangements is written above each distribution for the Maxwell-Boltzmann and Bose-Einstein cases. Perhaps it is easiest to find the total number of possible energy distributions by tabulation: there are 26 such possibilities, treating rearrangements within each energy level as indistinguishable. There are N^k total ways to distribute the particles, so the Maxwell-Boltzmann probability of a given distribution is

P_MB = [k! / (L_1! L_2! ··· L_N!)] / N^k

The average occupancy of an energy state is the sum over the 26 distributions of the number of particles in that state, divided by 26.
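The count of 26 distributions quoted above can be checked by enumerating the ways 9 indistinguishable units of energy can be divided among at most 6 particles (partitions of 9 into at most 6 parts); a short recursive sketch:

    def partitions(total, max_parts, max_size=None):
        # number of ways to write `total` as at most `max_parts` non-increasing parts
        if max_size is None:
            max_size = total
        if total == 0:
            return 1
        if max_parts == 0:
            return 0
        return sum(partitions(total - first, max_parts - 1, first)
                   for first in range(1, min(total, max_size) + 1))

    print(partitions(9, 6))   # 26 distributions of 9 energy units among 6 particles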

On the other hand, if the particles are indistinguishable and described by Bose-Einstein statistics, all the distributions have equal probability. Thus not all N^k arrangements are equally probable. In the example above, each macro-arrangement is equally probable; it is assigned a probability given by the inverse of the number of representations,

P_BE = C(N + k − 1, k)^(−1)

Low-energy states are more probable in Bose-Einstein than in Maxwell-Boltzmann statistics. To obtain the distribution of particles as a function of energy, the average population of each energy state must be evaluated. In Fermi-Dirac statistics only one particle can occupy a state, so k ≤ N and

P_FD = C(N, k)^(−1)

In this case a statistical sample is obtained by specifying which k of the N cells contain a particle. Statistical mechanics finds the most probable number densities for a given total energy, derived from these counting expressions. Figure 9 compares the Maxwell-Boltzmann distribution to the Bose-Einstein distribution. The resulting average occupancies are

ρ_MB = 1 / e^((ε−µ)/kT)            (Maxwell-Boltzmann)
ρ_BE = 1 / [e^((ε−µ)/kT) − 1]      (Bose-Einstein)
ρ_FD = 1 / [e^((ε−µ)/kT) + 1]      (Fermi-Dirac)

8 Example

Consider the distribution shown in the diagram, with N = 5 cells and k = 3 particles.

Bose-Einstein statistics:

C(N + k − 1, k) = C(7, 3) = 35,   P_BE = 1/35 = 0.0286

Figure 9: A comparison of the Maxwell-Boltzmann and Bose-Einstein distributions

Fermi-Dirac statistics:

C(N, k) = C(5, 3) = 10,   P_FD = 1/10 = 0.100

The 10 possible arrangements (the choices of which 3 of the 5 cells are occupied) can be drawn out explicitly.

Maxwell-Boltzmann statistics: the total number of arrangements is N^k = 5^3 = 125, each with equal probability 1/125, and a state with all three particles in different cells corresponds to k! = 6 of these, so

P_MB = k! / N^k = 0.048

For the state illustrated above, the k! = 6 indistinguishable orderings are the permutations (1 2 3), (2 3 1), (3 1 2), (2 1 3), (1 3 2), (3 2 1) of the three labeled particles among the occupied cells.
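The three probabilities quoted for N = 5 cells and k = 3 particles follow directly from the counting formulas; a compact check:

    from math import comb, factorial

    N, k = 5, 3
    P_BE = 1 / comb(N + k - 1, k)     # 1/C(7,3) = 1/35
    P_FD = 1 / comb(N, k)             # 1/C(5,3) = 1/10
    P_MB = factorial(k) / N**k        # 3!/5^3 = 0.048
    print(P_BE, P_FD, P_MB)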

9 Interval distribution

The interval distribution uses the Poisson distribution to represent the waiting time between successive events of a random process; we have already used it to represent radioactive decay. If the mean rate of events is λ, the probability of no event in a time t is

P_0 = [(λt)^0 / 0!] e^(−λt) = e^(−λt)

and the probability of k events in t is

P_k = [(λt)^k / k!] e^(−λt)

9.1 Example

Consider 2 models of a set of coin tosses. In model (1) the probability p is unknown, and in model (2) p = 1/2. The likelihoods are obtained from the binomial distribution,

P(k; N, p) = C(N, k) p^k (1 − p)^(N−k)

This is written as Prob(D | p, M), the conditional probability of observing the data D given the model M and the probability p. In the case of model (2) this is

P(k; N, 1/2) = C(N, k) (1/2)^N

To compare the models we want to invert the probabilities; that is, we want Prob(p | D, M). Thus we use Bayes' theorem,

Prob(p | D, M) = Prob(D | p, M) Prob(p | M) / Prob(D | M)

We do not know the initial (prior) probability for model (1), Prob(p | M_1). Suppose we choose it by assuming

P(p | M_1) = [(2n + 1)! / (n!)^2] p^n (1 − p)^n

In the above, n = 0 represents no information and n = 10 represents incomplete but informed information. The first case gives a flat prior; the second gives a prior peaked at a probability of 1/2 with some spread in the distribution.
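The inversion described above can be sketched numerically: evaluate the binomial likelihood and each candidate prior on a grid of p values and normalize. The data (N = 40 tosses, k = 26 heads) and the grid are purely illustrative:

    from math import comb, factorial

    N_tosses, k_heads = 40, 26
    grid = [i / 200 for i in range(1, 200)]   # grid of p values in (0, 1)

    def prior(p, n):
        # (2n+1)!/(n!)^2 p^n (1-p)^n ; n = 0 is flat, n = 10 is peaked at 1/2
        return factorial(2 * n + 1) / factorial(n)**2 * p**n * (1 - p)**n

    def likelihood(p):
        return comb(N_tosses, k_heads) * p**k_heads * (1 - p)**(N_tosses - k_heads)

    for n in (0, 10):
        post = [likelihood(p) * prior(p, n) for p in grid]
        norm = sum(post)
        post = [w / norm for w in post]
        mean_p = sum(p * w for p, w in zip(grid, post))
        print(n, round(mean_p, 3))   # posterior mean of p for each prior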

The inverted (posterior) probability is then

Prob(p | D, M_1) = C_N p^(k+n) (1 − p)^(N−k+n) / Prob(D | M_1)

Of course this may be iterated, using the posterior Prob(p | D, M_1) as the next prior; the probability is then adjusted after the result of each coin toss. To compare the models we form the ratio

R = Prob(M_1 | D) / Prob(M_2 | D)

Substituting Bayes' theorem into the above gives the result. For a large number of tries this can be evaluated analytically in approximation,

R = [(2n + 1)! (k + n)! (N − k + n)! / ((n!)^2 (N + 2n + 1)!)] 2^N

Note that this becomes independent of n for large N and k. Figure 10 shows the convergence of the posterior probability; the result is independent of the initial (prior) probability, whichever model is used.

10 Chi-square distribution

The chi-squared distribution is important in statistics, as it is used for testing the goodness of fit of a theory to data and for comparing theories. Let X_i be a set of N normally distributed variables with means µ_i and variances σ_i^2. Write the standardized variate Z_i in the form

Z_i = (X_i − µ_i)/σ_i

and denote χ^2 = Σ_{i=1}^{N} Z_i^2. Then the probability density, written as a function of χ^2/2, becomes

P(χ^2/2) = (χ^2/2)^(N/2 − 1) e^(−χ^2/2) / Γ(N/2)

Here N is the number of degrees of freedom: the number of normal variates whose squares are added to produce the χ^2. Examples of χ^2 distributions are shown in Figure 11. The shape of the curve depends on N. The expectation value of χ^2 is N, the variance is 2N, the skewness is 2(2/N)^(1/2), and the kurtosis is 12/N. Asymptotically, χ^2(N) becomes a normal distribution. In statistical applications we are usually interested in the shaded area to the right of a particular value of χ^2, as shown in Figure 11. This must be obtained numerically or from tables, for example the table shown in Figure 12. For large N the normalized chi-square χ^2/N is approximately normally distributed about 1 with variance 2/N.
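The tail area to the right of a given χ^2 can be computed numerically rather than read from a table; a sketch assuming SciPy is available (scipy.stats.chi2 is simply one convenient tool, not something used in the lecture):

    from scipy.stats import chi2

    # probability of exceeding chi^2 = 7 with N = 6 degrees of freedom,
    # the shaded area indicated in Figure 11
    N, x = 6, 7.0
    print(chi2.sf(x, df=N))                   # survival function = 1 - CDF
    print(chi2.mean(df=N), chi2.var(df=N))    # expectation N and variance 2N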

Figure 10: A series of figures showing the convergence of the posterior probability based on a model and data. The result as n → ∞ is independent of the initial prior.

11 Exponential distribution

The exponential distribution has a probability density function of the form

P(X) = λ e^(−λX)

The expectation value is 1/λ, the variance is (1/λ)^2, the skewness is 2, and the kurtosis is 6.
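The quoted moments can be checked by sampling exponential waiting times; a minimal sketch with an arbitrary rate and sample size:

    import random

    random.seed(4)
    lam, n = 0.25, 200_000
    samples = [random.expovariate(lam) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((x - mean)**2 for x in samples) / n
    print(mean, 1 / lam)        # sample mean vs 1/lambda
    print(var, 1 / lam**2)      # sample variance vs 1/lambda^2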

Figure 11: The figuure on the left shows χ 2 plots for various degrees of freedom, N, vs the number of random variables, X. The figure on the right shows the χ 2 probability for N = 6 as a function of X, indicating the area for X > 7 which represents the error in a fit. is 6. The exponential distribution has no memory. If and event occurs up to time y, the probability of no event in a subsequent time is independent of y. The distribution applies to particles arriving at a counter. We obtained this distribution previously when considering the Poisson distribution. The probability of N events occurring in a time interval t is; P(N t) = (λt)n N! e λt The probability of no events is therefore, e λt. 12 Breit Wigner Distribution This distribution is identical to the Cauchy distribution, and is important as an approximation to a resolution function. This distribution is a pathalogical case as it does not have an expectation value, variance, Skew, or Kurtosis unless the distribution has a cutoff, since the integrals defining these properties do not converge. The distribution takes the form; P(X) = (1/π)[ Γ Γ 2 + (X X 0 ) 2 22

In the above, X_0 and Γ represent the location and scale parameters respectively, Γ being the half-width at half-height.
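Because the defining integrals do not converge, the running mean of samples drawn from a Breit-Wigner (Cauchy) distribution does not settle down as the sample size grows; a quick illustration using inverse-CDF sampling (the seed and sample sizes are arbitrary):

    import math, random

    random.seed(3)
    x0, gamma = 0.0, 1.0

    def breit_wigner_sample():
        # inverse-CDF sampling: x = x0 + gamma * tan(pi * (u - 1/2))
        return x0 + gamma * math.tan(math.pi * (random.random() - 0.5))

    for n in (100, 10_000, 1_000_000):
        mean = sum(breit_wigner_sample() for _ in range(n)) / n
        print(n, mean)   # the sample mean does not converge toward x0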

Figure 12: A table of χ^2 probabilities for various N. Note that χ^2/N ≈ 1 corresponds to a fit at about the one-standard-deviation level (for large N).