Computational modeling
Lecture 4: Central Limit Theorem
Theory: Normal distribution
Programming: Arrays
Instructor: Cedric Weber
Course: 4CCP1000
Schedule

Class/Week  Chapter                 Topic                          Milestones
1           Monte Carlo             UNIX system / Fortran
2           Monte Carlo             Fibonacci sequence
3           Monte Carlo             Random variables
4           Monte Carlo             Central Limit Theorem
5           Monte Carlo             Monte Carlo integration        Milestone 1
6           Differential equations  The Pendulum
7           Differential equations  A Quantum Particle in a box
8           Differential equations  The Tacoma bridge              Milestone 2
9           Linear Algebra          System of equations
10          Linear Algebra          Matrix operations              Milestone 3
What did you learn last time?
1. Summing up sequences with a do loop
2. Defining your own functions
3. Using a random generator to produce random values
4. Generating files and plotting them with xmgrace
5. Using modules to build up your own library
Where are we going?
[Figure: course progress (0% to 100%) against week 1 to week 10]
Lecture 1: Introduction
Lecture 2: Fibonacci sequence / do loop
Lecture 3: Random variables / functions
Lecture 4: Summing up random variables / multiple events statistics
Milestone 1: Monte Carlo integration
Part 1: Theory
Central limit theorem
Jacob Bernoulli (1654-1705): Game Theory
- Simple idea: let's define the outcome of an experiment as X_i, where the result of the experiment is either X_i = 0 or X_i = 1. When I repeat the experiment, I sometimes get X_i = 0 and sometimes X_i = 1.
- Let's think of X_i as a coin which is flipped (head/tail).
- Experiment: flip the coin 1000 times.
- How many tails, how many heads?
- Definition of a probability: P(head) = N(head) / total, P(tail) = N(tail) / total.
- [Figure: head = 500, tail = 500]
- Reference: William J. Adams, The Life and Times of the Central Limit Theorem.
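The frequency definition of probability above can be tried out numerically. The following is a minimal sketch, not the course's reference program: it assumes a fair coin and uses the intrinsic random_number generator (the course may use its own generator from a module); the program and variable names are hypothetical.

```fortran
program coin_flips
  implicit none
  integer, parameter :: nflips = 1000   ! number of flips, as on the slide
  integer :: i, nhead, ntail
  real(8) :: r

  call random_seed()                    ! initialise the intrinsic generator
  nhead = 0
  do i = 1, nflips
     call random_number(r)              ! r is uniform in [0,1)
     if (r < 0.5d0) nhead = nhead + 1   ! assumption: r < 0.5 counts as head
  end do
  ntail = nflips - nhead

  ! P = N / total
  write(*,*) 'N(head) =', nhead, '  P(head) =', real(nhead, 8) / nflips
  write(*,*) 'N(tail) =', ntail, '  P(tail) =', real(ntail, 8) / nflips
end program coin_flips
```

The counts fluctuate from run to run around 500 each; they are not expected to be exactly equal.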
Flipping many coins
Now, what if I throw many coins at the same time? Each coin is labeled with an index. We attribute the value 0 to tail and 1 to head.
X = (X_1, X_2, X_3, X_4, X_5, X_6, X_7) = (0, 0, 1, 0, 0, 1, 1)
To describe the outcome of this experiment, we define the variable S_n:
S_n = X_1 + X_2 + ... + X_n
In the experiment above I get S_n = 3.
Now I throw the coins again and get X = (1, 1, 1, 0, 0, 1, 0). Fill in the blank: S_n = [ ]?
Central limit theorem
I repeat the experiment many times
I throw 7 coins, count how many heads I obtain, and do it again. I obtain the sequence:
S=3, S=4, S=1, S=7, S=4, S=3, S=4, S=5, S=4, S=0, S=2, ...
I count how many times I obtain S=0, how many times I obtain S=1, ..., and how many times I obtain S=7.
Question: if I throw the coins 700 times, how many times will you obtain the result S=0? As many times as you obtain S=1, S=2, S=3, S=4, S=5, S=6 and S=7, right?
[Figure: histogram of the number of occurrences (axis 0 to 100) for S=0 to S=7]
Simulating this experiment with a computer
This program will be written in the practice session.
Result: the number of times each value S=0, S=1, ..., S=7 was obtained.
[Figure: histogram of the number of occurrences (axis 0 to 200) for S=0 to S=7]
This histogram gives the probability P(S) of obtaining S if we divide the number of occurrences of each S by the total number of experiments (700 in this case).
Who can spot something wrong in the histogram?
Seriously? Nope, I am not cheating you
Central limit theorem (CLT): the sum of a large collection of independent random variables (S_n = X_1 + ... + X_n) is approximately distributed as a Gaussian (normal) law. For coins the exact law is binomial, which approaches the Gaussian as n grows:
P(S) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(S-\mu)^2/(2\sigma^2)}
where µ is called the average and σ the standard deviation (σ^2 is the variance).
First discussed by Abraham De Moivre (1667-1754).
Physical experiment in a laboratory: a large number of unknown, uncontrolled parameters contribute to your physical measurement (for example, measuring a velocity with a timer).
CLT: by repeating the same measurement, you obtain a distribution of results (data). This distribution will be a Gaussian centered on the average value (µ = your final measurement) with a given width (σ = your error bars).
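For the seven-coin experiment repeated 700 times, the expected histogram can be worked out by hand. This is a worked sketch that assumes fair, independent coins (P(head) = 1/2), which the slides imply but do not state:

```latex
P(S=k) = \binom{7}{k}\left(\tfrac{1}{2}\right)^{7} = \frac{1}{128}\binom{7}{k},
\qquad
N_{\mathrm{expected}}(S=k) = 700\,P(S=k)

% binomial coefficients for k = 0..7: 1, 7, 21, 35, 35, 21, 7, 1
N_{\mathrm{expected}}(S=0) \approx 5.5,\quad
N_{\mathrm{expected}}(S=1) \approx 38.3,\quad
N_{\mathrm{expected}}(S=2) \approx 114.8,\quad
N_{\mathrm{expected}}(S=3) \approx 191.4

% mean and width of S_n for n fair coins, here n = 7
\mu = \frac{n}{2} = 3.5,
\qquad
\sigma = \frac{\sqrt{n}}{2} \approx 1.32
```

The counts for S=4 to S=7 mirror those for S=3 to S=0, so the outcomes are far from equally frequent: the middle values dominate.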
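Gaussian function
[Figure: three Gaussian curves, for (µ=0, σ=1), (µ=0, σ=3) and (µ=1, σ=1)]
µ is the average (position of the peak) and σ the standard deviation (width of the curve); σ^2 is the variance.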
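The three curves can be tabulated with a short program and plotted, for instance with xmgrace. This is a sketch under assumptions: the function name gauss, the x range from -4 to 4 and the step 0.1 are illustrative choices, and the normalised form of the Gaussian is used.

```fortran
program gaussian_demo
  implicit none
  real(8) :: x
  integer :: i

  ! columns: x, then the curves for (mu=0,sigma=1), (mu=0,sigma=3), (mu=1,sigma=1)
  do i = -40, 40
     x = 0.1d0 * i
     write(*,'(4f12.6)') x, gauss(x, 0.0d0, 1.0d0), &
                            gauss(x, 0.0d0, 3.0d0), &
                            gauss(x, 1.0d0, 1.0d0)
  end do

contains

  ! normalised Gaussian with average mu and standard deviation sigma
  function gauss(x, mu, sigma) result(p)
    real(8), intent(in) :: x, mu, sigma
    real(8) :: p
    real(8), parameter :: pi = 3.141592653589793d0
    p = exp(-(x - mu)**2 / (2.0d0 * sigma**2)) / (sigma * sqrt(2.0d0 * pi))
  end function gauss

end program gaussian_demo
```

Redirecting the output to a file gives one x column and three y columns that can be plotted together.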
Example 1: gambling (dice)
Problem to be solved in the practice session.
Throwing a pair of dice at the same time at the casino.
The sum of a pair of dice takes values from 2 to 12.
Where do you put your bet? S=2?
Rolling many dice: the distribution is again Gaussian.
Example 2: Galton
- Balls bump against many pegs during their fall.
- Balls are collected at the end of the fall.
- The result is a Gaussian distribution.
http://www.elica.net (software to perform simple model calculations)
Example 3: population height
The distribution of people according to their height is a Gaussian as well (top: college male students; bottom: men in black and women in white).
Top: Connecticut State College (J. Heredity 5:511-518, 1914).
Bottom: Prof. Joiner's students, class of 1975, Penn State. Joiner, B. L. (1975), Living Histograms, International Statistical Review, 3, 339-340.
Fuzzy CLT
Larry Gonick, The Cartoon Guide to Statistics, New York, NY: Collins Reference / HarperPerennial, 1993, p. 83.
Normal is everywhere
Convergence to the Normal distribution by increasing the number of coins to ~1000
[Figure: histogram of S for ~1000 coins, annotated with the average and the width (standard deviation)]
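A rough numerical check of this convergence is to measure the average and the width of S for 1000 coins and compare them with n/2 and sqrt(n)/2. The sketch below assumes fair coins and the intrinsic random_number generator; the number of repetitions (10000) and all names are hypothetical choices.

```fortran
program clt_moments
  implicit none
  integer, parameter :: ncoins = 1000    ! coins per experiment
  integer, parameter :: nexp = 10000     ! number of repeated experiments (assumed)
  integer :: i, j, s
  real(8) :: r, total, total2, mean, var

  call random_seed()
  total = 0.0d0
  total2 = 0.0d0
  do i = 1, nexp
     s = 0
     do j = 1, ncoins
        call random_number(r)
        if (r < 0.5d0) s = s + 1         ! head counts as 1, tail as 0
     end do
     total = total + s                   ! accumulate S
     total2 = total2 + real(s, 8)**2     ! accumulate S**2
  end do

  mean = total / nexp
  var = total2 / nexp - mean**2          ! variance = <S**2> - <S>**2

  write(*,*) 'measured average =', mean, ' expected', 0.5d0 * ncoins
  write(*,*) 'measured sigma   =', sqrt(var), ' expected', 0.5d0 * sqrt(real(ncoins, 8))
end program clt_moments
```

For 1000 fair coins the expected values are an average of 500 and a standard deviation of about 15.8; the measured values should land close to these, with small run-to-run fluctuations.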
Part 2: Programming
Arrays
Collection of variables

program something
  integer(4) :: x1, x2, x3, x4, x5
  integer(4) :: x6, x7
  x1 = 1
  x2 = 2
  x3 = 3
  x4 = 4
  x5 = 5
  x6 = 6
  x7 = 7
end program

- We need to define a collection of variables: x1, ..., x7.
- We need to assign values to these variables.
- Time consuming!
Arrays
An array is a collection of variables grouped under the same name. In this example, myarray is the array's name.
Each box of the array corresponds to a real(8) variable, which can take any value.
To access a particular variable in the group, the position of the variable in the array must be provided, for example:
write(*,*) myarray(2)
Arrays
Variables:
real(8) :: aa
aa = 16
[Figure: a single box holding 16]
Arrays:
real(8) :: myarray(7)      (number of members: 7)
myarray(2) = 16            (accessing member 2)
[Figure: a row of boxes, the second one holding 16]
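Seven separately named variables x1, ..., x7 can be replaced by a single array filled in a do loop. A minimal sketch (the array name x and the loop index i are illustrative):

```fortran
program something
  implicit none
  integer(4) :: x(7)      ! one array instead of x1, ..., x7
  integer(4) :: i

  do i = 1, 7
     x(i) = i             ! element i receives the value i
  end do

  write(*,*) x            ! prints all seven elements
end program something
```

Changing the collection from 7 to 1000 variables then only requires changing the array size and the loop bound.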
Array: Global assignment
Left:   real(8) :: myarray(3)
        myarray(2) = 0
Right:  real(8) :: myarray(3)
        myarray = 0
How many members/elements does the array myarray have? Answer: [FILL IN]
What do you think the difference is between those two operations?
On the left side, which element is set to zero?
On the right side, which element is set to zero?
Array: Global assignment
Left:   real(8) :: myarray(3)
        myarray(2) = 0     [Figure: only the second box of myarray holds 0]
Right:  real(8) :: myarray(3)
        myarray = 0        [Figure: all three boxes of myarray hold 0]
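The difference between the two assignments can be seen by printing both arrays. In this sketch the arrays a and b are hypothetical names, and both are first set to 1 so that the zeros stand out:

```fortran
program assign_demo
  implicit none
  real(8) :: a(3), b(3)

  a = 1.0d0          ! start with every element equal to 1
  b = 1.0d0

  a(2) = 0.0d0       ! element assignment: only element 2 changes
  b = 0.0d0          ! global assignment: every element changes

  write(*,*) 'a =', a    ! 1, 0, 1
  write(*,*) 'b =', b    ! 0, 0, 0
end program assign_demo
```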
Array: global operations

program testarray
  real(8) :: myarray(10)
  myarray = 2.0
  write(*,*) 'the second element of the array has the value:', myarray(2)
  myarray = sin(myarray)     ! sin is applied to every element
end program

Question about the line "myarray = 2.0": is this operation global, affecting the whole array, or local, affecting only one element? [Tick] global / local
Defining an array's dimension
real(8) :: myarray( x : y )
- The first index of the array is x (if not specified, x=1 by default).
- The last index of the array is y.
- x can be zero: x=0
- x can be negative: x=-5
Examples:
real(8) :: myarray(1:5)     (the same as myarray(5))
real(8) :: myarray(0:7)
real(8) :: myarray(-5:5)
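The index range of an array can be inspected at run time. The sketch below uses the standard intrinsics lbound and ubound, which are not introduced on the slides; the values stored in the array are arbitrary.

```fortran
program array_bounds
  implicit none
  real(8) :: myarray(-5:5)
  integer :: i

  ! loop over every valid index, whatever the declared bounds are
  do i = lbound(myarray, 1), ubound(myarray, 1)
     myarray(i) = 0.1d0 * i
  end do

  write(*,*) 'first index :', lbound(myarray, 1)   ! -5
  write(*,*) 'last index  :', ubound(myarray, 1)   ! 5
  write(*,*) 'size        :', size(myarray)        ! 11
  write(*,*) 'myarray(-5) =', myarray(-5)
end program array_bounds
```

An array declared from -5 to 5 has 11 elements, because index 0 is included.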
Arrays & Coins
What we will practice today: we define an array which contains the number of outcomes obtained by repeating the experiment of throwing 7 coins.
Each box of the array counts the number of outcomes obtained with a given sum of coins: S=0, S=1, S=2, S=3, S=4, S=5, S=6, S=7.
Each time an experiment is performed, we measure S and increment myarray(s):
myarray(s) = myarray(s) + 1
myarray is the histogram of the obtained values of S.
Arrays & flipping coins
Outcome of experiment: 0 0 1 0 0 1 1, so S=3.
We keep track that we obtained one more event S=3 by repeating our experiment:
myarray(3) = myarray(3) + 1
Outcome of experiment: 1 1 1 0 0 1 0, so S=4.
We now want to keep track that we obtained S=4:
myarray(4) = myarray(4) + 1
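Putting the counting rule into a loop gives the histogram of S. This is only a sketch of one possible layout, not the practice-session solution: it assumes fair coins, the intrinsic random_number generator, an integer array, and a lower bound of 0 so that the outcome S=0 has its own box.

```fortran
program coin_histogram
  implicit none
  integer, parameter :: ncoins = 7       ! coins per throw
  integer, parameter :: nexp = 700       ! number of repeated experiments
  integer :: myarray(0:ncoins)           ! one counter per value S = 0..7
  integer :: i, j, s
  real(8) :: r

  call random_seed()
  myarray = 0                            ! global assignment: reset all counters

  do i = 1, nexp
     s = 0
     do j = 1, ncoins
        call random_number(r)
        if (r < 0.5d0) s = s + 1         ! head counts as 1, tail as 0
     end do
     myarray(s) = myarray(s) + 1         ! one more event with this value of S
  end do

  ! columns: S, number of occurrences, estimated probability P(S)
  do s = 0, ncoins
     write(*,*) s, myarray(s), real(myarray(s), 8) / nexp
  end do
end program coin_histogram
```

The counters always add up to the number of experiments, which is a quick sanity check on the output.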
Functions of arrays
- Summation of all values contained in the array myarray: y = sum(myarray)
- Returns the minimum value contained in myarray: y = minval(myarray)
- Returns the maximum value contained in myarray: y = maxval(myarray)
- Returns the dimension (number of elements) of the array; i in this example should be an integer: i = size(myarray)
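These four functions can be tried on a small array with known contents. The values below are arbitrary illustrative data:

```fortran
program array_functions
  implicit none
  real(8) :: myarray(5)
  integer :: i

  myarray = (/ 3.0d0, 1.0d0, 4.0d0, 1.0d0, 5.0d0 /)

  write(*,*) 'sum    =', sum(myarray)       ! 3+1+4+1+5 = 14
  write(*,*) 'minval =', minval(myarray)    ! 1
  write(*,*) 'maxval =', maxval(myarray)    ! 5
  i = size(myarray)
  write(*,*) 'size   =', i                  ! 5
end program array_functions
```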
Combined conditional statements

program something
  implicit none
  real(8) :: a
  a = 0.2
  if ( a > 0.1 .and. a < 0.5 ) then
    write(*,*) 'It is true'       ! BLOCK 1
  else
    write(*,*) 'It is false'      ! BLOCK 2
  end if
end program

The if statement executes:
- block 1 if both the condition a > 0.1 and the condition a < 0.5 are satisfied;
- block 2 if at least one of the conditions a > 0.1 or a < 0.5 is NOT satisfied.
Note: in Fortran the logical "and" is written .and. (not &&).
Practice Problems
- Exercise 1: Central limit theorem, convergence to a Normal distribution with random variables (X_i = 0, 1)
- Exercise 2: Rolling dice at the casino