Statistics, Data Analysis, and Simulation SS 2015


Statistics, Data Analysis, and Simulation SS 2015
08.128.730 Statistik, Datenanalyse und Simulation
Dr. Michael O. Distler <distler@uni-mainz.de>
Mainz, 27 April 2015

What we've learned so far

Fundamental concepts:
- random variable, probability
- frequentist vs. Bayesian interpretation
- probability mass function, probability density function, cumulative distribution function
- expectation values and moments

Definitions

probability mass function (pmf) / probability density function (pdf) of a measured value (= random variable)

[Plots: an example pmf f(n), discrete, and an example pdf f(x), continuous.]

Normalization:

    f(n) \ge 0,  \sum_n f(n) = 1            (discrete)
    f(x) \ge 0,  \int f(x)\,dx = 1          (continuous)

Probability:

    P(n_1 \le n \le n_2) = \sum_{n=n_1}^{n_2} f(n)
    P(x_1 \le x \le x_2) = \int_{x_1}^{x_2} f(x)\,dx

Expectation values and moments

Mean: if a random variable X takes on the values X_1, X_2, ..., X_n with probabilities p(X_i), then the expected value of X (the "mean") is

    \langle X \rangle = E[X] = \sum_{i=1}^{n} X_i\, p(X_i)

The expected value of an arbitrary function h(x) of a continuous random variable is

    E[h(x)] = \int h(x)\, f(x)\, dx

The mean is the expected value of x itself:

    E[x] = \langle x \rangle = \int x\, f(x)\, dx
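A quick numerical sanity check of these definitions (a minimal sketch, assuming NumPy is available; the grid limits and the choice h(x) = x^2 are illustrative, not from the lecture):

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 100001)           # integration grid (illustrative)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal pdf

print(np.sum(f) * dx)          # normalization: ~1.0
print(np.sum(x * f) * dx)      # mean E[x]: ~0.0
print(np.sum(x**2 * f) * dx)   # E[h(x)] with h(x) = x^2: ~1.0
```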

Expectation values and moments

Moments are the expected values of x^n and of (x - \langle x \rangle)^n, called the n-th algebraic moment \mu_n' and the n-th central moment \mu_n, respectively.

Skewness v is a measure of the asymmetry of the probability distribution of a random variable x:

    v = \frac{\mu_3}{\sigma^3} = \frac{E[(x - E[x])^3]}{\sigma^3}

Kurtosis is a measure of the "peakedness" of the probability distribution of a random variable x:

    \beta_2 = \frac{\mu_4}{\sigma^4} = \frac{E[(x - E[x])^4]}{\sigma^4}
    \gamma_2 = \beta_2 - 3    (excess kurtosis)

Binomial distribution

The binomial distribution is the discrete probability distribution of the number of successes r in a sequence of n independent yes/no experiments, each of which yields success with probability p (Bernoulli experiment):

    P(r) = \binom{n}{r} p^r (1-p)^{n-r}

P(r) is normalized. Proof: binomial theorem with q = 1 - p.

The mean of r is

    \langle r \rangle = E[r] = \sum_{r=0}^{n} r\, P(r) = np

The variance \sigma^2 is

    V[r] = E[(r - \langle r \rangle)^2] = \sum_{r=0}^{n} (r - \langle r \rangle)^2 P(r) = np(1-p)
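A minimal sampling check of these formulas (NumPy assumed; n = 20, p = 0.3 and the sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 20, 0.3
r = rng.binomial(n, p, size=1_000_000)   # one million Bernoulli experiments

print(r.mean(), n * p)                    # ~6.0 vs. np = 6.0
print(r.var(ddof=0), n * p * (1 - p))     # ~4.2 vs. np(1-p) = 4.2
```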

Poisson distribution

The Poisson distribution is given by

    P(r) = \frac{\mu^r e^{-\mu}}{r!}

The mean is \langle r \rangle = \mu. The variance is V[r] = \sigma^2 = \mu. The skewness is v = \mu_3/\sigma^3 = 1/\sqrt{\mu}. The excess kurtosis is \gamma_2 = 1/\mu.

[Plots: Poisson distributions for \mu = 0.5, 1, 2, 4.]
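The same kind of sampling check works for the higher Poisson moments; a minimal sketch (NumPy assumed, \mu = 4 chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = 4.0
r = rng.poisson(mu, size=2_000_000)

z = (r - r.mean()) / r.std()            # standardized sample
print(np.mean(z**3), 1 / np.sqrt(mu))   # skewness: ~0.5
print(np.mean(z**4) - 3, 1 / mu)        # excess kurtosis: ~0.25
```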

Uniform distribution

This probability distribution is constant in between the limits x = a and x = b:

    f(x) = \frac{1}{b-a}  for  a \le x < b,    0 otherwise

Mean and variance:

    \langle x \rangle = E[x] = \frac{a+b}{2}        V[x] = \sigma^2 = \frac{(b-a)^2}{12}

Gaussian distribution

The most important probability distribution, also called the normal distribution:

    f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}

The Gaussian distribution has two parameters, the mean \mu and the variance \sigma^2. The probability distribution with mean \mu = 0 and variance \sigma^2 = 1 is named the standard normal distribution, or N(0, 1) for short.

The Gaussian distribution can be derived from the binomial distribution for large values of n and r, and similarly from the Poisson distribution for large values of \mu.

Gaussian distribution

    \int_{-1}^{1} N(0,1)\,dx = 0.6827 = 1 - 0.3173
    \int_{-2}^{2} N(0,1)\,dx = 0.9545 = 1 - 0.0455
    \int_{-3}^{3} N(0,1)\,dx = 0.9973 = 1 - 0.0027

FWHM, useful to estimate the standard deviation:

    FWHM = 2\sigma\sqrt{2\ln 2} = 2.355\,\sigma
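These interval probabilities and the FWHM factor can be reproduced with the Python standard library alone; a minimal sketch using the identity P(|x| < k\sigma) = erf(k/\sqrt{2}):

```python
import math

for k in (1, 2, 3):
    print(k, math.erf(k / math.sqrt(2)))   # 0.6827, 0.9545, 0.9973

print(2 * math.sqrt(2 * math.log(2)))      # FWHM / sigma = 2.355
```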


Chi-square distribution

If x_1, x_2, ..., x_n are independent random variables distributed according to the standard Gaussian distribution with mean 0 and variance 1, then the sum

    u = \chi^2 = \sum_{i=1}^{n} x_i^2

is distributed according to a \chi^2 distribution f_n(u) = f_n(\chi^2), where n is called the number of degrees of freedom:

    f_n(u) = \frac{1}{2\,\Gamma(n/2)} \left(\frac{u}{2}\right)^{n/2-1} e^{-u/2}

The \chi^2 distribution has a maximum at n - 2 (for n \ge 2). The mean is found to be n and the variance is 2n.

Chi-square distribution

[Plot: \chi^2 pdfs f_n(u) for n = 2, ..., 9 degrees of freedom, on 0 \le u \le 10.]

Chi-square cumulative distribution function

The probability for \chi^2_n to take on a value in the interval [0, x].

[Plot: \chi^2 cdfs for n = 2, ..., 9 degrees of freedom, on 0 \le x \le 10.]

\chi^2 vs. \chi^2_{red.}

Chi-square distribution with 5 d.o.f.

[Plot: \chi^2 pdf for 5 degrees of freedom; the central 95% confidence interval is [0.831, 12.83].]
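The quoted interval can be reproduced with the \chi^2 quantile function; a minimal sketch, assuming SciPy is available:

```python
from scipy.stats import chi2

# central 95% interval: cut 2.5% from each tail
lo, hi = chi2.ppf([0.025, 0.975], df=5)
print(lo, hi)   # ~0.831 and ~12.83, as quoted on the slide
```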

Gamma distribution

The goal is to calculate the probability density function f(t) of the time difference t between two events, when events occur at a mean rate \lambda. Example: radioactive decay with a mean decay rate \lambda.

The probability density function of the gamma distribution is given by

    f(x; k) = \frac{x^{k-1} e^{-x}}{\Gamma(k)}    with    \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt,    \Gamma(z+1) = z!

This describes the waiting time t = x from the first to the k-th event of a Poisson-distributed process with mean \mu = 1. The generalization to other values of \mu is

    f(x; k, \mu) = \frac{x^{k-1} \mu^k e^{-\mu x}}{\Gamma(k)}

Gamma distribution

[Plot: gamma pdf for k = 1, i.e. the exponential 1.0*exp(-1.0*x), on 0 \le x \le 5.]

Characteristic function

If x is a real random variable with distribution function F(x) and probability density function f(x), the expected value of exp(itx) is called its characteristic function:

    \varphi(t) = E[\exp(itx)]

In the case of a continuous variable this is a Fourier integral, with its well-known transformation properties:

    \varphi(t) = \int \exp(itx)\, f(x)\, dx        f(x) = \frac{1}{2\pi} \int \exp(-itx)\, \varphi(t)\, dt

For the algebraic moments \mu_n' = E[x^n] = \int x^n f(x)\,dx one gets in particular

    \varphi^{(n)}(t) = \frac{d^n \varphi(t)}{dt^n} = i^n \int x^n \exp(itx)\, f(x)\, dx        \varphi^{(n)}(0) = i^n \mu_n'
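A minimal numerical illustration (NumPy assumed; grid limits illustrative): for N(0,1) the characteristic function is exp(-t^2/2), which the Fourier integral reproduces:

```python
import numpy as np

x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal pdf

for t in (0.5, 1.0, 2.0):
    phi = np.sum(np.exp(1j * t * x) * f) * dx   # E[exp(itx)] by quadrature
    print(t, phi.real, np.exp(-t**2 / 2))       # matches the closed form
```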

1.5 Theorems

The law of large numbers

The law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

We perform n independent experiments (Bernoulli trials) in which the result j occurs n_j times, and consider the relative frequency h_j = n_j/n:

    p_j = E[h_j] = E[n_j/n]

The variance of a binomial distribution gives

    V[h_j] = \sigma^2(h_j) = \sigma^2(n_j/n) = \frac{1}{n^2}\,\sigma^2(n_j) = \frac{1}{n^2}\, n p_j(1-p_j)

Since the product p_j(1-p_j) is at most 1/4, we can deduce the law of large numbers:

    \sigma^2(h_j) \le \frac{1}{4n} < \frac{1}{n}
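A minimal simulation of this bound (NumPy assumed; p = 0.3 and the repetition counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.3
for n in (100, 1000, 10000):
    # 5000 repetitions of the n-trial experiment; h = relative frequency
    h = rng.binomial(n, p, size=5000) / n
    print(n, h.var(), p * (1 - p) / n)   # observed variance ~ p(1-p)/n < 1/n
```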

The central limit theorem

The central limit theorem (CLT) states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.

Let x_i be a sequence of n independent and identically distributed random variables, each having finite expectation \mu and variance \sigma^2 > 0. In the limit n \to \infty the random variable

    w = \sum_{i=1}^{n} x_i

will be normally distributed with mean \langle w \rangle = n \langle x \rangle and variance V[w] = n\sigma^2.

Illustration: The central limit theorem

[Plots: distribution of the standardized sum of N uniformly distributed random variables for N = 1, 2, 3, 10, compared with the Gaussian.]

The sum of uniformly distributed random variables and the standard normal distribution.
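A minimal sketch of this illustration (NumPy assumed; the sample size is illustrative): standardize the sum of N uniform variables, whose mean is N/2 and variance N/12, and check the first two moments against N(0,1):

```python
import numpy as np

rng = np.random.default_rng(0)
for N in (1, 2, 3, 10):
    s = rng.uniform(0, 1, size=(500_000, N)).sum(axis=1)
    w = (s - N * 0.5) / np.sqrt(N / 12.0)   # standardized sum
    print(N, w.mean(), w.var())             # -> 0 and 1 for every N
```

Already for N = 10 a histogram of w is visually indistinguishable from the standard normal curve, which is the point of the slide.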

Sampling

e.g. selecting a random (or representative) subset of a population. Sample of 100 measurements:

    l_i/cm   n_i   n_i l_i /cm   n_i l_i^2 /cm^2
    18.9      1       18.9         357.21
    19.1      1       19.1         364.81
    19.2      2       38.4         737.28
    19.3      1       19.3         372.49
    19.4      4       77.6        1505.44
    19.5      3       58.5        1140.75
    19.6      9      176.4        3457.44
    19.7      8      157.6        3104.72
    19.8     11      217.8        4312.44
    19.9      9      179.1        3564.09
    20.0      5      100.0        2000.00
    20.1      7      140.7        2828.07
    20.2      8      161.6        3264.32
    20.3      9      182.7        3708.81
    20.4      6      122.4        2496.96
    20.5      3       61.5        1260.75
    20.6      2       41.2         848.72
    20.7      2       41.4         856.98
    20.8      2       41.6         865.28
    20.9      2       41.8         873.62
    21.0      4       84.0        1764.00
    21.2      1       21.2         449.44
    sum     100     2002.8       40133.62

Mean? Variance?

    N = \sum n_i = 100

    \bar{l} = \frac{1}{N} \sum n_i l_i = 20.028 cm

    s^2 = \frac{1}{N-1} \left( \sum n_i l_i^2 - \frac{1}{N} \left( \sum n_i l_i \right)^2 \right) = 0.2176 cm^2

    l = \bar{l} \pm \frac{s}{\sqrt{N}} = (20.028 \pm 0.047) cm

    s = s \pm \frac{s}{\sqrt{2(N-1)}} = (0.466 \pm 0.033) cm
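The table arithmetic can be redone in a few lines; a minimal sketch using only the column totals from the table:

```python
import math

N, S, SS = 100, 2002.8, 40133.62      # N, sum(n_i l_i), sum(n_i l_i^2)
lbar = S / N                           # sample mean: 20.028 cm
s2 = (SS - S**2 / N) / (N - 1)         # sample variance: 0.2176 cm^2
s = math.sqrt(s2)                      # 0.466 cm

print(lbar, s2, s)
print(s / math.sqrt(N))                # error of the mean: 0.047 cm
print(s / math.sqrt(2 * (N - 1)))      # error of s: 0.033 cm
```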

Sampling

[Histogram of the sample "length.dat" (frequency vs. length/cm) with two Gaussian overlays: Gauss(\mu = 20.028, \sigma = 0.466) and Gauss(\mu = 20.0, \sigma = 0.5).]

Sampling

How likely is it that the sample was drawn from a normally distributed population with the parameters \mu = 20.028 cm and \sigma = 0.466 cm?

Exercises

Numerical calculation of sample mean and variance

Well-known formulas:

    \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i        s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2

This calculation requires looping twice over the whole data sample. It can be done in one loop (useful for large data samples):

    s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 \right)

Two sums have to be calculated:

    S_x = \sum_{i=1}^{n} x_i        S_{xx} = \sum_{i=1}^{n} x_i^2

Mean and variance are then given by:

    \bar{x} = \frac{1}{n} S_x        s^2 = \frac{1}{n-1} \left( S_{xx} - \frac{1}{n} S_x^2 \right)

Numerical calculation of sample mean and variance

The one-loop scheme may require subtracting large, nearly equal numbers. Because the resolution of the number representation in a computer is finite, this can lead to numerical problems. In this case it is better to use a rough estimate x_e of the mean (e.g. the first measured value):

    T_x = \sum_{i=1}^{n} (x_i - x_e)        T_{xx} = \sum_{i=1}^{n} (x_i - x_e)^2

This leads to

    \bar{x} = x_e + \frac{1}{n} T_x        s^2 = \frac{1}{n-1} \left( T_{xx} - \frac{1}{n} T_x^2 \right)
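A minimal sketch of this shifted one-pass scheme (NumPy is used only to generate test data; the offset 1e8 is an illustrative worst case for the naive formula):

```python
import numpy as np

def mean_var_shifted(data):
    """One-pass mean and variance using a rough mean estimate x_e."""
    x_e = data[0]                 # rough estimate: the first measured value
    n = len(data)
    Tx = Txx = 0.0
    for x in data:                # single pass over the data
        d = x - x_e
        Tx += d
        Txx += d * d
    mean = x_e + Tx / n
    var = (Txx - Tx * Tx / n) / (n - 1)
    return mean, var

# large offset: naive S_x, S_xx would cancel catastrophically here
data = 1e8 + np.random.default_rng(0).normal(0.0, 1.0, 10000)
print(mean_var_shifted(data))          # close to (1e8, 1.0)
print(data.mean(), data.var(ddof=1))   # reference values
```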

1.6 Multidimensional distributions

Random variables in two dimensions

The multidimensional probability distribution f(x, y) of two random variables \tilde{x} and \tilde{y} is defined by the probability to find the pair of variables (\tilde{x}, \tilde{y}) in the intervals a \le \tilde{x} < b and c \le \tilde{y} < d:

    P(a \le \tilde{x} < b,\ c \le \tilde{y} < d) = \int_c^d \int_a^b f(x, y)\, dx\, dy

Normalization:

    \int\int f(x, y)\, dx\, dy = 1

If f(x, y) = h(x)\, g(y), then the two variables are independent.

Random variables in two dimensions

The definition of mean and variance is straightforward:

    \langle x \rangle = E[x] = \int\int x\, f(x, y)\, dx\, dy        \langle y \rangle = E[y] = \int\int y\, f(x, y)\, dx\, dy

    V[x] = \int\int (x - \langle x \rangle)^2 f(x, y)\, dx\, dy = \sigma_x^2        V[y] = \int\int (y - \langle y \rangle)^2 f(x, y)\, dx\, dy = \sigma_y^2

If z is a function of x and y, z = z(x, y), then z is also a random variable:

    \langle z \rangle = \int\int z(x, y)\, f(x, y)\, dx\, dy        \sigma_z^2 = \langle (z - \langle z \rangle)^2 \rangle

Random variables in two dimensions

Example: z(x, y) = a x + b y. The expected value of z:

    \langle z \rangle = a \int\int x\, f(x, y)\, dx\, dy + b \int\int y\, f(x, y)\, dx\, dy = a \langle x \rangle + b \langle y \rangle

uncomplicated

Random variables in two dimensions

Variance for z(x, y) = a x + b y:

    \sigma_z^2 = \langle ((a x + b y) - (a \langle x \rangle + b \langle y \rangle))^2 \rangle
               = \langle ((a x - a \langle x \rangle) + (b y - b \langle y \rangle))^2 \rangle
               = a^2 \underbrace{\langle (x - \langle x \rangle)^2 \rangle}_{\sigma_x^2}
               + b^2 \underbrace{\langle (y - \langle y \rangle)^2 \rangle}_{\sigma_y^2}
               + 2ab\, \langle (x - \langle x \rangle)(y - \langle y \rangle) \rangle

The new term \langle (x - \langle x \rangle)(y - \langle y \rangle) \rangle = cov(x, y) is the covariance:

    \sigma_{xy} = \int\int (x - \langle x \rangle)(y - \langle y \rangle)\, f(x, y)\, dx\, dy

Random variables in two dimensions

Normalized covariance:

    \rho_{xy} = \frac{cov(x, y)}{\sigma_x \sigma_y}

The correlation coefficient gives a dimensionless measure of the level of correlation between two variables:

    -1 \le \rho_{xy} \le 1

Random variables in two dimensions

The determinant of any covariance matrix is non-negative:

    \det \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{pmatrix} = \sigma_x^2 \sigma_y^2 - \sigma_{xy}^2 = \sigma_x^2 \sigma_y^2 (1 - \rho^2) \ge 0

2-dim Gaussian distribution

[Plot: covariance ellipse of two fit parameters a_1 and a_2.]

Probability content of the covariance ellipse: 39.3%

Covariance matrix in n dimensions

In order to generalize the variance, one defines the covariance matrix to be

    V_{ij} = \langle (\vec{x} - \langle \vec{x} \rangle)(\vec{x} - \langle \vec{x} \rangle)^T \rangle_{ij}

The diagonal elements of the matrix V_{ij} are the variances and the off-diagonal elements are the covariances:

    V_{ii} = var(x_i) = \int (x_i - \langle x_i \rangle)^2\, f(\vec{x})\, dx_1 dx_2 \ldots dx_n

    V_{ij} = cov(x_i, x_j) = \int (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle)\, f(\vec{x})\, dx_1 dx_2 \ldots dx_n

Covariance matrix in n dimensions

The covariance matrix

    V = \begin{pmatrix}
        var(x_1)      & cov(x_1, x_2) & \ldots & cov(x_1, x_n) \\
        cov(x_2, x_1) & var(x_2)      & \ldots & cov(x_2, x_n) \\
        \vdots        & \vdots        & \ddots & \vdots        \\
        cov(x_n, x_1) & cov(x_n, x_2) & \ldots & var(x_n)
    \end{pmatrix}

is a symmetric n \times n matrix:

    V = \begin{pmatrix}
        \sigma_1^2  & \sigma_{12} & \ldots & \sigma_{1n} \\
        \sigma_{21} & \sigma_2^2  & \ldots & \sigma_{2n} \\
        \vdots      & \vdots      & \ddots & \vdots      \\
        \sigma_{n1} & \sigma_{n2} & \ldots & \sigma_n^2
    \end{pmatrix}
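A minimal numerical illustration (NumPy assumed; the linear model y = 0.8x + 0.6\epsilon is an illustrative choice that yields \rho = 0.8):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 100_000)
y = 0.8 * x + 0.6 * rng.normal(0, 1, 100_000)   # correlated with x

V = np.cov(x, y)                                 # estimated 2x2 covariance matrix
rho = V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])       # correlation coefficient
print(V)
print(rho, np.linalg.det(V) >= 0)                # rho ~ 0.8; det non-negative
```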

1.7 Functions of random variables

A function of a random variable is itself a random variable. Suppose x follows a pdf f_x(x); consider a function y = y(x). What is the pdf f_y(y)?

Consider the interval (x, x + dx) mapped to (y, y + dy). In order to conserve the normalization, the probability contents have to be the same:

    f_x(x)\, dx = f_y(y)\, dy        \Rightarrow        f_y(y) = f_x(x(y)) \left| \frac{dx}{dy} \right|
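A minimal Monte Carlo check of this transformation rule (NumPy assumed): for x uniform on (0, 1) and y = -ln x one has x(y) = e^{-y} and |dx/dy| = e^{-y}, so f_y(y) = 1 \cdot e^{-y}, an exponential distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
y = -np.log(rng.uniform(size=1_000_000))    # transform uniform samples

hist, edges = np.histogram(y, bins=50, range=(0, 5), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
# histogram should follow f_y(y) = exp(-y); print the largest deviation
print(np.max(np.abs(hist - np.exp(-centers))))
```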

Transformation of mean and variance

Taylor expansion of y(x) around the mean:

    y(x) = y(\langle x \rangle) + (x - \langle x \rangle) \left. \frac{dy}{dx} \right|_{x=\langle x \rangle} + \frac{1}{2} (x - \langle x \rangle)^2 \left. \frac{d^2 y}{dx^2} \right|_{x=\langle x \rangle} + \ldots

Up to order 2:

    E[y] \approx y(\langle x \rangle) + \underbrace{E[x - \langle x \rangle]}_{=0} \left. \frac{dy}{dx} \right|_{x=\langle x \rangle} + \frac{1}{2} E[(x - \langle x \rangle)^2] \left. \frac{d^2 y}{dx^2} \right|_{x=\langle x \rangle}

    \langle y \rangle \approx y(\langle x \rangle) + \underbrace{\frac{1}{2} \sigma_x^2 \left. \frac{d^2 y}{dx^2} \right|_{x=\langle x \rangle}}_{frequently\ disregarded}

Error propagation

For the transformation of the variance we assume \langle y \rangle \approx y(\langle x \rangle) and expand y(x) in a neighborhood of \langle x \rangle:

    V[y] = E[(y - \langle y \rangle)^2] = E\left[ \left( (x - \langle x \rangle) \left. \frac{dy}{dx} \right|_{x=\langle x \rangle} \right)^2 \right] = \left( \left. \frac{dy}{dx} \right|_{x=\langle x \rangle} \right)^2 E[(x - \langle x \rangle)^2] = \left( \left. \frac{dy}{dx} \right|_{x=\langle x \rangle} \right)^2 \sigma_x^2

This is error propagation for a single random variable.
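A minimal Monte Carlo check (NumPy assumed; \langle x \rangle = 5 and \sigma_x = 0.1 are illustrative): for y = x^2, error propagation predicts V[y] \approx (2\langle x \rangle)^2 \sigma_x^2:

```python
import numpy as np

rng = np.random.default_rng(0)
mx, sx = 5.0, 0.1
x = rng.normal(mx, sx, 1_000_000)
y = x**2

print(y.var())              # sampled variance of y: ~1.0
print((2 * mx)**2 * sx**2)  # linear error propagation: (dy/dx)^2 sigma_x^2 = 1.0
```

The small residual difference comes from the higher-order terms the linear expansion disregards.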

1.8 Folding (convolution)

Two random variables x and y are defined by their probability distributions f_x(x) and f_y(y). Obviously the sum w = x + y is also a random variable. Its probability distribution f_w(w) is calculated by folding x and y:

    f_w(w) = \int\int f_x(x) f_y(y)\, \delta(w - x - y)\, dx\, dy = \int f_x(x) f_y(w - x)\, dx = \int f_y(y) f_x(w - y)\, dy

In short, f_w = f_x \otimes f_y, and for the characteristic functions

    \varphi_w(t) = \varphi_x(t) \cdot \varphi_y(t)
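A minimal numerical check (NumPy assumed; grid step illustrative): folding two U(0,1) densities must give the triangular pdf of w = x + y on [0, 2]:

```python
import numpy as np

dx = 0.001
x = np.arange(0, 1, dx)
fx = np.ones_like(x)                  # pdf of U(0,1)

fw = np.convolve(fx, fx) * dx         # discretized folding integral
w = np.arange(len(fw)) * dx
tri = np.where(w < 1, w, 2 - w)       # exact triangular pdf on [0, 2]
print(np.max(np.abs(fw - tri)))       # small discretization error (~dx)
```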
