Simulations
Computer simulation of realizations of random variables has become indispensable as a supplement to theoretical investigations and practical applications.

- We can easily investigate a large number of scenarios (different P's) and a large number of replications.
- We can compute transformed distributions numerically.
- We can compute (complicated) mean values; this is known as Monte Carlo simulation.
- We can investigate the behaviour of methods for statistical inference.

But how can the deterministic computer generate the outcome from a probability measure?

Generic simulation

A two-step procedure is behind the simulation of random variables:

- The computer emulates the generation of independent, identically distributed random variables with the uniform distribution on the unit interval [0, 1].
- The emulated uniformly distributed random variables are turned into variables with the desired distribution by transformation.

Theorem behind

The following result is behind the generic simulation procedure for simulation from any P on E:

Theorem: Let P_0 denote the uniform distribution on [0, 1], let U_1, U_2, ..., U_n be iid random variables with distribution P_0, and let h : [0, 1] → E be a map such that the transformed probability measure on E is P = h(P_0). Then X_1, X_2, ..., X_n defined by X_i = h(U_i) are n iid random variables, each with distribution P.
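As a small illustration (a sketch, not from the slides): with h(u) = -log(u) the transformed measure h(P_0) is the exponential distribution with intensity 1, so the two steps look as follows in R.

> u <- runif(5)             # step 1: emulate iid uniform variables on [0, 1]
> h <- function(u) -log(u)  # a map with h(P_0) the exponential distribution
> x <- h(u)                 # step 2: transform; a realization of 5 iid Exp(1) variables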

The real problem

What we need in practice is thus the construction of a transformation that can transform the uniform distribution on [0, 1] into the desired probability distribution.

We focus here on two cases:

- A general method for discrete distributions.
- A general method for probability measures on R given in terms of the distribution function.

But what about the simulation of the independent, uniformly distributed random variables?

That's a completely different story. Read D. E. Knuth, ACP, Chapter 3, or trust that R behaves well and that runif works correctly.

We rely on a sufficiently good pseudo random number generator with the property that as long as we cannot statistically detect differences between what the generator produces and true iid [0, 1]-uniformly distributed random variables, we live happily in ignorance.
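A practical aside not on the slides: the pseudo random number generator in R is deterministic given its internal state, so fixing the seed makes a simulation reproducible.

> set.seed(271828)  # fix the state of the pseudo random number generator
> runif(3)          # produces the same three numbers every time after this seed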

Discrete random variables

If P is a probability measure on a discrete sample space E given by point probabilities p(x), x ∈ E, choose for each x ∈ E an interval

    I(x) = (a(x), b(x)] ⊆ [0, 1]

such that

- the length, b(x) - a(x), of I(x) equals p(x), and
- the intervals I(x) are mutually disjoint: I(x) ∩ I(y) = ∅ for x ≠ y.

Letting u_1, ..., u_n be generated by a pseudo random number generator, we define x_i = x if u_i ∈ I(x) for i = 1, ..., n. Then x_1, ..., x_n is a realization of n iid random variables with distribution having point probabilities p(x), x ∈ E.
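A minimal R sketch of this construction (not from the slides), stacking the intervals consecutively so that the right endpoints are the cumulative probabilities; the built-in sample function achieves the same thing.

> p <- c(0.2, 0.5, 0.3)                                  # point probabilities on E = {1, 2, 3}
> u <- runif(10)                                         # pseudo random uniform numbers
> x <- findInterval(u, cumsum(p), left.open = TRUE) + 1  # x_i = j when u_i is in the j'th interval (a, b]
> x2 <- sample(1:3, 10, replace = TRUE, prob = p)        # the idiomatic equivalent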

Generalized inverse

Definition: Let F : R → [0, 1] be a distribution function. A function F⁻ : (0, 1) → R that satisfies

    F(x) ≥ y  ⇔  x ≥ F⁻(y)

for all x ∈ R and y ∈ (0, 1) is called a generalized inverse of F.

Generalized inverse

If F has a true inverse (F is strictly increasing and continuous), then F⁻ equals the inverse, F⁻¹, of F.

All distribution functions have a generalized inverse; we find it by solving the inequality F(x) ≥ y.
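A worked example (not on the slides): for the exponential distribution with intensity λ,

    F(x) = 1 - exp(-λx) ≥ y  ⇔  exp(-λx) ≤ 1 - y  ⇔  x ≥ -log(1 - y)/λ,

so F⁻(y) = -log(1 - y)/λ.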

Continuous sample space

We will simulate from P on R having distribution function F.

- First find the generalized inverse, F⁻ : (0, 1) → R, of F.
- Then let u_1, ..., u_n be generated by a pseudo random number generator and define x_i = F⁻(u_i) for i = 1, ..., n.

Then x_1, ..., x_n is a realization of n iid random variables with distribution having distribution function F.
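As an illustration (a sketch, not from the slides): the Cauchy distribution has F(x) = 1/2 + arctan(x)/π, which is strictly increasing and continuous, so F⁻(y) = F⁻¹(y) = tan(π(y - 1/2)).

> Finv <- function(y) tan(pi * (y - 0.5))  # inverse distribution function of the Cauchy
> x <- Finv(runif(1000))                   # a realization of 1000 iid standard Cauchy variables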

Local alignments

Assume that X_1, ..., X_n and Y_1, ..., Y_m are in total n + m iid random variables with values in the 20-letter amino acid alphabet

    E = {A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S, T, W, Y, V}.

We want to find an optimal local alignment, and in particular we are interested in the score of the optimal local alignment. This is a function h : E^(n+m) → R.

An X- and a Y-subsequence are matched letter by letter; matched letters are given a score, positive or negative, and gaps in the subsequences are given a penalty.

Local alignment scores

Denote by S_{n,m} = h(X_1, ..., X_n, Y_1, ..., Y_m) the transformed, real-valued random variable. What is the distribution of S_{n,m}?

- We can in principle compute its discrete distribution from the distribution of the X- and Y-variables; this is futile and not possible in practice.
- It is possible to use simulations, but it may be quite time-consuming and not a practical solution for current database usage.
- Develop a good theoretical approximation.

Local alignment scores

Under certain conditions on the scoring mechanism and the letter distribution, a valid approximation for n and m large is, for parameters λ, K > 0,

    P(S_{n,m} ≤ x) ≈ exp(-Knm exp(-λx)).

This is a scale-location transformation,

    S_{n,m} = log(Knm)/λ + S'_{n,m}/λ,

where S'_{n,m} has a Gumbel distribution.
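A sketch of how the approximation is used in practice; λ = 0.25 and K = 0.1 below are made-up values for illustration, not estimated parameters.

> lambda <- 0.25; K <- 0.1                # hypothetical parameter values
> n <- 500; m <- 400                      # sequence lengths
> s <- 45                                 # an observed local alignment score
> 1 - exp(-K * n * m * exp(-lambda * s))  # approximate p-value P(S_{n,m} >= s)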

Statistical models

Example: We measure the expression of our favorite gene number i on a microarray. The additive noise model says that our measurement can be written as

    X_i = μ_i + σ_i ε_i,

where μ_i ∈ R, σ_i > 0, and ε_i has mean 0 and variance 1.

If ε_i ~ N(0, 1) we have X_i ~ N(μ_i, σ_i²), and we have fully specified our model with unknown parameters (μ_i, σ_i) ∈ R × (0, ∞).

Statistical models

Example: We want to consider pairs of nucleotides (X_i, Y_i) that are evolutionarily related. We assume that they are independent and identically distributed and that

    P(X_1 = x, Y_1 = y) = p(x)P_t(x, y),  where
    P_t(x, y) = 1/4 + 3/4 exp(-4αt)   if x = y,
    P_t(x, y) = 1/4 - 1/4 exp(-4αt)   if x ≠ y.

The unknown parameters are α > 0 and the four-dimensional probability vector p. Perhaps t is also an unknown parameter.
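A sketch (not on the slides) of how such pairs could be simulated with the discrete method from earlier: draw X from p and then Y conditionally on X. The concrete numbers are made up for illustration.

> p <- c(A = 0.3, C = 0.2, G = 0.2, T = 0.3)  # hypothetical base distribution
> alpha <- 0.2; t <- 1
> same <- 0.25 + 0.75 * exp(-4 * alpha * t)   # P_t(x, x)
> nuc <- names(p)
> x <- sample(nuc, 1000, replace = TRUE, prob = p)
> y <- sapply(x, function(xi) {
+     prob <- rep((1 - same)/3, 4)            # each of the 3 other bases is equally likely
+     names(prob) <- nuc; prob[xi] <- same
+     sample(nuc, 1, prob = prob)
+ })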

Statistical models

- We need a sample space E.
- We need a parameter space Θ of unknown parameters.
- And for each θ ∈ Θ we need a probability measure P_θ on E.

We call (P_θ)_{θ∈Θ} a parameterized family of probability measures.

Exponential distribution

Let E_0 = [0, ∞), let θ ∈ (0, ∞), and let P_θ be the distribution of n iid exponentially distributed random variables X_1, ..., X_n with intensity parameter θ.

The distribution of X_i has density f_θ(x) = θ exp(-θx) for x ≥ 0. The probability measure P_θ on E = E_0^n has density

    f_θ(x_1, ..., x_n) = θ exp(-θx_1) · ... · θ exp(-θx_n) = θ^n exp(-θ(x_1 + ... + x_n)).

With Θ = (0, ∞), the family (P_θ)_{θ∈Θ} of probability measures is a statistical model on E.

Estimators

Definition: An estimator is a map ˆθ : E → Θ. For a given observation x ∈ E, the value of ˆθ at x, ˆϑ = ˆθ(x), is called the estimate of θ.

If X has distribution P_θ, the transformed random variable ˆθ(X) is also called the estimator; it has distribution ˆθ(P_θ).
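For instance (an illustration, not from the slides), in the exponential model above the maximum likelihood estimator is ˆθ(x_1, ..., x_n) = n/(x_1 + ... + x_n).

> theta.hat <- function(x) length(x)/sum(x)  # MLE of the intensity parameter
> x <- rexp(100, rate = 2)                   # a simulated observation with true theta = 2
> theta.hat(x)                               # the estimate; close to 2 for large n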

Identifiability

Definition: The parameter θ is said to be identifiable if the map θ ↦ P_θ is one-to-one. That is, for two different parameters θ_1 and θ_2 the corresponding measures P_{θ_1} and P_{θ_2} differ.

We cannot in a meaningful way estimate an unknown parameter that is not identifiable!

Simulations

Write a function, my.rexp, that takes two parameters such that

> tmp <- my.rexp(10, 1)

generates the realization of 10 iid random variables with the exponential distribution with parameter λ = 1. How do you make the second parameter equal to 1 by default, such that

> tmp <- my.rexp(10)

produces the same result?

Solution

> my.rexp <- function(n, lambda) {
+     -log(runif(n))/lambda
+ }

To make λ = 1 by default we define instead

> my.rexp <- function(n, lambda = 1) {
+     -log(runif(n))/lambda
+ }

Note that we have used that if U is uniformly distributed on [0, 1], then 1 - U is also uniformly distributed on [0, 1]: the generalized inverse is F⁻(y) = -log(1 - y)/λ, but replacing 1 - U by U gets rid of an unnecessary computation.
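A quick sanity check (a sketch, not from the slides) is to compare the simulated values with R's built-in rexp via a QQ-plot:

> qqplot(my.rexp(10000), rexp(10000), pch = 20)
> abline(0, 1)  # the points should fall close to this line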

Maximum of random variables

Use

> tmp <- replicate(1000, max(rexp(10, 1)))

to generate 1000 replications of the maximum of 10 independent exponential random variables. Plot the distribution function for the Gumbel distribution with location parameter log(10) and compare it with the empirical distribution function

> emdf <- function(x) sapply(x, function(x) sum(tmp <= x)/1000)

What if we take the max of 100 exponential random variables?
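Why log(10)? A short calculation (not on the slide): for n iid Exp(1) variables,

    P(max(X_1, ..., X_n) ≤ x) = (1 - exp(-x))^n ≈ exp(-n exp(-x)) = exp(-exp(-(x - log n))),

which is the Gumbel distribution function with location parameter log n.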

Solutions

> x <- seq(0, 5, by = 0.1)
> plot(x, exp(-exp(-(x - log(10)))), type = "l")
> points(x, emdf(x), type = "p", pch = 20, col = "red")

Solutions

[Figure: the Gumbel distribution function exp(-exp(-(x - log(10)))) (curve) together with the empirical distribution function (red points).]

Solutions

> tmp <- replicate(1000, max(rexp(100, 1)))
> emdf <- function(x) sapply(x, function(x) sum(tmp <= x)/1000)
> x <- seq(0, 8, by = 0.1)
> plot(x, exp(-exp(-(x - log(100)))), type = "l")
> points(x, emdf(x), type = "p", pch = 20, col = "red")

Solutions

[Figure: the Gumbel distribution function exp(-exp(-(x - log(100)))) (curve) together with the empirical distribution function (red points).]

Curriculum, second lecture: Niels Richard Hansen, November 23, 2011. NRH: Handout pages 1-13. PD: Pages 55-75. (NRH: Sections 2.6, 2.7, 2.11, 2.12; at this point in the course these sections will be difficult to follow.)
