# Numerical integration and importance sampling


## Transcription

### 2.1 Quadrature

Consider the numerical evaluation of the integral

$$I(a, b) = \int_a^b dx\, f(x).$$

Rectangle rule: on a small interval, construct an interpolating function and integrate it over the interval.

- Polynomial of degree 0 using the mid-point of the interval:

$$\int_{ah}^{(a+1)h} dx\, f(x) \approx h\, f\big((ah + (a+1)h)/2\big).$$

- Polynomial of degree 1 passing through the points $(a_1, f(a_1))$ and $(a_2, f(a_2))$ (trapezoidal rule):

$$f(x) = f(a_1) + \frac{x - a_1}{a_2 - a_1}\big(f(a_2) - f(a_1)\big) \quad\Longrightarrow\quad \int_{a_1}^{a_2} dx\, f(x) = \frac{a_2 - a_1}{2}\big(f(a_1) + f(a_2)\big).$$

- Composing the trapezoidal rule $n$ times on the interval $(a, b)$ with even sub-intervals $[a + kh, a + (k+1)h]$, where $k = 0, \dots, n-1$ and $h = (b-a)/n$, gives the estimate

$$\int_a^b dx\, f(x) \approx \frac{b-a}{n}\left(\frac{f(a) + f(b)}{2} + \sum_{k=1}^{n-1} f(a + kh)\right).$$
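The composite trapezoidal estimate above can be sketched in a few lines of Python. This is only an illustration: the function name `trapezoid` and the test integrand $\sin x$ on $(0, \pi)$ are assumptions chosen here, not part of the original notes.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal sub-intervals on (a, b)."""
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * (f(a) + f(b)) + interior)

# Illustrative integrand: the integral of sin(x) on (0, pi) is exactly 2.
coarse = trapezoid(math.sin, 0.0, math.pi, 10)
fine = trapezoid(math.sin, 0.0, math.pi, 20)

# Halving h should reduce the error by roughly a factor of 4 (error ~ h^2).
err_ratio = (2.0 - coarse) / (2.0 - fine)
print(coarse, fine, err_ratio)
```

Doubling $n$ and comparing the two errors is a cheap way to check that an implementation shows the expected second-order convergence.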

Simpson's rule: interpolating function of degree 2 composed $n$ times on the interval $(a, b)$ (with $h = (b-a)/n$ and $n$ even):

$$\int_a^b dx\, f(x) \approx \frac{b-a}{3n}\big[f(a) + 4f(a+h) + 2f(a+2h) + 4f(a+3h) + 2f(a+4h) + \cdots + f(b)\big].$$

The error is bounded by

$$\frac{h^4}{180}(b-a)\,\big|f^{(4)}(\xi)\big|,$$

where $\xi \in (a, b)$ is the point in the domain where the magnitude of the 4th derivative is largest.

Rules can also be based on uneven divisions of the domain of integration. With $w(x)$ a weight function,

$$I(a, b) \approx \sum_{i=1}^{n} w_i f_i,$$

where $w_i = w(x_i)$ and $f_i = f(x_i)$, with the choice of $x_i$ and $w_i$ based on the definite integral of polynomials of higher order. An example is Gaussian quadrature with $n$ points, based on a polynomial of degree $2n-1$: there is a well-defined procedure to find $\{x_i\}$ and $\{w_i\}$ (see Numerical Recipes). The error bounds for $n$-point Gaussian quadrature are

$$\frac{(b-a)^{2n+1}}{2n+1}\,\frac{(n!)^4}{[(2n)!]^3}\, f^{(2n)}(\xi) \quad \text{for } \xi \in (a, b).$$

For multi-dimensional integrals, one must place $n_i$ grid points along each dimension $i$. The number of points in the hyper-grid grows exponentially with dimension, so these rules are unsuitable for high-dimensional integrals.

### 2.2 Importance Sampling and Monte Carlo

Suppose the integrand $f(\mathbf{x})$ depends on a multi-dimensional point $\mathbf{x}$ and that the integral over the hypervolume,

$$I = \int_V d\mathbf{x}\, f(\mathbf{x}),$$

is non-zero only in specific regions of the domain. We should place a higher density of points in the regions where the integrand is large. Define a weight function $w(\mathbf{x})$ that tells us which regions are significant.

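We require $w(\mathbf{x}) > 0$ for any point $\mathbf{x}$ in the volume. Sometimes a normalized weight function is used, so that $\int_V d\mathbf{x}\, w(\mathbf{x}) = 1$, though this is not strictly necessary. Re-express the integral as

$$I = \int_V d\mathbf{x}\, \frac{f(\mathbf{x})}{w(\mathbf{x})}\, w(\mathbf{x}).$$

Idea: draw a set of $N$ points $\{\mathbf{x}_1, \dots, \mathbf{x}_N\}$ from the weight function $w(\mathbf{x})$; then

$$\bar{I} = \frac{1}{N}\sum_{i=1}^{N} \frac{f(\mathbf{x}_i)}{w(\mathbf{x}_i)}.$$

As $N \to \infty$, $\bar{I} \to I$.

How does this improve the rate of convergence of the calculation of $I$? We will see that the statistical uncertainty is related to the variance $\sigma_I^2$ of the estimate of $I$, namely

$$\sigma_I^2 = \frac{1}{N}\sum_{i}^{N} \langle \Delta I_i\, \Delta I_i \rangle, \qquad \text{where } \Delta I_i = \frac{f(\mathbf{x}_i)}{w(\mathbf{x}_i)} - I,$$

and we have assumed that the random variables $\Delta I_i$ are statistically independent. Here $\langle \cdots \rangle$ represents the average over the true distribution of $f/w$ that is obtained in the limit $N \to \infty$.

Vastly different values of the ratio $f(\mathbf{x}_i)/w(\mathbf{x}_i)$ lead to a large uncertainty. The error is minimized by minimizing $\sigma_I^2$. If $\alpha\, w(\mathbf{x}_i) = f(\mathbf{x}_i)$, then $f(\mathbf{x}_i)/w(\mathbf{x}_i) = \alpha$ and

$$\left\langle \frac{f(\mathbf{x}_i)}{w(\mathbf{x}_i)} \right\rangle = I = \alpha, \qquad \left\langle \left(\frac{f(\mathbf{x}_i)}{w(\mathbf{x}_i)}\right)^2 \right\rangle = \alpha^2,$$

and $\sigma_I^2 = 0$. Note that writing $w(\mathbf{x}_i) = f(\mathbf{x}_i)/\alpha$ requires knowing $\alpha = I$, the problem we are trying to solve. Generally we desire all $f(\mathbf{x}_i)/w(\mathbf{x}_i)$ to be roughly the same for all sampled points $\mathbf{x}_i$ to minimize $\sigma_I^2$.

Example in a 1-dimensional integral:

$$I = \int_a^b dx\, f(x) = \int_a^b dx\, \frac{f(x)}{w(x)}\, w(x).$$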

Monte-Carlo sampling: use random sampling of points $x$ with weight $w(x)$ to estimate the ratio $f(x)/w(x)$. How do we draw sample points with a given weight $w(x)$? Consider the simplest case of a uniform distribution on the interval $(a, b)$:

$$w(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } x \in (a, b), \\ 0 & \text{otherwise.} \end{cases}$$

Use a random number generator that gives a pseudo-random number `rand` in the interval $(0, 1)$. If

$$x_i = a + (b-a) \times \text{rand},$$

then the $\{x_i\}$ are distributed uniformly in the interval $(a, b)$. We find that the estimator of $I$ is then

$$\bar{I} = \frac{1}{N}\sum_{i=1}^{N} \frac{f(x_i)}{w(x_i)} = \frac{b-a}{N}\sum_{i=1}^{N} f(x_i) = \frac{1}{N}\sum_{k=1}^{N} I_k,$$

where each $I_k = (b-a)\, f(x_k)$ is an independent estimate of the integral. This is like trapezoidal integration, but with randomly sampled points $\{x_i\}$ from $(a, b)$, each with uniform weight.

How about other choices of the weight function $w(x)$? It is easy to draw uniform $y_i \in (0, 1)$. Now suppose we map the $y_i$ to $x_i$ via

$$y(x) = \int_a^x dz\, w(z).$$

Note that $y(a) = 0$ and $y(b) = 1$ if $w(x)$ is normalized over $(a, b)$. How are the $x$ distributed? Since the $y$ are distributed uniformly over $(0, 1)$, the probability of finding a value of $y$ in the interval $dy$ is $dy(x) = w(x)\, dx$, so the $x$ are distributed with weight $w(x)$. It then follows that the one-dimensional integral can be written in the transformed variable $y$ as

$$I = \int_a^b dx\, w(x)\, \frac{f(x)}{w(x)} = \int_{y(a)}^{y(b)} dy\, \frac{f(x(y))}{w(x(y))} = \int_0^1 dy\, \frac{f(x(y))}{w(x(y))}.$$

The integral is easily evaluated by selecting uniform points $y$ and then solving for $x(y)$. We must be able to solve for $x(y)$ for this to be useful.
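As a rough illustration of the transformation $y(x) = \int_a^x dz\, w(z)$, the sketch below compares uniform sampling with importance sampling on $(0, 1)$. The integrand $f(x) = x e^{x}$ (exact integral 1), the weight $w(x) = 2x$ with inverse $x(y) = \sqrt{y}$, the function names and the seed are all assumptions chosen for this example, not taken from the notes.

```python
import math
import random

def f(x):
    return x * math.exp(x)          # exact integral on (0, 1) is 1

def w(x):
    return 2.0 * x                  # normalized weight on (0, 1)

def x_of_y(y):
    return math.sqrt(y)             # inverse of y(x) = int_0^x 2z dz = x^2

def estimate(n, rng, importance):
    """Return (mean, standard error) of the Monte Carlo estimate of I."""
    samples = []
    for _ in range(n):
        y = 1.0 - rng.random()      # uniform on (0, 1], so x = 0 never occurs
        if importance:
            x = x_of_y(y)           # x is distributed with weight w(x)
            samples.append(f(x) / w(x))
        else:
            samples.append(f(y))    # uniform weight w = 1 on (0, 1)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, math.sqrt(var / n)

rng = random.Random(12345)
uniform_mean, uniform_err = estimate(20000, rng, importance=False)
weighted_mean, weighted_err = estimate(20000, rng, importance=True)
print(uniform_mean, uniform_err)
print(weighted_mean, weighted_err)
```

Because $f/w = e^{x}/2$ varies far less over the interval than $f$ itself, the weighted estimate should show a noticeably smaller standard error for the same number of points.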

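Procedure:

1. Select $N$ points $y_i$ uniformly on $(0, 1)$ using $y_i = \text{rand}$.
2. Compute $x_i = x(y_i)$ by inverting $y(x) = \int_a^x dz\, w(z)$.
3. Compute the estimator for the integral, $\bar{I} = \frac{1}{N}\sum_{i=1}^{N} f(x_i)/w(x_i)$.

This procedure is easy to carry out using simple forms of $w(x)$. Suppose the integrand is strongly peaked around $x = x_0$. One good choice of $w(x)$ might be a Gaussian weight

$$w(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - x_0)^2}{2\sigma^2}},$$

where the variance (width) of the Gaussian is treated as a parameter. If $I = \int_{-\infty}^{\infty} dx\, f(x) = \int_{-\infty}^{\infty} dx\, w(x)\, f(x)/w(x)$, we can draw randomly from the Gaussian weight using

$$y(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{x} d\tilde{x}\, e^{-\frac{(\tilde{x} - x_0)^2}{2\sigma^2}} = \frac{1}{2} + \frac{1}{\sqrt{2\pi\sigma^2}}\int_{x_0}^{x} d\tilde{x}\, e^{-\frac{(\tilde{x} - x_0)^2}{2\sigma^2}} = \frac{1}{2} + \frac{1}{\sqrt{\pi}}\int_{0}^{\frac{x - x_0}{\sqrt{2}\sigma}} dw\, e^{-w^2} = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{x - x_0}{\sqrt{2}\sigma}\right)\right).$$

Inverting gives $x_i = x_0 + \sqrt{2}\,\sigma\, \operatorname{ierf}(2y_i - 1)$, where $\operatorname{ierf}$ is the inverse error function with series representation

$$\operatorname{ierf}(z) = \frac{\sqrt{\pi}}{2}\left(z + \frac{\pi}{12} z^3 + \frac{7\pi^2}{480} z^5 + \cdots\right).$$

The estimator of the integral is therefore

$$\bar{I} = \frac{1}{N}\sqrt{2\pi\sigma^2}\, \sum_{i=1}^{N} f(x_i)\, e^{\frac{(x_i - x_0)^2}{2\sigma^2}}.$$

This estimator reduces the variance $\sigma_I^2$ if $w(x_i)$ and $f(x_i)$ resemble one another.

Another way to draw from a Gaussian: draw 2 numbers, $y_1$ and $y_2$, uniformly on $(0, 1)$. Define $R = \sqrt{-2\ln y_1}$ and $\theta = 2\pi y_2$. Then $x_1 = R\cos\theta$ and $x_2 = R\sin\theta$ are distributed with density

$$w(x_1, x_2) = \frac{1}{2\pi}\, e^{-x_1^2/2}\, e^{-x_2^2/2}$$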

since

$$dy_1\, dy_2 = \left|\begin{matrix} \dfrac{\partial y_1}{\partial x_1} & \dfrac{\partial y_1}{\partial x_2} \\[2mm] \dfrac{\partial y_2}{\partial x_1} & \dfrac{\partial y_2}{\partial x_2} \end{matrix}\right| dx_1\, dx_2 = w(x_1, x_2)\, dx_1\, dx_2.$$

### 2.3 Markov Chain Monte Carlo

#### Ensemble averages

Generally, we cannot find a simple way of generating the $\{\mathbf{x}_i\}$ according to a known $w(\mathbf{x}_i)$ for high-dimensional systems by transforming from a uniform distribution.

- Analytical integrals for the coordinate transformation may be invertible only for separable coordinates.
- Many degrees of freedom are coupled, so that the joint probability is complicated and is not the product of single probabilities.

Typical integrals are ensemble averages of the form

$$\langle A \rangle = \frac{1}{Z}\int_V d\mathbf{r}^{(N)}\, e^{-\beta U(\mathbf{r}^{(N)})}\, A(\mathbf{r}^{(N)}) = \frac{\int_V d\mathbf{r}^{(N)}\, e^{-\beta U(\mathbf{r}^{(N)})}\, A(\mathbf{r}^{(N)})}{\int_V d\mathbf{r}^{(N)}\, e^{-\beta U(\mathbf{r}^{(N)})}}.$$

Typical potentials are complicated functions of the configuration $\mathbf{r}^{(N)}$. A good importance sampling weight would set $w(\mathbf{r}^{(N)}) = e^{-\beta U(\mathbf{r}^{(N)})}$.

How do we sample configurations from a general, multi-dimensional weight?

Goal: devise a method to generate a sequence of configurations $\{\mathbf{r}_1^{(N)}, \dots, \mathbf{r}_n^{(N)}\}$ in which the probability of finding a configuration $\mathbf{r}_i^{(N)}$ in the sequence is given by $w(\mathbf{r}_i^{(N)})\, d\mathbf{r}_i^{(N)}$. We will do so using a stochastic procedure based on a random walk.

#### Markov Chains

We will represent a general, multi-dimensional configurational coordinate that identifies the state by the vector $\mathbf{X}$. We wish to generate a sequence of configurations $\mathbf{X}_1, \dots, \mathbf{X}_n$ where each $\mathbf{X}_t$ is chosen with probability density $P_t(\mathbf{X}_t)$. To generate the sequence, we use a discrete-time, time-homogeneous, Markovian random walk.

Consider the configuration $\mathbf{X}_1$ at an initial time labelled as 1, treated as a random variable drawn from some known initial density $P_1(\mathbf{X})$.

We assume the stochastic dynamics of the random variable is determined by a time-independent transition function $K(\mathbf{Y} \to \mathbf{X})$, where the transition function defines the probability density of going from state $\mathbf{Y}$ to state $\mathbf{X}$ in a time step. Since $K$ is a probability density, it must satisfy

$$\int d\mathbf{X}\, K(\mathbf{Y} \to \mathbf{X}) = 1, \qquad K(\mathbf{Y} \to \mathbf{X}) \geq 0.$$

The state $\mathbf{X}_t$ at time $t$ is assumed to be obtained from the state $\mathbf{X}_{t-1}$ at time $t-1$ by a realization of the dynamics, where the probability of all transitions is determined by $K$.

At the second step of the random walk, the new state $\mathbf{X}_2$ is chosen from $P_2(\mathbf{X} | \mathbf{X}_1) = K(\mathbf{X}_1 \to \mathbf{X})$. At the third step, the state $\mathbf{X}_3$ is chosen from $P_3(\mathbf{X} | \mathbf{X}_2) = K(\mathbf{X}_2 \to \mathbf{X})$, and so on, generating the random walk sequence $\{\mathbf{X}_1, \dots, \mathbf{X}_n\}$.

If infinitely many realizations of the dynamics are carried out, we find that the distributions of the $\mathbf{X}_i$ for each of the steps are

$$\begin{aligned}
&P_1(\mathbf{X}) && \text{for } \mathbf{X}_1 \\
&P_2(\mathbf{X}) = \int d\mathbf{Y}\, K(\mathbf{Y} \to \mathbf{X})\, P_1(\mathbf{Y}) && \text{for } \mathbf{X}_2 \\
&P_3(\mathbf{X}) = \int d\mathbf{Y}\, K(\mathbf{Y} \to \mathbf{X})\, P_2(\mathbf{Y}) && \text{for } \mathbf{X}_3 \\
&\qquad\vdots \\
&P_t(\mathbf{X}) = \int d\mathbf{Y}\, K(\mathbf{Y} \to \mathbf{X})\, P_{t-1}(\mathbf{Y}) && \text{for } \mathbf{X}_t.
\end{aligned}$$

The transition function $K$ is ergodic if any state $\mathbf{X}$ can be reached from any state $\mathbf{Y}$ in a finite number of steps in a non-periodic fashion. If $K$ is ergodic, then the Perron-Frobenius theorem guarantees the existence of a unique stationary distribution $P(\mathbf{X})$ that satisfies

$$P(\mathbf{X}) = \int d\mathbf{Y}\, K(\mathbf{Y} \to \mathbf{X})\, P(\mathbf{Y}) \qquad \text{and} \qquad \lim_{t \to \infty} P_t(\mathbf{X}) = P(\mathbf{X}).$$

Implications:

1. From any initial distribution of states $P_1(\mathbf{X})$, an ergodic $K$ guarantees that the states $\mathbf{X}_t$ will be distributed according to the unique stationary distribution $P(\mathbf{X})$ for large $t$ (many steps of the random walk).
2. $P(\mathbf{X})$ is like an eigenvector of the "matrix" $K$ with eigenvalue $\lambda = 1$.
3. The goal is then to design an ergodic transition function $K$ so that the stationary or limit distribution is the Boltzmann weight $w(\mathbf{X})$.

#### Finite State Space

To make the Markov chain more concrete, consider a finite state space in which only $m$ different configurations of the system exist. The phase space can then be enumerated as $\{\mathbf{X}_1, \dots, \mathbf{X}_m\}$.

The transition function is then an $m \times m$ matrix $K$, where the element $K_{ij} \geq 0$ is the transition probability of going from state $\mathbf{X}_j$ to state $\mathbf{X}_i$. Since the matrix elements represent transition probabilities, for any fixed value of $j$,

$$\sum_{i=1}^{m} K_{ij} = 1.$$

The distribution $P_t(\mathbf{X})$ corresponds to a column vector $\mathbf{P}_t = \operatorname{col}[a_t^{(1)}, \dots, a_t^{(m)}]$, where $a_t^{(i)}$ is the probability of state $\mathbf{X}_i$ at time $t$. The distribution evolves under the random walk as $\mathbf{P}_t = K \cdot \mathbf{P}_{t-1}$.

The matrix $K$ is regular if there is an integer $t_0$ such that $K^{t_0}$ has all positive (non-zero) entries. Then $K^t$ for $t \geq t_0$ has all positive entries.

Suppose the matrix $K$ is a regular transition matrix. The following properties hold (statements following from the Frobenius-Perron theorem).

1. The multiplicity of the eigenvalue $\lambda = 1$ is one (i.e. the eigenvalue is simple).

Proof. Since $K$ is regular, there exists a transition matrix $\bar{K} = K^{t_0}$ with all positive elements. Note that if $K\mathbf{e} = \lambda\mathbf{e}$, then $\bar{K}\mathbf{e} = \lambda^{t_0}\mathbf{e}$, so that all eigenvectors of $K$ with eigenvalue $\lambda$ are also eigenvectors of $\bar{K}$ with eigenvalue $\lambda^{t_0}$. Let $\mathbf{e}_1 = \operatorname{col}[1, 1, \dots, 1]/m$. Since $\sum_i K_{ij} = 1$ for any $j$, we have $\sum_i (K \cdot K)_{ij} = 1$, and hence $\sum_i \bar{K}_{ij} = 1$ for any $j$. The transpose of $\bar{K}$ therefore satisfies $\sum_i \bar{K}^T_{ji} = 1$ for all $j$. The vector $\mathbf{e}_1$ is the right eigenvector with eigenvalue $\lambda = 1$ since $(\bar{K}^T \mathbf{e}_1)_j = \frac{1}{m}\sum_i \bar{K}^T_{ji}$, and hence $\bar{K}^T \mathbf{e}_1 = \mathbf{e}_1$. Now both $\bar{K}$ and $\bar{K}^T$ are $m \times m$ square matrices and have the same eigenvalues (since the characteristic equation is invariant to the transpose operation), so $\lambda = 1$ is an eigenvalue of both $\bar{K}$ and $\bar{K}^T$.

Now suppose $\bar{K}^T \mathbf{v} = \mathbf{v}$ and not all components $v_j$ of $\mathbf{v}$ are equal. Let $k$ be the index of the largest component, $v_k$, which we can take to be positive without loss of generality. Thus $v_k \geq v_j$ for all $j$ and $v_k > v_l$ for some $l$. It then follows that

$$v_k = \sum_j \bar{K}^T_{kj} v_j < \sum_j \bar{K}^T_{kj} v_k,$$

since all components of $\bar{K}^T$ are non-zero and positive. Because $\sum_j \bar{K}^T_{kj} = 1$, we conclude that $v_k < v_k$, a contradiction. Hence all $v_k$ must be the same, corresponding to the eigenvector $\mathbf{e}_1$. Hence $\mathbf{e}_1$ is a unique eigenvector of $\bar{K}^T$ and $\lambda = 1$ is a simple root of the characteristic equation. Hence $K$ has a single eigenvector with eigenvalue $\lambda = 1$.

2. If the eigenvalue $\lambda \neq 1$ is real, then $|\lambda| < 1$.

Proof. Suppose $\bar{K}^T \mathbf{v} = \lambda^{t_0} \mathbf{v}$, with $\lambda^{t_0} \neq 1$. It then follows that there is an index $k$ of the vector $\mathbf{v}$ such that $v_k \geq v_j$ for all $j$ and $v_k > v_l$ for some $l$. Now

$$\lambda^{t_0} v_k = \sum_j \bar{K}^T_{kj} v_j < \sum_j \bar{K}^T_{kj} v_k = v_k,$$

or $\lambda^{t_0} v_k < v_k$. Thus $\lambda^{t_0} < 1$, and hence $\lambda < 1$, since $\lambda$ is real. Similarly, following the same lines of argument and using the fact that $v_k < v_l$ for some $l$, we can establish that $\lambda > -1$. Hence $|\lambda| < 1$. A similar sort of argument can be used if $\lambda$ is complex.

3. Under the dynamics of the random walk $\mathbf{P}_t = K\mathbf{P}_{t-1}$, $\lim_{t \to \infty} \mathbf{P}_t = \mathbf{P}_1$, where $\mathbf{P}_1$ is the right eigenvector of $K$ with simple eigenvalue $\lambda = 1$.

Proof. If the matrix $K$ is diagonalizable, the eigenvectors of $K$ form a complete basis (since it can be put in Jordan canonical form). Thus, any initial distribution $\mathbf{P}_s$ can be expanded in terms of the complete, linearly independent basis $\{\mathbf{P}_1, \mathbf{P}_2, \dots, \mathbf{P}_m\}$ of eigenvectors as

$$\mathbf{P}_s = b_1 \mathbf{P}_1 + \cdots + b_m \mathbf{P}_m,$$

where $K\mathbf{P}_i = \lambda_i \mathbf{P}_i$ with $\lambda_1 = 1$ and $|\lambda_i| < 1$ for $i \geq 2$. Now

$$\mathbf{P}_t = b_1 \mathbf{P}_1 + b_2 \lambda_2^t \mathbf{P}_2 + \cdots + b_m \lambda_m^t \mathbf{P}_m,$$

but $\lim_{t \to \infty} \lambda_i^t = 0$ exponentially fast for $i \geq 2$. Thus $\lim_{t \to \infty} K^t \mathbf{P}_s = b_1 \mathbf{P}_1$. Note that if $\sum_i (\mathbf{P}_s)_i = 1$, then $\sum_i (\mathbf{P}_t)_i = 1$, as $K$ maintains the norm. This implies that $b_1 = \left(\sum_i (\mathbf{P}_1)_i\right)^{-1}$.

It turns out that the condition of detailed balance, which we will define to mean $K_{ij} (\mathbf{P}_1)_j = K_{ji} (\mathbf{P}_1)_i$, allows one to define a symmetric transition matrix, which is necessarily diagonalizable.

More generally, any square matrix is similar to a matrix of Jordan form, with isolated blocks whose dimension is the multiplicity of the eigenvalue. Thus the matrix $K$ can be written as $K = P \tilde{K} P^{-1}$, where the columns of $P$ are the (generalized) eigenvectors of $K$ and $\tilde{K}$ is of the form

$$\tilde{K} = \begin{pmatrix} 1 & & & 0 \\ & J_1 & & \\ & & \ddots & \\ 0 & & & J_I \end{pmatrix},$$

where $I + 1$ is the number of independent eigenvectors. For each eigenvector with corresponding eigenvalue $\lambda_i$, $J_i$ is of the form

$$J_i = \begin{pmatrix} \lambda_i & 1 & & 0 \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_i \end{pmatrix}.$$

The dimension of the square matrix $J_i$ depends on the original matrix $K$. For any given eigenvalue $\lambda$, the sum of the dimensions of the $J_i$ for which $\lambda = \lambda_i$ is equal to the algebraic multiplicity of the eigenvalue. It is easily shown that the $n$th power of a block $J_i$ has elements bounded by $\binom{n}{k} \lambda_i^{n-k}$ for a block of size $k$, which goes to zero exponentially as $n$ goes to infinity. Hence the matrix $\tilde{K}^n$ goes to

$$\tilde{K}^{\infty} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & & \\ \vdots & & \ddots & \\ 0 & & & 0 \end{pmatrix}.$$

Thus it follows that as $n$ goes to infinity

$$(K^n)_{\alpha\beta} = \left(P \tilde{K}^n P^{-1}\right)_{\alpha\beta} \to P_{\alpha 1} P^{-1}_{1\beta} = (\mathbf{P}_1)_\alpha,$$

where $\mathbf{P}_1$ is the stationary distribution, since $P^{-1}_{1\beta}$ is the $\beta$ component of the left eigenvector of $K$ with eigenvalue 1, which is the constant vector.

Consider the explicit case $m = 3$, with a transition matrix $K$ for which $(K \cdot K)_{ij} > 0$ for all $i$ and $j$, so $K$ is regular. Note also that $\sum_i K_{ij} = 1$ for all $j$. The eigenvalues and eigenvectors of $K$ are

$$\begin{aligned}
\lambda = 1 &\;\Longrightarrow\; \mathbf{P}_1 = (2/7,\; 2/7,\; 3/7) \\
\lambda = -\tfrac{1}{6} &\;\Longrightarrow\; \mathbf{P}_2 = (-3/14,\; -3/14,\; 6/14) \\
\lambda = \tfrac{1}{2} &\;\Longrightarrow\; \mathbf{P}_3 = (-3/7,\; 3/7,\; 0).
\end{aligned}$$
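As a numerical check of this $m = 3$ example, the sketch below uses a column-stochastic matrix that reproduces the quoted eigenvalues and eigenvectors. The entries of `K` are a reconstruction from those eigenpairs and should be read as an assumption, not as the matrix printed in the original notes; the helper `apply` and the other variable names are hypothetical.

```python
from fractions import Fraction as F

# Assumed reconstruction: a column-stochastic matrix consistent with the
# eigenvalues (1, -1/6, 1/2) and the eigenvectors quoted in the text.
K = [[F(1, 2), F(0),    F(1, 3)],
     [F(0),    F(1, 2), F(1, 3)],
     [F(1, 2), F(1, 2), F(1, 3)]]

def apply(K, p):
    """One step of the random walk, P_t = K P_{t-1}."""
    return [sum(K[i][j] * p[j] for j in range(len(p))) for i in range(len(K))]

P1 = [F(2, 7), F(2, 7), F(3, 7)]           # stationary distribution, lambda = 1
P2 = [F(-3, 14), F(-3, 14), F(6, 14)]      # lambda = -1/6
P3 = [F(-3, 7), F(3, 7), F(0)]             # lambda = 1/2

# Each column of K sums to one, so total probability is conserved.
column_sums = [sum(K[i][j] for i in range(3)) for j in range(3)]

# Start in the first configuration (time label 1) and take nine steps,
# so that history[t - 1] holds the distribution at time t.
history = [[F(1), F(0), F(0)]]
for _ in range(9):
    history.append(apply(K, history[-1]))

max_dev = max(abs(float(a - b)) for a, b in zip(history[9], P1))
print([float(a) for a in history[9]], max_dev)
```

Exact rational arithmetic is used so that the eigenvector relations can be checked without rounding error; the slowest decay is set by the eigenvalue $1/2$, which is why the deviation from $\mathbf{P}_1$ shrinks by about a factor of two per step.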

If initially $\mathbf{P}_s = (1, 0, 0)$, then $\mathbf{P}_2 = (1/2, 0, 1/2)$, and $\mathbf{P}_{10}$ differs from $\mathbf{P}_1$ by 0.1%.

#### Construction of the transition matrix K(y → x)

We wish to devise a procedure so that the limiting (stationary) distribution of the random walk is the Boltzmann distribution $P_{eq}(x)$.

Break up the transition matrix into two parts, the generation of trial states $T(y \to x)$ and the acceptance probability of the trial state $A(y \to x)$:

$$K(y \to x) = T(y \to x)\, A(y \to x).$$

We will consider the definition of the acceptance probability given a specified procedure for generating trial states with conditional probability $T(y \to x)$.

Write the dynamics in terms of probabilities at time $t$:

$$\begin{aligned}
\text{Probability of starting in } y \text{ and ending in } x &= \int dy\, P_t(y)\, K(y \to x) \\
\text{Probability of starting in } x \text{ and remaining in } x &= P_t(x)\left(1 - \int dy\, K(x \to y)\right)
\end{aligned}$$

so

$$\begin{aligned}
P_{t+1}(x) &= \int dy\, P_t(y)\, K(y \to x) + P_t(x)\left(1 - \int dy\, K(x \to y)\right) \\
&= P_t(x) + \int dy\, \Big(P_t(y) K(y \to x) - P_t(x) K(x \to y)\Big).
\end{aligned}$$

Thus, to get $P_{t+1}(x) = P_t(x)$, we want

$$\int dy\, \Big(P_t(y) K(y \to x) - P_t(x) K(x \to y)\Big) = 0.$$

This can be accomplished by requiring microscopic reversibility or detailed balance:

$$\begin{aligned}
P_t(y)\, K(y \to x) &= P_t(x)\, K(x \to y) \\
P_t(y)\, T(y \to x)\, A(y \to x) &= P_t(x)\, T(x \to y)\, A(x \to y).
\end{aligned}$$

We desire $P_t(x) = P_{t+1}(x) = P_{eq}(x)$. This places a restriction on the acceptance probabilities:

$$\frac{A(y \to x)}{A(x \to y)} = \frac{P_{eq}(x)\, T(x \to y)}{P_{eq}(y)\, T(y \to x)}.$$

Note that $P_{eq}(x) K(x \to y)$ is the probability of observing the sequence $\{x, y\}$ in the random walk. This must equal the probability of the sequence $\{y, x\}$ in the walk.

Metropolis solution: note that if $P_{eq}(x) T(x \to y) > 0$ and $P_{eq}(y) T(y \to x) > 0$, then we can define

$$\begin{aligned}
A(y \to x) &= \min\left(1, \frac{P_{eq}(x)\, T(x \to y)}{P_{eq}(y)\, T(y \to x)}\right) \\
A(x \to y) &= \min\left(1, \frac{P_{eq}(y)\, T(y \to x)}{P_{eq}(x)\, T(x \to y)}\right).
\end{aligned}$$

This definition satisfies detailed balance.

Proof. Verifying explicitly, we see that

$$\begin{aligned}
P_{eq}(y)\, T(y \to x)\, A(y \to x) &= \min\big(P_{eq}(y) T(y \to x),\; P_{eq}(x) T(x \to y)\big) \\
P_{eq}(x)\, T(x \to y)\, A(x \to y) &= \min\big(P_{eq}(x) T(x \to y),\; P_{eq}(y) T(y \to x)\big) \\
&= P_{eq}(y)\, T(y \to x)\, A(y \to x)
\end{aligned}$$

as required.

The procedure is guaranteed to generate states with probability proportional to the Boltzmann distribution if the proposal probability $T(y \to x)$ generates an ergodic transition probability $K$.

#### Simple example

Suppose we want to evaluate the equilibrium average $\langle A \rangle$ at inverse temperature $\beta$ of some dynamical variable $A(x)$ for a 1-dimensional harmonic oscillator system with potential energy $U(x) = kx^2/2$:

$$\langle A(x) \rangle = \int_{-\infty}^{\infty} dx\, P_{eq}(x)\, A(x), \qquad P_{eq}(x) = \frac{1}{Z}\, e^{-\beta k x^2/2}.$$

Monte-Carlo procedure:

1. Suppose the current state of the system is $x_0$. We define the proposal probability of a configuration $y$ to be

   $$T(x_0 \to y) = \begin{cases} \dfrac{1}{2\Delta x} & \text{if } y \in [x_0 - \Delta x,\, x_0 + \Delta x] \\ 0 & \text{otherwise.} \end{cases}$$

   $\Delta x$ is fixed to some value representing the maximum displacement of the trial coordinate, and $y$ is chosen uniformly around the current value $x_0$. Note that if $y$ is selected, then $T(x_0 \to y) = 1/(2\Delta x) = T(y \to x_0)$, since $x_0$ and $y$ are within a distance $\Delta x$ of each other.

2. Accept the trial $y$ with probability

   $$A(x_0 \to y) = \min\left(1, \frac{T(y \to x_0)\, P_{eq}(y)}{T(x_0 \to y)\, P_{eq}(x_0)}\right) = \min\left(1, e^{-\beta \Delta U}\right),$$

   where $\Delta U = U(y) - U(x_0)$. Note that if $\Delta U \leq 0$, so that the proposal has lower potential energy than the current state, $A(x_0 \to y) = 1$ and the trial $y$ is always accepted as the next state $x_1 = y$. If $\Delta U > 0$, then $A(x_0 \to y) = e^{-\beta \Delta U} = q$. We must accept the trial $y$ with probability $0 < q < 1$. This can be accomplished by picking a random number $r$ uniformly on $(0, 1)$ and then:

   (a) If $r \leq q$, accept the configuration and set $x_1 = y$.

   (b) If $r > q$, reject the configuration and set $x_1 = x_0$ (keep the state as is).

3. Repeat steps 1 and 2, and record the states $\{x_i\}$.

This generates a Markov chain of states (the sequence) $\{x_i\}$, with each $x_i$ appearing in the sequence with probability $P_{eq}(x_i)$ (after some number of equilibration steps).

After collecting $N$ total configurations, the equilibrium average $\langle A \rangle$ can be estimated from

$$\langle A \rangle = \frac{1}{N}\sum_{i=1}^{N} A(x_i),$$

since the importance function is $w(x) = P_{eq}(x)$.

Rejected states are important for generating states with the correct weight.

Typically, one should neglect the first $N_s$ points of the Markov chain, since it takes some iterations of the procedure to generate states with the stationary distribution of $K(y \to x)$.
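The three steps above can be sketched as follows for the harmonic oscillator. This is a minimal illustration rather than a reference implementation: the function name `metropolis_harmonic`, the parameter values ($\beta = k = 1$, $\Delta x = 1.5$, 1000 discarded initial steps), the seed and the choice $A(x) = x^2$ are all assumptions made for the example. For this observable the exact average is $\langle x^2 \rangle = 1/(\beta k)$.

```python
import math
import random

def metropolis_harmonic(n_steps, beta=1.0, k=1.0, dx=1.5, n_equil=1000, seed=2024):
    """Sample P_eq(x) proportional to exp(-beta*k*x^2/2) by Metropolis moves."""
    rng = random.Random(seed)
    x = 0.0                      # arbitrary starting state x_0
    samples = []
    n_accepted = 0
    for step in range(n_equil + n_steps):
        # Step 1: trial state uniform in [x - dx, x + dx] (symmetric proposal).
        y = x + dx * (2.0 * rng.random() - 1.0)
        delta_u = 0.5 * k * (y * y - x * x)
        # Step 2: accept with probability min(1, exp(-beta * delta_u)).
        if delta_u <= 0.0 or rng.random() <= math.exp(-beta * delta_u):
            x = y
            n_accepted += 1
        # Step 3: record the state; a rejected move records the old x again.
        if step >= n_equil:
            samples.append(x)
    return samples, n_accepted / (n_equil + n_steps)

samples, acceptance = metropolis_harmonic(200_000)
mean_x2 = sum(x * x for x in samples) / len(samples)   # exact: 1/(beta*k)
print(acceptance, mean_x2)
```

Note that the state is appended on every step, whether or not the trial was accepted; dropping the repeated entries would bias the sampled distribution away from $P_{eq}$.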

The discarded initial segment of the chain is called the equilibration or "burn-in" time.

There is often correlation among adjacent states. One can record states only every $N_{corr}$ Monte-Carlo steps (iterations).

Why must rejected states be counted? Recall that $K(x \to y)$ must be normalized to be a probability (or probability density):

$$\int dy\, K(x \to y) = 1 = \int dy\, \big[\delta(x - y) + (1 - \delta(x - y))\big]\, K(x \to y),$$

hence the probability to remain in the current state is

$$K(x \to x) = 1 - \int dy\, (1 - \delta(x - y))\, K(x \to y).$$

$K(x \to x)$ is non-zero to ensure normalization, so we must see the sequence $\{\dots, x, x, \dots\}$ in the Markov chain.

### 2.4 Statistical Uncertainties

We would like some measure of the reliability of averages computed from a finite set of sampled data. Consider the average $\bar{x}$ of a quantity $x$ constructed from a finite set of $N$ measurements $\{x_1, \dots, x_N\}$, where the $x_i$ are random variables drawn from a density $\rho(x)$:

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i.$$

Suppose the random variables $x_i$ are drawn independently, so that the joint probability satisfies

$$\rho(x_1, \dots, x_N) = \rho(x_1)\,\rho(x_2) \cdots \rho(x_N).$$

Suppose that the intrinsic mean $\langle x \rangle$ of the random variable is $\langle x \rangle = \int dx\, x\, \rho(x)$ and that the intrinsic variance is $\sigma^2$, where

$$\sigma^2 = \int dx\, \big(x - \langle x \rangle\big)^2 \rho(x).$$

A measure of reliability is the standard deviation of $\bar{x}$, also known as the standard error $\sigma_E = \sqrt{\sigma_E^2}$, where $\sigma_E^2$ is the variance of the finite average $\bar{x}$ around $\langle x \rangle$ in the finite set $\{x_1, \dots, x_N\}$:

$$\sigma_E^2 = \left\langle \big(\bar{x} - \langle x \rangle\big)^2 \right\rangle.$$

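We expect:

1. The variance (error) decreases with increasing $N$.
2. The error depends on the intrinsic variance $\sigma^2$ of the density $\rho$. A larger variance should mean slower convergence of $\bar{x}$ to $\langle x \rangle$.

Suppose all the intrinsic moments $\langle x^n \rangle$ of $\rho(x)$ exist. Define the dimensionless variable

$$z = \frac{\bar{x} - \langle x \rangle}{\sigma_E},$$

and note that the probability density of $z$ can be written as

$$P(\tilde{z}) = \langle \delta(z - \tilde{z}) \rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\, \left\langle e^{it(z - \tilde{z})} \right\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\, e^{-it\tilde{z}} \left\langle e^{itz} \right\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\, e^{-it\tilde{z}}\, \chi_N(t), \qquad (2.1)$$

where $\chi_N(t)$ is the characteristic function of $P(z)$. Using the fact that $\sigma_E = \sigma/\sqrt{N}$, as will be shown shortly, $z$ can be expressed as

$$z = \sqrt{N}\, \frac{\frac{1}{N}\sum_{i=1}^{N} x_i - \langle x \rangle}{\sigma} = \frac{1}{\sqrt{N}}\sum_{i=1}^{N} \left(\frac{x_i - \langle x \rangle}{\sigma}\right),$$

and hence

$$\chi_N(t) = \left\langle e^{\frac{it}{\sqrt{N}}\sum_{i=1}^{N}\left(\frac{x_i - \langle x \rangle}{\sigma}\right)} \right\rangle = \left\langle e^{\frac{it}{\sqrt{N}}\left(\frac{x - \langle x \rangle}{\sigma}\right)} \right\rangle^{N} = \chi\big(t/\sqrt{N}\big)^N,$$

where

$$\chi(u) = \left\langle e^{iu(x - \langle x \rangle)/\sigma} \right\rangle,$$

which, for small arguments $u$, can be expanded as $\chi(u) = 1 - u^2/2 - i u^3 \kappa_3/(6\sigma^3) + \dots$, where $\kappa_3$ is the third cumulant of the density $\rho(x)$. Using this expansion, we find $\ln \chi_N(t) = -t^2/2 + O(N^{-1/2})$, and hence $\chi_N(t) = e^{-t^2/2}$ for large $N$. Inserting this form in Eq. (2.1) gives

$$P(\tilde{z}) = \frac{1}{\sqrt{2\pi}}\, e^{-\tilde{z}^2/2},$$

and hence we find the central limit theorem result that the averages $\bar{x}$ are distributed around the intrinsic mean $\langle x \rangle$ according to the normal distribution

$$N(\bar{x}; \langle x \rangle, \sigma_E^2) = \frac{1}{\sqrt{2\pi\sigma_E^2}}\, e^{-\frac{(\bar{x} - \langle x \rangle)^2}{2\sigma_E^2}},$$

provided the number of samples $N$ is large and all intrinsic moments (and hence cumulants) of $\rho(x)$ are finite.

If the central limit theorem holds, so that the finite averages are normally distributed as above, then the standard error can be used to define confidence intervals, since the probability $p$ that the deviations in the average, $\bar{x} - \langle x \rangle$, lie within a factor of $c$ times the standard error is given by

$$\int_{-c\sigma_E}^{c\sigma_E} d(\bar{x} - \langle x \rangle)\, N(\bar{x}; \langle x \rangle, \sigma_E^2) = p.$$

If $p = 0.95$, defining the 95% confidence interval, then we find that $c = 1.96$, and hence the probability that the intrinsic mean $\langle x \rangle$ lies in the interval $(\bar{x} - 1.96\sigma_E,\; \bar{x} + 1.96\sigma_E)$ is

$$P\big(\bar{x} - 1.96\sigma_E \leq \langle x \rangle \leq \bar{x} + 1.96\sigma_E\big) = 0.95.$$

If the data are not normally distributed, then higher cumulants of the probability density may be needed (skewness, kurtosis, ...). The confidence intervals are defined in terms of integrals of the probability density.

How do we calculate $\sigma_E$ or $\sigma_E^2$? We defined the variance as

$$\sigma_E^2 = \left\langle \big(\bar{x} - \langle x \rangle\big)^2 \right\rangle, \qquad \text{where } \bar{x} = \frac{1}{N}\sum_i x_i.$$

Expanding the square, we get $\sigma_E^2 = \langle \bar{x}^2 \rangle - \langle x \rangle^2$, but

$$\langle \bar{x}^2 \rangle = \frac{1}{N^2}\sum_{i,j} \langle x_i x_j \rangle = \frac{1}{N^2}\left(\sum_i \langle x_i^2 \rangle + \sum_i \sum_{j \neq i} \langle x_i \rangle \langle x_j \rangle\right) = \frac{1}{N^2}\Big(N(\sigma^2 + \langle x \rangle^2) + N(N-1)\langle x \rangle^2\Big) = \frac{\sigma^2}{N} + \langle x \rangle^2,$$

hence $\sigma_E^2 = \sigma^2/N$.

However, we do not really know what $\sigma^2$ is, so we must estimate this quantity. One way to do this is to define the estimator of the variance $\hat{\sigma}^2$:

$$\hat{\sigma}^2 = \frac{1}{N}\sum_i (x_i - \bar{x})^2.$$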

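The average of this estimator is

$$\langle \hat{\sigma}^2 \rangle = \frac{1}{N}\left\langle \sum_i (x_i - \bar{x})^2 \right\rangle = \frac{1}{N}\sum_i \left\langle x_i^2 - 2x_i\bar{x} + \bar{x}^2 \right\rangle,$$

but using the facts that

$$\langle x_i \bar{x} \rangle = \frac{\langle x^2 \rangle}{N} + \frac{N-1}{N}\langle x \rangle^2, \qquad \langle \bar{x}^2 \rangle = \frac{\sigma^2}{N} + \langle x \rangle^2,$$

we get $\langle \hat{\sigma}^2 \rangle = (N-1)\sigma^2/N$ and hence $\sigma^2 = N\langle \hat{\sigma}^2 \rangle/(N-1)$.

The standard error can therefore be estimated by

$$\sigma_E = \sqrt{\frac{\hat{\sigma}^2}{N-1}}.$$

Note that $\sigma_E$ decreases as $1/\sqrt{N-1}$:

- Narrowing confidence intervals by a factor of 2 requires roughly 4 times more data.
- If the $x_i$ are not independent, then $N \to N_{\text{eff}}$, where $N_{\text{eff}}$ is the effective number of independent configurations. If $\tau_c$ is the correlation length (the number of Monte-Carlo steps over which the correlation $\langle x_i x_j \rangle$ differs from $\langle x \rangle^2$), then $N_{\text{eff}} = N/\tau_c$.

Another way to estimate the intrinsic variance $\sigma^2$ is to use the Jackknife Method: see B. Efron, The Annals of Statistics 7, 1 (1979).

The jackknife procedure is to form $N$ averages from the set $\{x_1, \dots, x_N\}$ by omitting one data point. For example, the $j$th average is defined to be

$$\bar{x}^{(j)} = \frac{1}{N-1}\sum_{i \neq j}^{N} x_i.$$

The result is a set of averages $\{\bar{x}^{(1)}, \bar{x}^{(2)}, \dots, \bar{x}^{(N)}\}$. The average over this jackknifed set is the same as before, since

$$\frac{1}{N}\sum_{j=1}^{N} \bar{x}^{(j)} = \frac{1}{N(N-1)}\sum_{j=1}^{N}\sum_{i \neq j} x_i = \frac{1}{N}\sum_{i=1}^{N} x_i = \bar{x}.$$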

We define an estimator of the variance $\Sigma^2$ of the jackknifed set to be

$$\Sigma^2 = \frac{1}{N}\sum_{j=1}^{N} \left(\bar{x}^{(j)} - \bar{x}\right)^2,$$

which converges to the variance of the jackknifed averages as $N$ gets large.

Inserting the definitions of $\bar{x}^{(j)}$ and $\bar{x}$ and using the independence of the $x_i$, we see that the estimator is related to the intrinsic variance by

$$\Sigma^2 = \left(\frac{1}{N-1} - \frac{1}{N}\right)\sigma^2 = \frac{\sigma^2}{N(N-1)} \quad\Longrightarrow\quad \sigma^2 = N(N-1)\,\Sigma^2.$$

The standard error can therefore be estimated using the jackknifed set as

$$\Sigma_E = \sqrt{\frac{N-1}{N}\sum_{j=1}^{N} \left(\bar{x}^{(j)} - \bar{x}\right)^2}.$$
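The two standard-error estimates of this section can be compared numerically with a short sketch. The function names, the Gaussian test data (mean 0, standard deviation 2, $N = 5000$) and the seed are assumptions for illustration only. For a plain average the jackknife result should coincide with $\sqrt{\hat{\sigma}^2/(N-1)}$, because $\bar{x}^{(j)} - \bar{x} = (\bar{x} - x_j)/(N-1)$.

```python
import math
import random

def standard_error(data):
    """Direct estimate: sqrt(sigma_hat^2 / (N - 1))."""
    n = len(data)
    mean = sum(data) / n
    var_hat = sum((x - mean) ** 2 for x in data) / n     # estimator sigma_hat^2
    return math.sqrt(var_hat / (n - 1))

def jackknife_error(data):
    """Jackknife estimate: sqrt((N - 1)/N * sum_j (xbar_j - xbar)^2)."""
    n = len(data)
    total = sum(data)
    mean = total / n
    leave_one_out = [(total - x) / (n - 1) for x in data]   # averages xbar^(j)
    spread = sum((xj - mean) ** 2 for xj in leave_one_out)
    return math.sqrt((n - 1) / n * spread)

rng = random.Random(7)
data = [rng.gauss(0.0, 2.0) for _ in range(5000)]   # independent samples
se_direct = standard_error(data)
se_jack = jackknife_error(data)
print(se_direct, se_jack, 2.0 / math.sqrt(5000))
```

The jackknife becomes more useful than the direct formula when the quantity of interest is a nonlinear function of averages, since the leave-one-out averages can then be passed through that function before the spread is computed.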


### Physics 116C Solutions to the Practice Final Exam Fall The method of Frobenius fails because it works only for Fuchsian equations, of the type

Physics 6C Solutions to the Practice Final Exam Fall 2 Consider the differential equation x y +x(x+)y y = () (a) Explain why the method of Frobenius method fails for this problem. The method of Frobenius

### Week 9 Generators, duality, change of measure

Week 9 Generators, duality, change of measure Jonathan Goodman November 18, 013 1 Generators This section describes a common abstract way to describe many of the differential equations related to Markov

### 22m:033 Notes: 7.1 Diagonalization of Symmetric Matrices

m:33 Notes: 7. Diagonalization of Symmetric Matrices Dennis Roseman University of Iowa Iowa City, IA http://www.math.uiowa.edu/ roseman May 3, Symmetric matrices Definition. A symmetric matrix is a matrix

### An introduction to Birkhoff normal form

An introduction to Birkhoff normal form Dario Bambusi Dipartimento di Matematica, Universitá di Milano via Saldini 50, 0133 Milano (Italy) 19.11.14 1 Introduction The aim of this note is to present an

### NATIONAL BOARD FOR HIGHER MATHEMATICS. M. A. and M.Sc. Scholarship Test. September 25, Time Allowed: 150 Minutes Maximum Marks: 30

NATIONAL BOARD FOR HIGHER MATHEMATICS M. A. and M.Sc. Scholarship Test September 25, 2010 Time Allowed: 150 Minutes Maximum Marks: 30 Please read, carefully, the instructions on the following page 1 INSTRUCTIONS

### [y i α βx i ] 2 (2) Q = i=1

Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

### w T 1 w T 2. w T n 0 if i j 1 if i = j

Lyapunov Operator Let A F n n be given, and define a linear operator L A : C n n C n n as L A (X) := A X + XA Suppose A is diagonalizable (what follows can be generalized even if this is not possible -

### Monte Carlo Simulations

Monte Carlo Simulations What are Monte Carlo Simulations and why ones them? Pseudo Random Number generators Creating a realization of a general PDF The Bootstrap approach A real life example: LOFAR simulations

### Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University

Chapter 3, 4 Random Variables ENCS6161 - Probability and Stochastic Processes Concordia University ENCS6161 p.1/47 The Notion of a Random Variable A random variable X is a function that assigns a real

### Markov Chains Handout for Stat 110

Markov Chains Handout for Stat 0 Prof. Joe Blitzstein (Harvard Statistics Department) Introduction Markov chains were first introduced in 906 by Andrey Markov, with the goal of showing that the Law of

### Jordan normal form notes (version date: 11/21/07)

Jordan normal form notes (version date: /2/7) If A has an eigenbasis {u,, u n }, ie a basis made up of eigenvectors, so that Au j = λ j u j, then A is diagonal with respect to that basis To see this, let

### Lecture 2: September 8

CS294 Markov Chain Monte Carlo: Foundations & Applications Fall 2009 Lecture 2: September 8 Lecturer: Prof. Alistair Sinclair Scribes: Anand Bhaskar and Anindya De Disclaimer: These notes have not been

### SET-A U/2015/16/II/A

. n x dx 3 n n is equal to : 0 4. Let f n (x) = x n n, for each n. Then the series (A) 3 n n (B) n 3 n (n ) (C) n n (n ) (D) n 3 n (n ) f n is n (A) Uniformly convergent on [0, ] (B) Po intwise convergent

### Monte Carlo Integration. Computer Graphics CMU /15-662, Fall 2016

Monte Carlo Integration Computer Graphics CMU 15-462/15-662, Fall 2016 Talk Announcement Jovan Popovic, Senior Principal Scientist at Adobe Research will be giving a seminar on Character Animator -- Monday

### 1.3 Convergence of Regular Markov Chains

Markov Chains and Random Walks on Graphs 3 Applying the same argument to A T, which has the same λ 0 as A, yields the row sum bounds Corollary 0 Let P 0 be the transition matrix of a regular Markov chain

### Diagonalization by a unitary similarity transformation

Physics 116A Winter 2011 Diagonalization by a unitary similarity transformation In these notes, we will always assume that the vector space V is a complex n-dimensional space 1 Introduction A semi-simple

### Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation

### av 1 x 2 + 4y 2 + xy + 4z 2 = 16.

74 85 Eigenanalysis The subject of eigenanalysis seeks to find a coordinate system, in which the solution to an applied problem has a simple expression Therefore, eigenanalysis might be called the method

### 4 Uniform convergence

4 Uniform convergence In the last few sections we have seen several functions which have been defined via series or integrals. We now want to develop tools that will allow us to show that these functions

### An Introduction to Entropy and Subshifts of. Finite Type

An Introduction to Entropy and Subshifts of Finite Type Abby Pekoske Department of Mathematics Oregon State University pekoskea@math.oregonstate.edu August 4, 2015 Abstract This work gives an overview

### From Random Numbers to Monte Carlo. Random Numbers, Random Walks, Diffusion, Monte Carlo integration, and all that

From Random Numbers to Monte Carlo Random Numbers, Random Walks, Diffusion, Monte Carlo integration, and all that Random Walk Through Life Random Walk Through Life If you flip the coin 5 times you will

### GRE Math Subject Test #5 Solutions.

GRE Math Subject Test #5 Solutions. 1. E (Calculus) Apply L Hôpital s Rule two times: cos(3x) 1 3 sin(3x) 9 cos(3x) lim x 0 = lim x 2 x 0 = lim 2x x 0 = 9. 2 2 2. C (Geometry) Note that a line segment

### Sudoku and Matrices. Merciadri Luca. June 28, 2011

Sudoku and Matrices Merciadri Luca June 28, 2 Outline Introduction 2 onventions 3 Determinant 4 Erroneous Sudoku 5 Eigenvalues Example 6 Transpose Determinant Trace 7 Antisymmetricity 8 Non-Normality 9

### Model Counting for Logical Theories

Model Counting for Logical Theories Wednesday Dmitry Chistikov Rayna Dimitrova Department of Computer Science University of Oxford, UK Max Planck Institute for Software Systems (MPI-SWS) Kaiserslautern

+ Advanced Sampling Algorithms + Mobashir Mohammad Hirak Sarkar Parvathy Sudhir Yamilet Serrano Llerena Advanced Sampling Algorithms Aditya Kulkarni Tobias Bertelsen Nirandika Wanigasekara Malay Singh

### Metropolis/Variational Monte Carlo. Microscopic Theories of Nuclear Structure, Dynamics and Electroweak Currents June 12-30, 2017, ECT*, Trento, Italy

Metropolis/Variational Monte Carlo Stefano Gandolfi Los Alamos National Laboratory (LANL) Microscopic Theories of Nuclear Structure, Dynamics and Electroweak Currents June 12-30, 2017, ECT*, Trento, Italy

### Ph.D. Katarína Bellová Page 1 Mathematics 2 (10-PHY-BIPMA2) EXAM - Solutions, 20 July 2017, 10:00 12:00 All answers to be justified.

PhD Katarína Bellová Page 1 Mathematics 2 (10-PHY-BIPMA2 EXAM - Solutions, 20 July 2017, 10:00 12:00 All answers to be justified Problem 1 [ points]: For which parameters λ R does the following system

### LS.5 Theory of Linear Systems

LS.5 Theory of Linear Systems 1. General linear ODE systems and independent solutions. We have studied the homogeneous system of ODE s with constant coefficients, (1) x = Ax, where A is an n n matrix of

### IMPORTANT DEFINITIONS AND THEOREMS REFERENCE SHEET

IMPORTANT DEFINITIONS AND THEOREMS REFERENCE SHEET This is a (not quite comprehensive) list of definitions and theorems given in Math 1553. Pay particular attention to the ones in red. Study Tip For each

### AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES JOEL A. TROPP Abstract. We present an elementary proof that the spectral radius of a matrix A may be obtained using the formula ρ(a) lim

### EXERCISE SET 5.1. = (kx + kx + k, ky + ky + k ) = (kx + kx + 1, ky + ky + 1) = ((k + )x + 1, (k + )y + 1)

EXERCISE SET 5. 6. The pair (, 2) is in the set but the pair ( )(, 2) = (, 2) is not because the first component is negative; hence Axiom 6 fails. Axiom 5 also fails. 8. Axioms, 2, 3, 6, 9, and are easily

### SOLUTIONS: ASSIGNMENT Use Gaussian elimination to find the determinant of the matrix. = det. = det = 1 ( 2) 3 6 = 36. v 4.

SOLUTIONS: ASSIGNMENT 9 66 Use Gaussian elimination to find the determinant of the matrix det 1 1 4 4 1 1 1 1 8 8 = det = det 0 7 9 0 0 0 6 = 1 ( ) 3 6 = 36 = det = det 0 0 6 1 0 0 0 6 61 Consider a 4

### Introductory Numerical Analysis

Introductory Numerical Analysis Lecture Notes December 16, 017 Contents 1 Introduction to 1 11 Floating Point Numbers 1 1 Computational Errors 13 Algorithm 3 14 Calculus Review 3 Root Finding 5 1 Bisection

### SQUARE ROOTS OF 2x2 MATRICES 1. Sam Northshield SUNY-Plattsburgh

SQUARE ROOTS OF x MATRICES Sam Northshield SUNY-Plattsburgh INTRODUCTION A B What is the square root of a matrix such as? It is not, in general, A B C D C D This is easy to see since the upper left entry

### Lagrange Multipliers

Optimization with Constraints As long as algebra and geometry have been separated, their progress have been slow and their uses limited; but when these two sciences have been united, they have lent each

### In manycomputationaleconomicapplications, one must compute thede nite n

Chapter 6 Numerical Integration In manycomputationaleconomicapplications, one must compute thede nite n integral of a real-valued function f de ned on some interval I of

### NATIONAL BOARD FOR HIGHER MATHEMATICS. M. A. and M.Sc. Scholarship Test. September 19, Time Allowed: 150 Minutes Maximum Marks: 30

NATIONAL BOARD FOR HIGHER MATHEMATICS M. A. and M.Sc. Scholarship Test September 19, 2015 Time Allowed: 150 Minutes Maximum Marks: 30 Please read, carefully, the instructions on the following page 1 INSTRUCTIONS

### Extra Problems for Math 2050 Linear Algebra I

Extra Problems for Math 5 Linear Algebra I Find the vector AB and illustrate with a picture if A = (,) and B = (,4) Find B, given A = (,4) and [ AB = A = (,4) and [ AB = 8 If possible, express x = 7 as

### Definition (T -invariant subspace) Example. Example

Eigenvalues, Eigenvectors, Similarity, and Diagonalization We now turn our attention to linear transformations of the form T : V V. To better understand the effect of T on the vector space V, we begin

### Stochastic Realization of Binary Exchangeable Processes

Stochastic Realization of Binary Exchangeable Processes Lorenzo Finesso and Cecilia Prosdocimi Abstract A discrete time stochastic process is called exchangeable if its n-dimensional distributions are,

### Markov Chains, Stochastic Processes, and Matrix Decompositions

Markov Chains, Stochastic Processes, and Matrix Decompositions 5 May 2014 Outline 1 Markov Chains Outline 1 Markov Chains 2 Introduction Perron-Frobenius Matrix Decompositions and Markov Chains Spectral

### Chapter Two: Numerical Methods for Elliptic PDEs. 1 Finite Difference Methods for Elliptic PDEs

Chapter Two: Numerical Methods for Elliptic PDEs Finite Difference Methods for Elliptic PDEs.. Finite difference scheme. We consider a simple example u := subject to Dirichlet boundary conditions ( ) u

### Methods of Data Analysis Random numbers, Monte Carlo integration, and Stochastic Simulation Algorithm (SSA / Gillespie)

Methods of Data Analysis Random numbers, Monte Carlo integration, and Stochastic Simulation Algorithm (SSA / Gillespie) Week 1 1 Motivation Random numbers (RNs) are of course only pseudo-random when generated

### 2905 Queueing Theory and Simulation PART IV: SIMULATION

2905 Queueing Theory and Simulation PART IV: SIMULATION 22 Random Numbers A fundamental step in a simulation study is the generation of random numbers, where a random number represents the value of a random

### THE MONTE CARLO METHOD

THE MONTE CARLO METHOD I. INTRODUCTION The Monte Carlo method is often referred to as a computer experiment. One might think of this as a way of conveying the fact that the output of simulations is not

### Monte Carlo (MC) Simulation Methods. Elisa Fadda

Monte Carlo (MC) Simulation Methods Elisa Fadda 1011-CH328, Molecular Modelling & Drug Design 2011 Experimental Observables A system observable is a property of the system state. The system state i is

### Chapter Two Elements of Linear Algebra

Chapter Two Elements of Linear Algebra Previously, in chapter one, we have considered single first order differential equations involving a single unknown function. In the next chapter we will begin to

### One-dimensional Schrödinger equation

Chapter 1 One-dimensional Schrödinger equation In this chapter we will start from the harmonic oscillator to introduce a general numerical methodology to solve the one-dimensional, time-independent Schrödinger

### Section 12.6: Non-homogeneous Problems

Section 12.6: Non-homogeneous Problems 1 Introduction Up to this point all the problems we have considered are we what we call homogeneous problems. This means that for an interval < x < l the problems

### Monte Carlo Methods. PHY 688: Numerical Methods for (Astro)Physics

Monte Carlo Methods Random Numbers How random is random? From Pang: we want Long period before sequence repeats Little correlation between numbers (plot ri+1 vs ri should fill the plane) Fast Typical random

### Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan

Monte-Carlo MMD-MA, Université Paris-Dauphine Xiaolu Tan tan@ceremade.dauphine.fr Septembre 2015 Contents 1 Introduction 1 1.1 The principle.................................. 1 1.2 The error analysis

### Lecture 1 Measure concentration

CSE 29: Learning Theory Fall 2006 Lecture Measure concentration Lecturer: Sanjoy Dasgupta Scribe: Nakul Verma, Aaron Arvey, and Paul Ruvolo. Concentration of measure: examples We start with some examples

### MA4001 Engineering Mathematics 1 Lecture 15 Mean Value Theorem Increasing and Decreasing Functions Higher Order Derivatives Implicit Differentiation

MA4001 Engineering Mathematics 1 Lecture 15 Mean Value Theorem Increasing and Decreasing Functions Higher Order Derivatives Implicit Differentiation Dr. Sarah Mitchell Autumn 2014 Rolle s Theorem Theorem

### Some lecture notes for Math 6050E: PDEs, Fall 2016

Some lecture notes for Math 65E: PDEs, Fall 216 Tianling Jin December 1, 216 1 Variational methods We discuss an example of the use of variational methods in obtaining existence of solutions. Theorem 1.1.

### Advanced Monte Carlo Methods - Computational Challenges

Advanced Monte Carlo Methods - Computational Challenges Ivan Dimov IICT, Bulgarian Academy of Sciences Sofia, February, 2012 Aims To cover important topics from probability theory and statistics. To present

### Lecture 7 Spectral methods

CSE 291: Unsupervised learning Spring 2008 Lecture 7 Spectral methods 7.1 Linear algebra review 7.1.1 Eigenvalues and eigenvectors Definition 1. A d d matrix M has eigenvalue λ if there is a d-dimensional

### Introduction to Bayesian methods in inverse problems

Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction

### Real Analysis Problems

Real Analysis Problems Cristian E. Gutiérrez September 14, 29 1 1 CONTINUITY 1 Continuity Problem 1.1 Let r n be the sequence of rational numbers and Prove that f(x) = 1. f is continuous on the irrationals.

### A Sudoku Submatrix Study

A Sudoku Submatrix Study Merciadri Luca LucaMerciadri@studentulgacbe Abstract In our last article ([1]), we gave some properties of Sudoku matrices We here investigate some properties of the Sudoku submatrices

11.10 Taylor and Maclaurin Series Copyright Cengage Learning. All rights reserved. We start by supposing that f is any function that can be represented by a power series f(x)= c 0 +c 1 (x a)+c 2 (x a)

### Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

### STA 256: Statistics and Probability I

Al Nosedal. University of Toronto. Fall 2017 My momma always said: Life was like a box of chocolates. You never know what you re gonna get. Forrest Gump. Exercise 4.1 Let X be a random variable with p(x)

### Probabilistic Graphical Models

2016 Robert Nowak Probabilistic Graphical Models 1 Introduction We have focused mainly on linear models for signals, in particular the subspace model x = Uθ, where U is a n k matrix and θ R k is a vector