# Numerical integration and importance sampling

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 2 Numerical integration and importance sampling 2.1 Quadrature Consider the numerical evaluation of the integral I(a, b) = b a dx f(x) Rectangle rule: on small interval, construct interpolating function and integrate over interval. Polynomial of degree 0 using mid-point of interval: (a+1)h ah dx f(x) h f ((ah + (a + 1)h)/2). Polynomial of degree 1 passing through points (a 1, f(a 1 )) and (a 2, f(a 2 )): Trapezoidal rule f(x) = f(a 1 ) + x a a2 ( ) 1 a2 a 1 (f(a 2 ) f(a 1 )) dx f(x) = (f(a 1 ) + f(a 2 )). a 2 a 1 a 1 2 Composing trapezoidal rule n times on interval (a, b) with even sub-intervals [kh, (k + 1)h] where k = 0,..., n 1 and h = (b a)/n gives estimate ( ) b dx f(x) b a f(a) + f(b) n 1 + f(a + kh). n 2 a 21 k=1

2 22 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING Simpsons rule: interpolating function of degree 2 composed n times on interval (a, b): b a dx f(x) b a 3n Error bounded by [f(a) + 4f(a + h) + 2f(a + 2h) + 4f(a + 3h) + 2f(a + 4h) + + f(b)]. h (b a) f (4) (ξ). where ξ (a, b) is the point in the domain where the magnitude of the 4th derivative is largest. Rules based on un-even divisions of domain of integration w(x) = weight function. I(a, b) n w if i, where w i = w(x i ) and f i = f(x i ) with choice of x i and w i based on definite integral of polynomials of higher order. Example is Gaussian quadrature with n points based on polynomial of degree 2n 1: well-defined procedure to find {x i } and {w i } (see Numerical Recipes). Error bounds for n-point Gaussian quadrature are (b a) 2n+1 (2n + 1)! (n!) 4 [(2n)!] 3 f (2n) (ξ) for ξ (a, b).. For multi-dimensional integrals, must place n i grid points along each i dimension. Number of points in hyper-grid grows exponentially with dimension. Unsuitable for high-dimensional integrals. 2.2 Importance Sampling and Monte Carlo Suppose integrand f(x) depends on multi-dimensional point x and that integral over hypervolume I = dx f(x) V is non-zero only in specific regions of the domain. We should place higher density of points in region where integrand is large. Define a weight function w(x) that tells us which regions are significant.

3 2.2. IMPORTANCE SAMPLING AND MONTE CARLO 23 Require property w(x) > 0 for any point x in volume. Sometimes use normalized weight function so dx w(x) = 1, though not strictly V necessary. Re-express integral as: I = dx f(x) w(x) w(x). V Idea: Draw a set of N points {x 1,..., x N } from the weight function w(x) then I = 1 N N f(x i ) w(x i ). As N, I I. How does this improve the rate of convergence of the calculation of I? We will see that the statistical uncertainty is related to the variance σi 2 of the estimate of I, namely σ 2 I = 1 N N i I i I i where I i = f(x i) w(x i ) I. and we have assumed that the random variables I i are statistically independent. Here represents the average over the true distribution of f/w that is obtained in the limit N. Vastly different values of ratio f(x i )/w(x i ) lead to large uncertainty. The error is minimized by minimizing σ 2 I. If α w(x i ) = f(x i ), then f(x i )/w(x i ) = α and ( ) 2 f(xi ) f(xi ) = I = α = α 2, w(x i ) w(x i ) and σ 2 I = 0. Note that writing w(x i ) = f(x i )/α requires knowing α = I, the problem we are trying to solve. Generally desire all f(x i )/w(x i ) to be roughly the same for all sampled points x i to mimimize σ 2 I. Example in 1-dimensional integral I = b dx f(x) = b a a dx f(x) w(x) w(x).

4 24 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING Monte-Carlo sampling: use random sampling of points x with weight w(x) to estimate ratio f(x)/w(x). How do we draw sample points with a given weight w(x)? Consider simplest case of a uniform distribution on interval (a, b): w(x) = { 1 b a if x (a, b). 0 otherwise. Use a random number generator that gives a pseudo-random number rand in interval (0, 1). x i = a + (b a) rand, then {x i } are distributed uniformly in the interval (a, b). We find that the estimator of f is then: I = 1 N f(x i ) N w(x i ) = b a N f(x i ) = 1 N N N I k, where each I k = (b a) f(x k ) is an independent estimate of the integral. Like trapezoidal integration but with randomly sampled points {x i } from (a, b) each with uniform weight. How about other choices of weight function w(x)? It is easy to draw uniform y i (0, 1). Now suppose we map the y i to x i via y(x) = x a dz w(z). Note that y(a) = 0 and y(b) = 1 if w(x) is normalized over (a, b). How are the x distributed? Since the y are distributed uniformly over (0, 1), the probability of finding a value of y in the interval is dy(x) = w(x) dx, so the x are distributed with weight w(x). It then follows that the one-dimensional integral can be written in the transformed variables y as: I = b a dx w(x) f(x) w(x) = y(b) y(a) k=1 dy f(x(y)) w(x(y)) = 1 0 dy f(x(y)) w(x(y)) Integral easily evaluated by selecting uniform points y and then solving x(y). Must be able to solve for x(y) to be useful.

5 2.2. IMPORTANCE SAMPLING AND MONTE CARLO 25 Procedure: 1. Select N points y i uniformly on (0, 1) using y i = rand. 2. Compute x i = x(y i ) by inverting y(x) = x dz w(z). a 3. Compute estimator for integral I = 1 N N f(x i)/w(x i ). This procedure is easy to do using simple forms of w(x). Suppose the integrand is strongly peaked around x = x 0. One good choice of w(x) might be a Gaussian weight w(x) = 1 2πσ 2 e (x x 0 )2 2σ 2, where the variance (width) of the Gaussian is treated as a parameter. If I = dx f(x) = dx w(x)f(x)/w(x), can draw randomly from the Gaussian weight by: 1 x y(x) = d x e ( x x 0 ) 2 2σ 2 2πσ 2 = πσ 2 x = x x0 2σ π 0 0 d x e ( x x 0 ) 2 2σ 2 dw e w2 = 1 2 ( ( )) x x0 1 + erf. 2σ Inverting gives x i = x 0 + 2σ ierf(2y i 1), where ierf is the inverse error function with series representation π ierf(z) = (z + π12 ) 2 z3 + 7π2 480 z The estimator of the integral is therefore I = 1 N 2πσ 2 N f(x i )e (x i x 0 ) 2 2σ 2. This estimator reduces the variance σ 2 I if w(x i) and f(x i ) resemble one another. Another way to draw from a Gaussian: Draw 2 numbers, y 1 and y 2 uniformly on (0, 1). Define R = 2 ln y 1 and θ = 2πy 2. Then x 1 = R cos θ and x 2 = R sin θ are distributed with density w(x 1, x 2 ) = 1 2π e x2 1e x2 2

6 26 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING since dy 1 dy 2 = y y 1 x 1 y 2 1 x 2 y 2 x 1 x 2 dx 1dx 2 = w(x 1, x 2 )dx 1 dx Markov Chain Monte Carlo Ensemble averages Generally, we cannot find a simple way of generating the {x i } according to a known w(x i ) for high-dimensional systems by transforming from a uniform distribution. Analytical integrals for coordinate transformation may be invertable only for separable coordinates. Many degrees of freedom are coupled so that joint probability is complicated and not the product of single probabilities. Typical integrals are ensemble averages of the form: A = 1 Z V dr (N) e βu(r(n)) A(r (N) ) = V dr(n) e βu(r(n)) A(r (N) ) V dr(n) e βu(r(n) ) Typical potentials are complicated functions of configuration r (N). A good importance sampling weight would set w(r (N) ) = e βu(r(n)). How do we sample configurations from a general, multi-dimensional weight? Goal: Devise a method to generate a sequence of configurations {r (N) 1,..., r (N) n } in which the probability of finding a configuration r (N) i in the sequence is given by w(r (N) i )dr (N) i. We will do so using a stochastic procedure based on a random walk Markov Chains We will represent a general, multi-dimensional configurational coordinate that identifies the state by the vector X. We wish to generate a sequence of configurations X 1,..., X n where each X t is chosen with probability density P t (X t ). To generate the sequence, we use a discrete-time, time-homogeneous, Markovian random walk. Consider the configuration X 1 at an initial time labelled as 1 that is treated as a random variable drawn from some known initial density P 1 (X).

7 2.3. MARKOV CHAIN MONTE CARLO 27 We assume the stochastic dynamics of the random variable is determined by a timeindependent transition function K(Y X), where the transition function defines the probability density of going from state Y to state X in a time step. Since K is a probability density, it must satisfy dx K(Y X) = 1 K(Y X) 0. The state X t at time t is assumed to be obtained from the state at time X t 1 by a realization of the dynamics, where the probability of all transitions is determined by K. At the second step of the random walk, the new state X 2 is chosen from P 2 (X X 1 ) = K(X 1 X). At the third step, the state X 3 is chosen from P 3 (X X 2 ) = K(X 2 X), and so on, generating the random walk sequence {X 1,..., X n }. If infinitely many realizations of the dynamics is carried out, we find that the distribution of X i for each of the i steps are P 1 (X) for X 1 P 2 (X) = dy K(Y X)P 1 (Y) for X 2 P 3 (X) = dy K(Y X)P 2 (Y) for X 3. P t (X) = dy K(Y X)P t 1 (Y). for X t. The transition function K is ergodic if any state X can be reached from any state Y in a finite number of steps in a non-periodic fashion. If K is ergodic, then the Perron-Frobenius Theorem guarantees the existence of a unique stationary distribution P (X) that satisfies P (X) = dy K(Y X)P (Y) and lim P t (X) = P (X). t Implications 1. From any initial distribution of states P 1 (X), an ergodic K guarantees that states X t will be distributed according to the unique stationary distribution P (X) for large t (many steps of random walk). 2. P (X) is like an eigenvector of matrix K with eigenvalue λ = Goal is then to design an ergodic transition function K so that the stationary or limit distribution is the Bolztmann weight w(x).

8 28 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING Finite State Space To make the Markov chain more concrete, consider a finite state space in which only m different configurations of the system exist. The phase space then can be enumerated as {X 1,... X m }. The transition function is then an m m matrix K, where the element K ij 0 is the transition probability of going from state X j to state X i. Since the matrix elements represent transition probabilities, for any fixed value of j, m K ij = 1. The distribution P t (X) corresponds to a column vector P t = col[a (1) t a (i) t is the probability of state X i at time t. The distribution evolves under the random walk as P t = K P t 1.,... a (m) t ], where The matrix K is regular if there is an integer t 0 such that K t 0 has all positive (non-zero) entries. Then K t for t t 0 has all positive entries. Suppose the matrix K is a regular transition matrix. The following properties hold: Statements following from the Frobenius-Perron theorem 1. The multiplicity of the eigenvalue λ = 1 is one (i.e. the eigenvalue is simple.) Proof. Since K is regular, there exists a transition matrix K = K t 0 with all positive elements. Note that if Ke = λe, then Ke = λ t 0 e, so that all eigenvectors of K with eigenvalue λ are also eigenvectors of K with eigenvalue λ t 0. Let e 1 = col[1, 1,..., 1]/m. Since i K ij = 1 for any j, we have i (K K) ij = 1, and hence i K ij = 1 for any j. The transpose of K therefore satisfies i KT ji = 1 for all j. The vector e 1 is the right eigenvector with eigenvalue λ = 1 since (K T e 1 ) j = 1/m i KT ji, and hence K T e 1 = e 1. Now both K and K T are m m square matrices and have the same eigenvalues (since the characteristic equation is invariant to the transpose operation), so λ = 1 is an eigenvalue of both K and K T. Now suppose K T v = v and all components v j of v are not equal. Let k be the index of the largest component, v k, which we can take to be positive without loss of generality. Thus, v k v j for all j and v k > v l for some l. It then follows that v k = j KT kj v j < j KT kj v k, since all components of K T are non-zero and positive. Thus we conclude that v k < v k, a contradiction. Hence all v k must be

9 2.3. MARKOV CHAIN MONTE CARLO 29 the same, corresponding to eigenvector e 1. Hence e 1 is a unique eigenvector of K T and λ = 1 is a simple root of the characterstic equation. Hence K has a single eigenvector with eigenvalue λ = If the eigenvalue λ 1 is real, then λ < 1. Proof. Suppose K T v = λ t 0 v, with λ t 0 1. It then follows that there is an index k of vector v such that v k v j for all j and v k > v l for some l. Now λ t 0 v k = j KT kj v j < j KT kj v k = v k, or λ t 0 v k < v k. Thus λ t 0 < 1, and hence λ < 1, since λ is real. Similarly, following the same lines of argument and using the fact that v k < v l for some l, we can establish that λ > 1. Hence λ < 1. A similar sort of argument can be used if λ is complex. 3. Under the dynamics of the random walk P t = KP t 1, lim t P t = P 1, where P 1 is the right eigenvector of K with simple eigenvalue λ = 1. Proof. If the matrix K is diagonalizable, the eigenvectors of K form a complete basis (since it can be put in Jordan canonical form). Thus, any initial distribution P s can be expanded in terms of the complete, linearly independent basis {P 1, P 2,... P m } of eigenvectors as P s = b 1 P b m P m, where KP i = λ i P i with λ 1 = 1 and λ i < 1. Now P t = b 1 P 1 + b 2 λ t 2P b m λ t mp m, but lim t λ t i = 0 exponentially fast for i 2. Thus, lim t K t P s = b 1 P 1. Note that if i (P s) i = 1, then i (P t) i = 1 as K maintains the norm. This implies that b 1 = ( i (P 1) i ) 1. It turns out that the condition of detailed balance, which we will define to mean K ij (P 1 ) j = K ji (P 1 ) i allows one to define a symmetric transition matrix K, which is necessarily diagonalizable. More generally, any square matrix is similar to a matrix of Jordan form, with isolated blocks of dimension of the multiplicity of the eigenvalue. Thus the matrix K can be written as K = P KP 1 where the columns of P are the (generalized) eigenvectors of K and K is of form K = 1 0 J 1..., 0 J I where I + 1 is the number of independent eigenvectors. For each eigenvector with

10 30 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING corresponding eigenvalue λ i, J i is of the form λ i λ i 1 J i =. 0 λ i The dimension of the square matrix J i depends on the original matrix K. For any given eigenvalue λ, the sum of the dimensions of the J i for which λ = λ i is equal to the algebraic multiplicity of the eigenvalue. It is easily shown that the nth power of a block J i has elements bounded by ( ) n k λ n k i for a block of size k, which goes to zero exponentially has n goes to infinity. Hence, the matrix K goes to K = Thus it follows that as n goes to infinity ) (K n ) αβ = (P K n P 1 P α1 P 1 1β = (P 1) α, αβ where P 1 is the stationary distribution, since the matrix P 1 1β is the β component of the left eigenvector of K with eigenvalue of 1, which is the constant vector. Consider the explicit case m = 3, with K ij given by K = (K K) ij > 0 for all i and j, so K is regular. Note also that i K ij = 1 for all j. The eigenvalues and eigenvectors of K are λ = 1 = P 1 = (2/7, 2/7, 3/7) λ = 1 6 λ = 1 2 = P 2 = ( 3/14, 3/14, 6/14) = P 3 = ( 3/7, 3/7, 0).

11 2.3. MARKOV CHAIN MONTE CARLO 31 If initially P s = (1, 0, 0), then P 2 = (1/2, 0, 1/2) and P 10 = ( , , ), which differs from P 1 by 0.1% Construction of the transition matrix K(y x) We wish to devise a procedure so that the limiting (stationary) distribution of the random walk is the Boltzmann distribution P eq (x). Break up the transition matrix into two parts, generation of trial states T(y x) and acceptance probability of trial state A(y x) K(y x) = T(y x)a(y x). Will consider definition of acceptance probability given a specified procedure for generating trial states with condition probability T(y x). Write dynamics in term of probabilities at time t Probability of starting in y and ending in x = dy P t (y) K(y x) ( ) Probability of starting in x and remaining in x = P t (x) 1 dy K(x y) so P t+1 (x) = = P t (x) + ( dy P t (y) K(y x) + P t (x) 1 dy ) dy K(x y) ) ( P t (y)k(y x) P t (x)k(x y). Thus, to get P t+1 (x) = P t (x), we want ( ) dy P t (y)k(y x) P t (x)k(x y) = 0. This can be accomplished by requiring microscopic reversibility or detailed balance: P t (y)k(y x) = P t (x)k(x y) P t (y)t(y x)a(y x) = P t (x)t(x y)a(x y).

12 32 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING We desire P t (x) = P t+1 (x) = P eq (x). This places restriction on the acceptance probabilities A(y x) A(x y) = P eq(x)t(x y) P eq (y)t(y x). Note that P eq (x)k(x y) is the probability of observing the sequence {x, y} in the random walk. This must equal the probability of the sequence {y, x} in the walk. Metropolis Solution: Note that if P eq (x)t(x y) > 0 and P eq (y)t(y x) > 0, then we can define ( A(y x) = min 1, P ) eq(x)t(x y) P eq (y)t(y x) ( A(x y) = min 1, P ) eq(y)t(y x). P eq (x)t(x y) This definition satisfies details balance. Proof. Verifying explicitly, we see that P eq (y)t(y x)a(y x) = min (P eq (y)t(y x), P eq (x)t(x y)) P eq (x)t(x y)a(x y) = min (P eq (x)t(x y), P eq (y)t(y x)) as required. = P eq (y)t(y x)a(y x) Procedure guaranteed to generate states with probability proportional to Boltzmann distribution if proposal probability T(y x) generates an ergodic transition probability K. Simple example Suppose we want to evaluate the equilibrium average A at inverse temperature β of some dynamical variable A(x) for a 1-dimensional harmonic oscillator system with potential energy U(x) = kx 2 /2. A(x) = Monte-Carlo procedure dx P eq (x)a(x) P eq (x) = 1 Z e βkx2 /2

13 2.3. MARKOV CHAIN MONTE CARLO Suppose the current state of the system is x 0. We define the proposal probability of a configuration y to be T(x 0 y) = { 1 2 x if y [x 0 x, x 0 + x] 0 otherwise x is fixed to some value representing the maximum displacement of the trial coordinate. y chosen uniformly around current value of x 0. Note that if y is selected, then T(x 0 y) = 1/(2 x) = T(y x 0 ) since x 0 and y are within a distance x of each other. 2. Accept trial y with probability ( A(x 0 y) = min 1, T(y x ) 0)P eq (y) = min ( 1, e β U), T(x 0 y)p eq (x 0 ) where U = U(y) U(x 0 ). Note that if U 0 so the proposal has lower potential energy than the current state, A(x 0 y) = 1 and the trial y is always accepted as the next state x 1 = y. If U > 0, then A(x 0 y) = e β U = q. We must accept the trial y with probability 0 < q < 1. This can be accomplished by picking a random number r uniformly on (0, 1) and then: (a) If r q, then accept configuration x 1 = y (b) If r > q, then reject configuration and set x 1 = x 0 (keep state as is). 3. Repeat steps 1 and 2, and record states {x i }. Markov chain of states (the sequence) {x i } generated, with each x i appearing in the sequence with probability P eq (x i ) (after some number of equilibration steps). After collecting N total configurations, the equilibrium average A can be estimated from A = 1 N N A(x i ) since the importance function w(x) = P eq (x). Rejected states are important to generate states with correct weight. Typically, should neglect the first N s points of Markov chain since it takes some iterations of procedure to generate states with stationary distribution of K(y x).

14 34 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING Called equilibration or burn in time. Often have correlation among adjacent states. Can record states every N corr Monte-Carlo steps (iterations). Why must rejected states be counted? Recall that K(x y) must be normalized to be a probability (or probability density). dy K(x y) = 1 = dy [δ(x y) + (1 δ(x y))] K(x y) hence, the probability to remain in the current state is K(x x) = 1 dy (1 δ(x y)) K(x y) K(x x) is non-zero to insure normalization, so we must see the sequence {..., x, x,... } in the Markov chain. 2.4 Statistical Uncertainties We would like some measure of the reliability of averages computed from a finite set of sampled data. Consider the average x of a quantity x constructed from a finite set of N measurements {x 1,..., x N }, where the x i are random variables drawn from a density ρ(x). x = 1 N x i N Suppose the random variables x i are drawn independently, so that the joint probability satisfies ρ(x 1,... x N ) = ρ(x 1 )ρ(x 2 ) ρ(x N ). Suppose that the intrinsic mean x of the random variable is x = dx xρ(x) and that the intrinsic variance is σ 2, where σ 2 = dx ( x x ) 2 ρ(x). A mesaure of reliability is the standard deviation of x, also known as the standard error σ E = σe 2, where σ2 E is the variance of the finite average x around x in the finite set {x 1,..., x N } σe 2 = ( x x ) 2.

15 2.4. STATISTICAL UNCERTAINTIES 35 We expect 1. Variance (error) decreases with increasing N 2. Error depends on the intrinsic variance σ 2 of the density ρ. Larger variance should mean slower convergence of x to x. Suppose all the intrinsic moments x n of ρ(x) exist. If we define the dimensionless variable z = x x σ E, and note that the probability density of z can be written as P ( z) = δ(z z) = 1 2π dt e it(z z) = 1 2π dt e it z e itz = 1 dt e it z χ N (t), (2.1) 2π where χ N (t) is the characteristic function of P (z). Using the fact that σ E = σ/ N, as will be shown shortly, z can be expressed as N N x i x z = = 1 N ( ) xi x, N σ N σ and hence where χ N (t) = it N ( ) xi x N σ e = χ(u) = e iu(x x )/σ, ( e it N ( x x σ ) ) N = χ(t/ N) N, which, for small arguments u can be expanded as χ(u) = 1 u 2 /2 + iu 3 κ 3 /(6σ 3 ) +..., where κ 3 is the third cumulant of the density ρ(x). Using this expansion, we find ln χ N (t) = t 2 /2 + O(N 1/2 ), and hence χ N (t) = e t2 /2 for large N. Inserting this form in Eq. (2.1) gives P ( z) = 1 2π e z2 /2 and hence we find the central limit theorem result that the averages x are distributed around the intrinsic mean according to the normal distribution N(x; x, σ 2 E) = 2 2σ E 2, 1 e (x x ) 2πσE

16 36 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING provided the number of samples N is large and all intrinsic moments (and hence cumulants) of ρ(x) are finite. If the central limit theorem holds so that the finite averages are normally distributed as above, then the standard error can be used to define confidence intervals since the probability p that the deviations in the average, x x, lie within a factor of c times the standard error is given by cσe cσ E d(x x )N(x; x, σ 2 x) = p. If p = 0.95, defining the 95% confidence intervals, then we find that c = 1.96, and hence the probability that the intrinsic mean x lies in the interval (x 1.96σ E, x σ E ) is P (x 1.96σ E x x σ E ) = If the data are not normally distributed, then higher cumulants of the probability density may be needed (skewness, kurtosis,...). The confidence intervals are defined in terms of integrals of the probability density. How do we calculate σ E or σe 2? We defined the variance as σ 2 E = ( x x ) 2 where x = 1 N i x i Expanding the square, we get σ 2 E = x2 x 2 but x 2 = 1 x N 2 i x j = 1 hence σ 2 E = σ2 /N. i,j N 2 ( x 2 i + i i ) x i x j = 1 N 2 ( N(σ 2 + x 2 ) + N(N 1) x 2) = σ2 N + x 2, However we don t really know what σ 2 is, so we must estimate this quantity. One way to do this is to define the estimator of the standard deviation ˆσ 2 : j i ˆσ 2 = 1 N (x i x) 2. i

17 2.4. STATISTICAL UNCERTAINTIES 37 The average of this estimator is ˆσ 2 = 1 N (x i x) 2 = 1 N i x 2 i 2x i x + x 2. i but using the facts that x i x = x2 N + N 1 N x 2 x 2 = σ2 N + x 2, we get ˆσ 2 E = (N 1)σ2 /N and hence σ 2 = N ˆσ 2 /(N 1). The standard error can therefore be estimated by σ E = ˆσ 2 N 1 Note that σ E decreases as 1/ N 1 Narrowing confidence intervals by a factor of 2 requires roughly 4 times more data. If the x i are not indepenent, then N N eff, where N eff is the effective number of independent configurations. If τ c is the correlation length (number of Monte-Carlo steps over which correlation x i x j differs from x 2 ), then N eff = N/τ c. Another way to estimate the intrinsic variance σ 2 is to use the Jackknife Method: see B. Efron, The annals of statistics 7, 1 (1979). Jackknife procedure is to form N averages from the set {x 1,..., x N } by omitting one data point. For example, the jth average is defined to be x (j) = 1 N 1 N i j x i The result is to form a set of averages {x (1), x (2),... x (N) }. The average x over this Jackknifed set is the same as before since x = 1 N N x (j) = j=1 1 N(N 1) N x i = 1 N j=1 i j N x i.

18 38 2. NUMERICAL INTEGRATION AND IMPORTANCE SAMPLING We define an estimator of the variance Σ 2 of the Jackknifed set to be Σ 2 = 1 N ( x (j) x ) 2. N j=1 which converges to Σ 2 as N gets large. Inserting the definitions of x (j) and x and using the independence of the x i, we see that the estimator is related to the intrinsic variance by ( 1 Σ 2 = N 1 1 ) σ 2 σ 2 = σ 2 = N(N 1) Σ 2. N N(N 1) The standard error can therefore be estimated using the Jackknifed set as Σ E = N 1 N ( x (j) x ) 2. N j=1

### Topics in Statistical Mechanics: The Foundations of Molecular Simulation

Topics in Statistical Mechanics: The Foundations of Molecular Simulation J. Schofield/R. van Zon CHM1464H Fall 2008, Lecture Notes 2 About the course The Foundations of Molecular Simulation This course

### Markov Processes. Stochastic process. Markov process

Markov Processes Stochastic process movement through a series of well-defined states in a way that involves some element of randomness for our purposes, states are microstates in the governing ensemble

### Multiple Random Variables

Multiple Random Variables Joint Probability Density Let X and Y be two random variables. Their joint distribution function is F ( XY x, y) P X x Y y. F XY ( ) 1, < x

### Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

### Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo

Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative

### Integration, differentiation, and root finding. Phys 420/580 Lecture 7

Integration, differentiation, and root finding Phys 420/580 Lecture 7 Numerical integration Compute an approximation to the definite integral I = b Find area under the curve in the interval Trapezoid Rule:

### Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

### Monte Carlo. Lecture 15 4/9/18. Harvard SEAS AP 275 Atomistic Modeling of Materials Boris Kozinsky

Monte Carlo Lecture 15 4/9/18 1 Sampling with dynamics In Molecular Dynamics we simulate evolution of a system over time according to Newton s equations, conserving energy Averages (thermodynamic properties)

### REVIEW OF DIFFERENTIAL CALCULUS

REVIEW OF DIFFERENTIAL CALCULUS DONU ARAPURA 1. Limits and continuity To simplify the statements, we will often stick to two variables, but everything holds with any number of variables. Let f(x, y) be

### Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

### Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods

Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

### Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods

Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

### MS 3011 Exercises. December 11, 2013

MS 3011 Exercises December 11, 2013 The exercises are divided into (A) easy (B) medium and (C) hard. If you are particularly interested I also have some projects at the end which will deepen your understanding

### Random Walks A&T and F&S 3.1.2

Random Walks A&T 110-123 and F&S 3.1.2 As we explained last time, it is very difficult to sample directly a general probability distribution. - If we sample from another distribution, the overlap will

### Random processes and probability distributions. Phys 420/580 Lecture 20

Random processes and probability distributions Phys 420/580 Lecture 20 Random processes Many physical processes are random in character: e.g., nuclear decay (Poisson distributed event count) P (k, τ) =

### 6.842 Randomness and Computation March 3, Lecture 8

6.84 Randomness and Computation March 3, 04 Lecture 8 Lecturer: Ronitt Rubinfeld Scribe: Daniel Grier Useful Linear Algebra Let v = (v, v,..., v n ) be a non-zero n-dimensional row vector and P an n n

### STA 294: Stochastic Processes & Bayesian Nonparametrics

MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

### MATH 205C: STATIONARY PHASE LEMMA

MATH 205C: STATIONARY PHASE LEMMA For ω, consider an integral of the form I(ω) = e iωf(x) u(x) dx, where u Cc (R n ) complex valued, with support in a compact set K, and f C (R n ) real valued. Thus, I(ω)

### 1. Introductory Examples

1. Introductory Examples We introduce the concept of the deterministic and stochastic simulation methods. Two problems are provided to explain the methods: the percolation problem, providing an example

### Physics 250 Green s functions for ordinary differential equations

Physics 25 Green s functions for ordinary differential equations Peter Young November 25, 27 Homogeneous Equations We have already discussed second order linear homogeneous differential equations, which

### Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods

Prof. Daniel Cremers 11. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

### Today: Fundamentals of Monte Carlo

Today: Fundamentals of Monte Carlo What is Monte Carlo? Named at Los Alamos in 1940 s after the casino. Any method which uses (pseudo)random numbers as an essential part of the algorithm. Stochastic -

### x 3y 2z = 6 1.2) 2x 4y 3z = 8 3x + 6y + 8z = 5 x + 3y 2z + 5t = 4 1.5) 2x + 8y z + 9t = 9 3x + 5y 12z + 17t = 7

Linear Algebra and its Applications-Lab 1 1) Use Gaussian elimination to solve the following systems x 1 + x 2 2x 3 + 4x 4 = 5 1.1) 2x 1 + 2x 2 3x 3 + x 4 = 3 3x 1 + 3x 2 4x 3 2x 4 = 1 x + y + 2z = 4 1.4)

### Definition A finite Markov chain is a memoryless homogeneous discrete stochastic process with a finite number of states.

Chapter 8 Finite Markov Chains A discrete system is characterized by a set V of states and transitions between the states. V is referred to as the state space. We think of the transitions as occurring

### Convex Optimization CMU-10725

Convex Optimization CMU-10725 Simulated Annealing Barnabás Póczos & Ryan Tibshirani Andrey Markov Markov Chains 2 Markov Chains Markov chain: Homogen Markov chain: 3 Markov Chains Assume that the state

### Monte Carlo Methods in Statistical Mechanics

Monte Carlo Methods in Statistical Mechanics Mario G. Del Pópolo Atomistic Simulation Centre School of Mathematics and Physics Queen s University Belfast Belfast Mario G. Del Pópolo Statistical Mechanics

### Monte Carlo Methods. Leon Gu CSD, CMU

Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte

### Markov Chain Monte Carlo The Metropolis-Hastings Algorithm

Markov Chain Monte Carlo The Metropolis-Hastings Algorithm Anthony Trubiano April 11th, 2018 1 Introduction Markov Chain Monte Carlo (MCMC) methods are a class of algorithms for sampling from a probability

### 21 Linear State-Space Representations

ME 132, Spring 25, UC Berkeley, A Packard 187 21 Linear State-Space Representations First, let s describe the most general type of dynamic system that we will consider/encounter in this class Systems may

### Lecture XI. Approximating the Invariant Distribution

Lecture XI Approximating the Invariant Distribution Gianluca Violante New York University Quantitative Macroeconomics G. Violante, Invariant Distribution p. 1 /24 SS Equilibrium in the Aiyagari model G.

### 04. Random Variables: Concepts

University of Rhode Island DigitalCommons@URI Nonequilibrium Statistical Physics Physics Course Materials 215 4. Random Variables: Concepts Gerhard Müller University of Rhode Island, gmuller@uri.edu Creative

### Markov Chains and Stochastic Sampling

Part I Markov Chains and Stochastic Sampling 1 Markov Chains and Random Walks on Graphs 1.1 Structure of Finite Markov Chains We shall only consider Markov chains with a finite, but usually very large,

### 16 : Markov Chain Monte Carlo (MCMC)

10-708: Probabilistic Graphical Models 10-708, Spring 2014 16 : Markov Chain Monte Carlo MCMC Lecturer: Matthew Gormley Scribes: Yining Wang, Renato Negrinho 1 Sampling from low-dimensional distributions

### Inner product spaces. Layers of structure:

Inner product spaces Layers of structure: vector space normed linear space inner product space The abstract definition of an inner product, which we will see very shortly, is simple (and by itself is pretty

### Today: Fundamentals of Monte Carlo

Today: Fundamentals of Monte Carlo What is Monte Carlo? Named at Los Alamos in 940 s after the casino. Any method which uses (pseudo)random numbers as an essential part of the algorithm. Stochastic - not

### Today: Fundamentals of Monte Carlo

Today: Fundamentals of Monte Carlo What is Monte Carlo? Named at Los Alamos in 1940 s after the casino. Any method which uses (pseudo)random numbers as an essential part of the algorithm. Stochastic -

### 8 Error analysis: jackknife & bootstrap

8 Error analysis: jackknife & bootstrap As discussed before, it is no problem to calculate the expectation values and statistical error estimates of normal observables from Monte Carlo. However, often

### [POLS 8500] Review of Linear Algebra, Probability and Information Theory

[POLS 8500] Review of Linear Algebra, Probability and Information Theory Professor Jason Anastasopoulos ljanastas@uga.edu January 12, 2017 For today... Basic linear algebra. Basic probability. Programming

### Linear Algebra: Matrix Eigenvalue Problems

CHAPTER8 Linear Algebra: Matrix Eigenvalue Problems Chapter 8 p1 A matrix eigenvalue problem considers the vector equation (1) Ax = λx. 8.0 Linear Algebra: Matrix Eigenvalue Problems Here A is a given

### COURSE Numerical integration of functions (continuation) 3.3. The Romberg s iterative generation method

COURSE 7 3. Numerical integration of functions (continuation) 3.3. The Romberg s iterative generation method The presence of derivatives in the remainder difficulties in applicability to practical problems

### Math 456: Mathematical Modeling. Tuesday, April 9th, 2018

Math 456: Mathematical Modeling Tuesday, April 9th, 2018 The Ergodic theorem Tuesday, April 9th, 2018 Today 1. Asymptotic frequency (or: How to use the stationary distribution to estimate the average amount

### Markov Chains, Random Walks on Graphs, and the Laplacian

Markov Chains, Random Walks on Graphs, and the Laplacian CMPSCI 791BB: Advanced ML Sridhar Mahadevan Random Walks! There is significant interest in the problem of random walks! Markov chain analysis! Computer

### Monte Carlo and cold gases. Lode Pollet.

Monte Carlo and cold gases Lode Pollet lpollet@physics.harvard.edu 1 Outline Classical Monte Carlo The Monte Carlo trick Markov chains Metropolis algorithm Ising model critical slowing down Quantum Monte

### Numerical methods for lattice field theory

Numerical methods for lattice field theory Mike Peardon Trinity College Dublin August 9, 2007 Mike Peardon (Trinity College Dublin) Numerical methods for lattice field theory August 9, 2007 1 / 24 Numerical

### Systems of Linear ODEs

P a g e 1 Systems of Linear ODEs Systems of ordinary differential equations can be solved in much the same way as discrete dynamical systems if the differential equations are linear. We will focus here

### SPRING 2006 PRELIMINARY EXAMINATION SOLUTIONS

SPRING 006 PRELIMINARY EXAMINATION SOLUTIONS 1A. Let G be the subgroup of the free abelian group Z 4 consisting of all integer vectors (x, y, z, w) such that x + 3y + 5z + 7w = 0. (a) Determine a linearly

### This is a Gaussian probability centered around m = 0 (the most probable and mean position is the origin) and the mean square displacement m 2 = n,or

Physics 7b: Statistical Mechanics Brownian Motion Brownian motion is the motion of a particle due to the buffeting by the molecules in a gas or liquid. The particle must be small enough that the effects

### Markov Chain Monte Carlo Simulations and Their Statistical Analysis An Overview

Markov Chain Monte Carlo Simulations and Their Statistical Analysis An Overview Bernd Berg FSU, August 30, 2005 Content 1. Statistics as needed 2. Markov Chain Monte Carlo (MC) 3. Statistical Analysis

### AP Calculus Chapter 9: Infinite Series

AP Calculus Chapter 9: Infinite Series 9. Sequences a, a 2, a 3, a 4, a 5,... Sequence: A function whose domain is the set of positive integers n = 2 3 4 a n = a a 2 a 3 a 4 terms of the sequence Begin

### Stochastic process for macro

Stochastic process for macro Tianxiao Zheng SAIF 1. Stochastic process The state of a system {X t } evolves probabilistically in time. The joint probability distribution is given by Pr(X t1, t 1 ; X t2,

### 6 Markov Chain Monte Carlo (MCMC)

6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution

### Advanced Monte Carlo Methods Problems

Advanced Monte Carlo Methods Problems September-November, 2012 Contents 1 Integration with the Monte Carlo method 2 1.1 Non-uniform random numbers.......................... 2 1.2 Gaussian RNG..................................

### STOCHASTIC PROCESSES Basic notions

J. Virtamo 38.3143 Queueing Theory / Stochastic processes 1 STOCHASTIC PROCESSES Basic notions Often the systems we consider evolve in time and we are interested in their dynamic behaviour, usually involving

### Monte Carlo Integration I

Monte Carlo Integration I Digital Image Synthesis Yung-Yu Chuang with slides by Pat Hanrahan and Torsten Moller Introduction L o p,ωo L e p,ωo s f p,ωo,ω i L p,ω i cosθ i i d The integral equations generally

### Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

### CSC 446 Notes: Lecture 13

CSC 446 Notes: Lecture 3 The Problem We have already studied how to calculate the probability of a variable or variables using the message passing method. However, there are some times when the structure

### NATIONAL UNIVERSITY OF SINGAPORE PC5215 NUMERICAL RECIPES WITH APPLICATIONS. (Semester I: AY ) Time Allowed: 2 Hours

NATIONAL UNIVERSITY OF SINGAPORE PC55 NUMERICAL RECIPES WITH APPLICATIONS (Semester I: AY 0-) Time Allowed: Hours INSTRUCTIONS TO CANDIDATES. This examination paper contains FIVE questions and comprises

### 6.207/14.15: Networks Lectures 4, 5 & 6: Linear Dynamics, Markov Chains, Centralities

6.207/14.15: Networks Lectures 4, 5 & 6: Linear Dynamics, Markov Chains, Centralities 1 Outline Outline Dynamical systems. Linear and Non-linear. Convergence. Linear algebra and Lyapunov functions. Markov

### Chapter 7. Markov chain background. 7.1 Finite state space

Chapter 7 Markov chain background A stochastic process is a family of random variables {X t } indexed by a varaible t which we will think of as time. Time can be discrete or continuous. We will only consider

### THE INVERSE FUNCTION THEOREM

THE INVERSE FUNCTION THEOREM W. PATRICK HOOPER The implicit function theorem is the following result: Theorem 1. Let f be a C 1 function from a neighborhood of a point a R n into R n. Suppose A = Df(a)

### Linear Algebra. Matrices Operations. Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0.

Matrices Operations Linear Algebra Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0 The rectangular array 1 2 1 4 3 4 2 6 1 3 2 1 in which the

### April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

### 2tdt 1 y = t2 + C y = which implies C = 1 and the solution is y = 1

Lectures - Week 11 General First Order ODEs & Numerical Methods for IVPs In general, nonlinear problems are much more difficult to solve than linear ones. Unfortunately many phenomena exhibit nonlinear

### CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 3 More Markov Chain Monte Carlo Methods The Metropolis algorithm isn t the only way to do MCMC. We ll

### The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision

The Particle Filter Non-parametric implementation of Bayes filter Represents the belief (posterior) random state samples. by a set of This representation is approximate. Can represent distributions that

### Linear Algebra March 16, 2019

Linear Algebra March 16, 2019 2 Contents 0.1 Notation................................ 4 1 Systems of linear equations, and matrices 5 1.1 Systems of linear equations..................... 5 1.2 Augmented

### Markov Chain Monte Carlo

Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible

### 8. Diagonalization.

8. Diagonalization 8.1. Matrix Representations of Linear Transformations Matrix of A Linear Operator with Respect to A Basis We know that every linear transformation T: R n R m has an associated standard

### 1.2. Direction Fields: Graphical Representation of the ODE and its Solution Let us consider a first order differential equation of the form dy

.. Direction Fields: Graphical Representation of the ODE and its Solution Let us consider a first order differential equation of the form dy = f(x, y). In this section we aim to understand the solution

### Stochastic Particle Methods for Rarefied Gases

CCES Seminar WS 2/3 Stochastic Particle Methods for Rarefied Gases Julian Köllermeier RWTH Aachen University Supervisor: Prof. Dr. Manuel Torrilhon Center for Computational Engineering Science Mathematics

### Stochastic processes. MAS275 Probability Modelling. Introduction and Markov chains. Continuous time. Markov property

Chapter 1: and Markov chains Stochastic processes We study stochastic processes, which are families of random variables describing the evolution of a quantity with time. In some situations, we can treat

### Midterm for Introduction to Numerical Analysis I, AMSC/CMSC 466, on 10/29/2015

Midterm for Introduction to Numerical Analysis I, AMSC/CMSC 466, on 10/29/2015 The test lasts 1 hour and 15 minutes. No documents are allowed. The use of a calculator, cell phone or other equivalent electronic

### Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006

Astronomical p( y x, I) p( x, I) p ( x y, I) = p( y, I) Data Analysis I Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK 10 lectures, beginning October 2006 4. Monte Carlo Methods

### MTH4101 CALCULUS II REVISION NOTES. 1. COMPLEX NUMBERS (Thomas Appendix 7 + lecture notes) ax 2 + bx + c = 0. x = b ± b 2 4ac 2a. i = 1.

MTH4101 CALCULUS II REVISION NOTES 1. COMPLEX NUMBERS (Thomas Appendix 7 + lecture notes) 1.1 Introduction Types of numbers (natural, integers, rationals, reals) The need to solve quadratic equations:

### Lecture 4: Numerical solution of ordinary differential equations

Lecture 4: Numerical solution of ordinary differential equations Department of Mathematics, ETH Zürich General explicit one-step method: Consistency; Stability; Convergence. High-order methods: Taylor

### Stat 516, Homework 1

Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

### Econ 204 Supplement to Section 3.6 Diagonalization and Quadratic Forms. 1 Diagonalization and Change of Basis

Econ 204 Supplement to Section 3.6 Diagonalization and Quadratic Forms De La Fuente notes that, if an n n matrix has n distinct eigenvalues, it can be diagonalized. In this supplement, we will provide

### Lecture 2 : CS6205 Advanced Modeling and Simulation

Lecture 2 : CS6205 Advanced Modeling and Simulation Lee Hwee Kuan 21 Aug. 2013 For the purpose of learning stochastic simulations for the first time. We shall only consider probabilities on finite discrete

### Georgia Tech PHYS 6124 Mathematical Methods of Physics I

Georgia Tech PHYS 612 Mathematical Methods of Physics I Instructor: Predrag Cvitanović Fall semester 2012 Homework Set #5 due October 2, 2012 == show all your work for maximum credit, == put labels, title,

### Computational Physics (6810): Session 13

Computational Physics (6810): Session 13 Dick Furnstahl Nuclear Theory Group OSU Physics Department April 14, 2017 6810 Endgame Various recaps and followups Random stuff (like RNGs :) Session 13 stuff

### Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

### MTH739U/P: Topics in Scientific Computing Autumn 2016 Week 6

MTH739U/P: Topics in Scientific Computing Autumn 16 Week 6 4.5 Generic algorithms for non-uniform variates We have seen that sampling from a uniform distribution in [, 1] is a relatively straightforward

### Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2016 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing

### 1 Review of di erential calculus

Review of di erential calculus This chapter presents the main elements of di erential calculus needed in probability theory. Often, students taking a course on probability theory have problems with concepts

### Linear Algebra Solutions 1

Math Camp 1 Do the following: Linear Algebra Solutions 1 1. Let A = and B = 3 8 5 A B = 3 5 9 A + B = 9 11 14 4 AB = 69 3 16 BA = 1 4 ( 1 3. Let v = and u = 5 uv = 13 u v = 13 v u = 13 Math Camp 1 ( 7

### Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2018 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing

### Bare-bones outline of eigenvalue theory and the Jordan canonical form

Bare-bones outline of eigenvalue theory and the Jordan canonical form April 3, 2007 N.B.: You should also consult the text/class notes for worked examples. Let F be a field, let V be a finite-dimensional

### 18.S34 linear algebra problems (2007)

18.S34 linear algebra problems (2007) Useful ideas for evaluating determinants 1. Row reduction, expanding by minors, or combinations thereof; sometimes these are useful in combination with an induction

Section 6.6 Gaussian Quadrature Key Terms: Method of undetermined coefficients Nonlinear systems Gaussian quadrature Error Legendre polynomials Inner product Adapted from http://pathfinder.scar.utoronto.ca/~dyer/csca57/book_p/node44.html

### Topics in linear algebra

Chapter 6 Topics in linear algebra 6.1 Change of basis I want to remind you of one of the basic ideas in linear algebra: change of basis. Let F be a field, V and W be finite dimensional vector spaces over

### ACM 116: Lectures 3 4

1 ACM 116: Lectures 3 4 Joint distributions The multivariate normal distribution Conditional distributions Independent random variables Conditional distributions and Monte Carlo: Rejection sampling Variance

### Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods

Pattern Recognition and Machine Learning Chapter 11: Sampling Methods Elise Arnaud Jakob Verbeek May 22, 2008 Outline of the chapter 11.1 Basic Sampling Algorithms 11.2 Markov Chain Monte Carlo 11.3 Gibbs

### 1 Linear Difference Equations

ARMA Handout Jialin Yu 1 Linear Difference Equations First order systems Let {ε t } t=1 denote an input sequence and {y t} t=1 sequence generated by denote an output y t = φy t 1 + ε t t = 1, 2,... with

### Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression

### On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

### 1 9/5 Matrices, vectors, and their applications

1 9/5 Matrices, vectors, and their applications Algebra: study of objects and operations on them. Linear algebra: object: matrices and vectors. operations: addition, multiplication etc. Algorithms/Geometric

### Estimation of uncertainties using the Guide to the expression of uncertainty (GUM)

Estimation of uncertainties using the Guide to the expression of uncertainty (GUM) Alexandr Malusek Division of Radiological Sciences Department of Medical and Health Sciences Linköping University 2014-04-15