A Practical Guide to Quasi-Monte Carlo Methods

Frances Y. Kuo and Dirk Nuyens

Abstract. These notes are prepared for the short course on "High-dimensional Integration: the Quasi-Monte Carlo Way", to be held at National Chiao Tung University and National Taiwan University in November 2016. We will cover basic theory and practical usage of quasi-Monte Carlo methods, with a demo of the software packages. Our aim is to make these notes easily accessible to non-experts, including students, practitioners, and potential new collaborators. We discuss only the essential concepts and hide away most of the technical details. We do not cite references in the text, but references for further reading are provided in the final section. The sections marked with * contain more theoretical background and are targeted at potential collaborators who wish to gain a deeper understanding. These sections are not necessary for students and practitioners who just want to try out quasi-Monte Carlo methods for the first time.

Date: 7 November 2016

Frances Y. Kuo, School of Mathematics and Statistics, University of New South Wales, Sydney NSW 2052, Australia. f.kuo@unsw.edu.au

Dirk Nuyens, Department of Computer Science, KU Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium. dirk.nuyens@cs.kuleuven.be


Contents

1 Introduction
1.1 High dimensional integration
1.2 Monte Carlo method
1.3 Quasi-Monte Carlo methods
2 Lattice points
2.1 Generating vector
2.2 Random shifting and practical error estimation
2.3 Fast component-by-component construction
2.4 Lattice sequences
2.5 A taste of the theoretical error analysis*
3 Digital nets
3.1 Digital net property
3.2 Digital construction
3.3 Sobol' sequences
3.4 Polynomial lattice rules
3.5 Random digital shifting and scrambling
3.6 Higher order nets by interlacing
4 Toy applications
4.1 Transformation to the unit cube
4.2 Option pricing
4.3 Maximum likelihood
4.4 PDE with a random coefficient
5 Software demo
5.1 A simple test function
5.2 The difficulty of our test function
5.3 Some technical details*
5.4 Usage of random number generators
5.5 Monte Carlo approximation
5.6 Quasi-Monte Carlo approximation
5.7 Using standard lattice point generators
5.8 Applying the theory*
5.9 Constructing point sets
5.10 Sobol' sequences, digital sequences, and interlacing
5.11 Small project
6 Further reading

1 Introduction

High dimensional problems are coming to play an ever more important role in applications. They pose immense challenges for practical computation, because of a nearly inevitable tendency for the cost of computation to increase exponentially with dimension. Effective and efficient methods that do not suffer from this curse of dimensionality are in great demand. Quasi-Monte Carlo (QMC) methods can lift this curse and we will show you how.

1.1 High dimensional integration

We begin with an integral formulated over the s-dimensional unit cube [0,1]^s,

    I(f) = \int_0^1 \cdots \int_0^1 f(x_1, \ldots, x_s) \, dx_1 \cdots dx_s = \int_{[0,1]^s} f(x) \, dx,

where the number of integration variables, the dimensionality s, is large, e.g., hundreds or thousands or more. (Note that an expectation can be written as an integral. Later we will discuss the important question of how to transform an integral from practical applications into this form.)

One approach that comes to mind is to approximate this integral by a product rule, i.e., each one-dimensional integral is approximated by your favorite one-dimensional quadrature rule, e.g., rectangle rule, Simpson rule, Gauss rule, etc. But this would not work: with 100 integration variables, even if you have just 2 quadrature points in each coordinate direction, you would already require 2^100 ≈ 10^30 evaluations of the integrand f, and your computation would never finish in your lifetime! So, forget about product rules in high dimensions! (There is a class of methods called sparse grids which cleverly leaves out some product points; that's a story for another day.)

1.2 Monte Carlo method

The Monte Carlo method, or MC method in short, approximates the integral by averaging random samples of the function,

    Q_n(f) = \frac{1}{n} \sum_{k=0}^{n-1} f(t_k),    (1)

where the sample points t_0, \ldots, t_{n-1} are independent and uniformly distributed over the unit cube. This is a very simple and widely used method. It can be deployed as long as the integrand is square integrable. Apart from the ease of use, the Monte Carlo method has the advantage of producing an unbiased estimate of the integral, i.e., E[Q_n(f)] = I(f). It can be easily shown that the root-mean-square error of the Monte Carlo method satisfies

    \sqrt{E\,|I(f) - Q_n(f)|^2} = \frac{\sigma(f)}{\sqrt{n}},

where \sigma^2(f) := I(f^2) - (I(f))^2 is the variance of f. So we say that the Monte Carlo method converges like order 1/\sqrt{n}, and we write O(1/\sqrt{n}). In concrete terms, this means that if you want to reduce your error in half, then you need to use 4 times as many sample points. This convergence rate is often too slow for practical applications. The variance of f is generally not explicitly known, but in practice we can estimate the root-mean-square error by

    \sqrt{E\,|I(f) - Q_n(f)|^2} \approx \sqrt{\frac{1}{n(n-1)} \sum_{k=0}^{n-1} \bigl(f(t_k) - Q_n(f)\bigr)^2}.

1.3 Quasi-Monte Carlo methods

Quasi-Monte Carlo methods, or QMC methods in short, take the same form (1) as the Monte Carlo method in the unit cube, but instead of generating the sample points t_k randomly, we choose them deterministically in a clever way to be more uniformly distributed than random points, so that they have a faster rate of convergence. All QMC theoretical error bounds take the common form of a product

    |I(f) - Q_n(f)| \le D(t_0, \ldots, t_{n-1}) \, V(f),    (2)

with one factor depending only on the points and the other depending only on the integrand. In the classical theory these two factors are called the discrepancy of the points and the variation of f, respectively. If the integrand f has sufficient smoothness, e.g., can be differentiated once with respect to each variable, then classical theory tells us that certain QMC methods can converge like O((\log n)^s / n); they are referred to as low-discrepancy sequences. The convergence rates can be even higher for periodic integrands.

The drawback of the classical QMC theory is that the error bound and implied constant grow exponentially with dimension s, so the theory is not useful when s is very large. A remedy is provided in modern QMC theory by working with weighted function spaces: the error bound can be independent of s as long as the

integrand f has the appropriate property that there is some varying degree of importance between the variables. A taste of this modern theory is given in §2.5. We then have a very similar, but modern, interpretation of (2) in the form

    |I(f) - Q_n(f)| \le e_\gamma(t_0, \ldots, t_{n-1}) \, \|f\|_\gamma,

where the first factor is now called the worst case error of the QMC method in a weighted function space with weights γ, and the second factor is the norm of f in that same weighted space.

There are two main families of QMC methods: lattice rules and digital nets. They represent different approaches to achieving uniformity of the points. Here we will introduce these methods, providing a bit more detail on lattice rules while touching on only some basic principles of digital nets.

2 Lattice points

Lattice rules have been around since the 1950s and they are very easy to specify and use: all you need is one integer vector with s components.

2.1 Generating vector

Given an integer vector z = (z_1, \ldots, z_s) known as the generating vector, a (rank-1) lattice rule with n points takes the form

    Q_n(f) = \frac{1}{n} \sum_{k=0}^{n-1} f\left(\left\{\frac{kz}{n}\right\}\right) = \frac{1}{n} \sum_{k=0}^{n-1} f\left(\frac{kz \bmod n}{n}\right),    (3)

where the braces around a vector indicate that we take the fractional part of each component in the vector, e.g., {(1.8, 2.3)} = (0.8, 0.3), which is clearly equivalent to carrying out the modulo n operation in the numerator as indicated in (3). Figure 1 (left) illustrates a 64-point lattice rule in 2D.

The quality of the lattice rule depends on the choice of the generating vector. Due to the modulo operation, it suffices to consider the values from 1 up to n-1, leaving out 0 which is clearly a bad choice. Furthermore, we restrict the values to those relatively prime to n, to ensure that every one-dimensional projection of the n points yields n distinct values. Thus we write z ∈ U_n^s, with

    U_n := \{ z \in \mathbb{Z} : 1 \le z \le n-1 \text{ and } \gcd(z, n) = 1 \}.

For theoretical analysis we often assume that n is prime to simplify some number theory arguments. For practical application we often take n to be a power of 2. The total number of possible choices for the generating vector is then (n-1)^s and (n/2)^s, respectively. Even if we have a criterion to assess the quality of the generating vectors, there are simply too many choices to carry out an exhaustive search when n and s are large. Later we will return to this issue of constructing a good generating vector.

2.2 Random shifting and practical error estimation

We can shift the points of a lattice rule by any vector of real numbers Δ = (Δ_1, \ldots, Δ_s), to obtain a shifted lattice rule

    Q_n(f) = \frac{1}{n} \sum_{k=0}^{n-1} f\left(\left\{\frac{kz}{n} + \Delta\right\}\right).
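As a minimal Matlab sketch, here is how one might generate the points of formula (3) and of a shifted variant, vectorized as an s-by-n array; the generating vector below is just an illustrative choice, not a recommended one.

% generate a (shifted) rank-1 lattice rule, points stored as columns
n = 64; s = 2;
z = [1; 19];                                % hypothetical generating vector
P = mod(z * (0:n-1), n) / n;                % [s-by-n], kth column is {k z / n}
Delta = [0.1; 0.3];                         % the shift used in Figure 1
Pshift = mod(bsxfun(@plus, P, Delta), 1);   % shifted points, wrapped into [0,1)^s

Here bsxfun expands the shift over all n columns, and taking mod 1 is exactly the fractional part operation indicated by the braces.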

Fig. 1 Applying a (0.1, 0.3)-shift to a 64-point lattice rule in two dimensions: left, the original lattice rule; middle, moving all points by (0.1, 0.3); right, wrapping the points back inside the unit cube.

Due to the fractional part function, we may restrict the shift to Δ ∈ [0,1)^s. Figure 1 (right) illustrates the result of shifting a 64-point lattice rule in 2D by the vector (0.1, 0.3). Clearly we see that the regular structure of the lattice points is preserved.

A randomly shifted lattice rule provides an unbiased approximation of the integral (this applies to all QMC methods), while using multiple shifts allows us to obtain a practical error estimate in the same way as the Monte Carlo method. It works as follows. We generate q independent random shifts Δ^{(i)} for i = 0, \ldots, q-1 from the uniform distribution on [0,1]^s. For the same fixed lattice generating vector z, we compute the q different shifted lattice rule approximations and denote them by Q_n^{(i)}(f) for i = 0, \ldots, q-1. We take the average

    \bar{Q}_{n,q}(f) = \frac{1}{q} \sum_{i=0}^{q-1} Q_n^{(i)}(f) = \frac{1}{q} \sum_{i=0}^{q-1} \left( \frac{1}{n} \sum_{k=0}^{n-1} f\left(\left\{\frac{kz}{n} + \Delta^{(i)}\right\}\right) \right)

as our final approximation to the integral. Then an estimate for the root-mean-square error of \bar{Q}_{n,q}(f) is given by

    \sqrt{E\,|I(f) - \bar{Q}_{n,q}(f)|^2} \approx \sqrt{\frac{1}{q(q-1)} \sum_{i=0}^{q-1} \bigl(Q_n^{(i)}(f) - \bar{Q}_{n,q}(f)\bigr)^2}.

Here the expectation is taken with respect to the random shifts. The total number of function evaluations in \bar{Q}_{n,q}(f) is qn. Typically, we take q to be small, e.g., q = 16 or 32. For a fair comparison with the Monte Carlo method, we should therefore take n_MC = q n_QMC samples in the Monte Carlo method.
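As a minimal sketch, the whole procedure fits in a few lines of Matlab; f is assumed to be vectorized over the columns of an s-by-n array, as in §5.

function [Qbar, stderr] = shifted_lattice(f, z, n, q)
% randomly shifted lattice rule with q shifts and RMS error estimate
s = numel(z);
P = mod(z(:) * (0:n-1), n) / n;                  % plain lattice points, [s-by-n]
Qi = zeros(1, q);
for i = 1:q
    Pi = mod(bsxfun(@plus, P, rand(s, 1)), 1);   % ith randomly shifted point set
    Qi(i) = mean(f(Pi));                         % ith approximation Q_n^{(i)}(f)
end
Qbar = mean(Qi);                                 % the average over all shifts
stderr = sqrt(sum((Qi - Qbar).^2) / (q*(q-1)));  % RMS error estimate
end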

2.3 Fast component-by-component construction

Recall that the components of the generating vector can be restricted to the set U_n, e.g., U_n = {1, 2, 3, \ldots, n-1} when n is prime, and U_n = {1, 3, 5, \ldots, n-1} when n is a power of 2. There are far too many choices in high dimensions. Suppose that we have a computable criterion for assessing the quality of a generating vector in dimension s, denoted by E_s(z_1, \ldots, z_s), where smaller is better. Then we can use the component-by-component construction to find a generating vector:

1. Set z_1 = 1 (because in one dimension all choices are the same).
2. Choose z_2 from the set U_n so that E_2(z_1, z_2) is minimized.
3. Choose z_3 from the set U_n so that E_3(z_1, z_2, z_3) is minimized.
4. Choose z_4 from the set U_n so that E_4(z_1, z_2, z_3, z_4) is minimized.
...

The fact that such a greedy algorithm can produce good generating vectors is justified by theory, and we will say more about this in §2.5. The computational cost of the algorithm depends on the form of the criterion E_s(z_1, \ldots, z_s). We have the fast component-by-component construction: in some favourable situations the cost is O(s n \log n) operations, i.e., linear in the dimension s and almost linear in the number of points n. This means that we can really construct generating vectors in tens of thousands of dimensions with millions of points!

The magic behind the fast component-by-component construction is that in many cases the algorithm requires the evaluation of a matrix-vector multiplication with a matrix of the form

    \left[ \omega\left(\frac{kz \bmod n}{n}\right) \right]_{z \in U_n, \; 1 \le k \le n-1}

for some function ω. When n is prime, we can permute the rows and columns of this matrix to obtain a circulant matrix, so that the matrix-vector multiplication, which typically requires O(n^2) operations, can be done in O(n \log n) operations using Fast Fourier Transforms. When n is not prime it gets more complicated, but similar cost savings can be made. Figure 2 illustrates the structure of such a matrix (left) for n = 53 and the corresponding matrix after permutation (right).

2.4 Lattice sequences

Recall that the formula for obtaining the kth point of an n-point lattice rule with generating vector z is

    t_k = \left\{ \frac{k}{n} z \right\}.    (4)

Fig. 2 Circulant permutation for n = 53 (prime) for the fast component-by-component construction.

This gives rise to a so-called closed QMC method: the generation of the points depends on knowing n in advance. This is inconvenient in practice, because if we want to change the number of points we would need to generate all of the points from scratch. An open QMC method, on the contrary, allows you to keep adding points as you wish while keeping all existing points; such methods are therefore referred to as sequences and are also said to be extensible. In a lattice sequence in base 2, the formula is changed to

    t_k = \{ \phi_2(k) \, z \},    (5)

where \phi_2(\cdot) is the radical inverse function in base 2: loosely speaking, if we have the index k = (k_2 k_1 k_0)_2 in binary representation, then \phi_2(k) = (0.k_0 k_1 k_2)_2 is obtained by mirroring the bits of k around the binary point. For example, if k = 6 = (110)_2 then \phi_2(k) = (0.011)_2 = 0.375. The formula (5) does not require you to know n in advance, and so in practice you can keep adding points to your lattice rule approximation until you are satisfied with the error.

When n = 2^m for any m ≥ 1, the formulas (4) and (5) produce the same set of points; only the ordering of the points is different. Therefore, if you want the points of an extensible lattice rule only at exact powers of 2, you can avoid the radical inverse function and still use the formula (4) to get your points. For example, if you already have n = 2^m points for some m, then to double the number of points all you need to do is use the formula (4) with n replaced by 2n, and then consider only those points generated by the odd indices k. All of the above extends trivially to base b ≥ 2.

We know how to construct a good generating vector for a lattice sequence using the fast component-by-component construction, i.e., this generating vector can be used for many different values of n. Figure 3 illustrates the nested structure in the matrices when we work with powers of 2 and how we exploit the nested circulant structure.
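A minimal Matlab sketch of the radical inverse in base 2, looping over the bits of a nonnegative integer index k:

function v = radical_inverse2(k)
% mirror the binary digits of k around the binary point
v = 0; f = 0.5;
while k > 0
    v = v + f * mod(k, 2);   % append the lowest remaining bit of k
    k = floor(k / 2);
    f = f / 2;
end
end

For example, radical_inverse2(6) returns 0.375, and the kth point of a lattice sequence as in (5) is then mod(radical_inverse2(k) * z, 1) for a generating (column) vector z.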

Fig. 3 Number theoretic permutations on a matrix with n = 128 (a power of 2) for the fast component-by-component construction: (1) natural ordering of the indices; (2) grouping on divisors; (3) generator ordering of the indices; (4) symmetric reduction after application of the B_2 kernel function.

2.5 A taste of the theoretical error analysis*

Here we discuss some key elements of the theory and construction of lattice rules. It is not necessary to understand all of this to be able to use lattice rules. We therefore mark this subsection as optional material (*). The reader may skip to the next section.

Weighted Sobolev space: what kind of integrands can we handle? In the modern analysis of randomly shifted lattice rules, we assume that the integrand f belongs to a weighted Sobolev space of functions whose mixed first derivatives are square-integrable, with the norm given by

    \|f\|_\gamma^2 = \sum_{u \subseteq \{1,\ldots,s\}} \frac{1}{\gamma_u} \int_{[0,1]^{|u|}} \left( \int_{[0,1]^{s-|u|}} \frac{\partial^{|u|} f}{\partial x_u}(x) \, dx_{-u} \right)^2 dx_u.    (6)

There are different variants of the norm, but ultimately it is a way to measure the regularity and variability of the function.

Okay, this is a hell of a formula to take in. Let us explain what it means step by step. There are 2^s possible subsets u of the coordinate indices {1, \ldots, s}. Let us pick a simple example first, say, s = 5 and u = {1, 3, 4}. Then we separate the active variables x_u = (x_1, x_3, x_4) from the inactive variables x_{-u} = (x_2, x_5), and consider

    \frac{\partial^{|u|} f}{\partial x_u} = \frac{\partial^3 f}{\partial x_1 \partial x_3 \partial x_4}.

This is called a mixed first derivative because we never differentiate more than once with respect to each variable, even though it looks like a 3rd order derivative in the regular sense. According to the norm, we should integrate out the inactive variables, square the result, and then integrate out the active variables:

    \int_0^1 \int_0^1 \int_0^1 \left( \int_0^1 \int_0^1 \frac{\partial^3 f}{\partial x_1 \partial x_3 \partial x_4}(x_1, x_3, x_4; x_2, x_5) \, dx_2 \, dx_5 \right)^2 dx_1 \, dx_3 \, dx_4.    (7)

We do this for each of the 2^s subsets of {1, \ldots, s} and then sum up the results, but with weights \gamma_u > 0 acting as relative scaling. A large value for (7) means that f is more variable in the projection onto (x_1, x_3, x_4), and we need a larger weight \gamma_{\{1,3,4\}} to compensate for it in the norm if we want f to have norm 1. We denote the norm with a subscript γ to emphasize the important role played by the weights. We will see below that under appropriate conditions on the weights we can obtain error bounds that are independent of the dimension s. In practice we would choose the weights to match the characteristics of a given integrand.

The simplest form of weights are the so-called product weights: we assume that there is one weight \gamma_j > 0 associated with each variable x_j, so that \gamma_u = \prod_{j \in u} \gamma_j, e.g., \gamma_{\{1,3,4\}} = \gamma_1 \gamma_3 \gamma_4. Typically we also assume that \gamma_1 \ge \gamma_2 \ge \cdots > 0, indicating that the variables are labeled in the order of decreasing importance. Another form of weights that has become popular in recent times is called POD weights, or product and order dependent weights. There is an additional sequence of numbers \Gamma_\ell such that \gamma_u = \Gamma_{|u|} \prod_{j \in u} \gamma_j, i.e., the weights have an extra multiplying factor which depends on the number of elements in the set u, hence the name order dependent. POD weights arise from some PDE applications and often some factorials |u|! appear; we won't discuss them further here.

Worst case error: how do we assess the quality of a lattice rule? The worst case error for a shifted lattice rule in our weighted Sobolev space is defined to be the largest possible error for any function with norm at most 1, i.e.,

    e_\gamma(z, \Delta) := \sup_{\|f\|_\gamma \le 1} |I(f) - Q_n(f)|.

This means that for any given f in our weighted Sobolev space, we have the lattice rule error bound

    |I(f) - Q_n(f)| \le e_\gamma(z, \Delta) \, \|f\|_\gamma.

For a randomly shifted lattice rule, we have the root-mean-square error bound

    \sqrt{E\,|I(f) - Q_n(f)|^2} \le e^{\rm sh}_\gamma(z) \, \|f\|_\gamma,    (8)

where the expectation is with respect to the random shift Δ, and where

    e^{\rm sh}_\gamma(z) := \sqrt{ \int_{[0,1]^s} e_\gamma^2(z, \Delta) \, d\Delta }

is called the shift-averaged worst case error. Notice the separation of the dependence of the error bound on the points from the dependence on the integrand, similarly to (2), but here the weights enter both factors. There is a trade-off: large weights lead to a small norm but a large worst case error, and vice versa.

Our weighted Sobolev space happens to be a reproducing kernel Hilbert space. We will not go into any details here, other than saying that this provides a very powerful set of tools for analysis, and that we have explicit computable formulas for e_\gamma(z, \Delta) and e^{\rm sh}_\gamma(z). In particular, with product weights we know that

    [e^{\rm sh}_\gamma(z)]^2 = -1 + \frac{1}{n} \sum_{k=0}^{n-1} \prod_{j=1}^{s} \left( 1 + \gamma_j B_2\left(\left\{\frac{k z_j}{n}\right\}\right) \right),    (9)

where B_2(x) = x^2 - x + 1/6 for x ∈ [0,1] is the Bernoulli polynomial of degree 2.

Component-by-component construction: how do we find a good lattice generating vector? Given weights \gamma_j and a generating vector z, we can evaluate (9) in O(ns) operations. Theoretically we could do this for each of the (n-1)^s choices of generating vectors when n is prime and then pick the vector with the smallest worst case error. This is however not practically possible when s is large. We will choose the generating vector by the component-by-component construction: given n, s_max, and weights \gamma_u,

1. Set z_1 = 1.
2. For s = 2, 3, \ldots, s_max, choose z_s in U_n to minimize [e^{\rm sh}_\gamma(z_1, \ldots, z_{s-1}, z_s)]^2.

A naive implementation of (9) and of this algorithm is sketched below.
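The following is a minimal Matlab sketch, assuming product weights given as a row vector gamma and assuming wce2 is saved in its own file wce2.m. It evaluates the squared shift-averaged worst case error (9) directly, and runs the component-by-component search by brute force over all candidates, at cost O(n^2 s) per dimension; the fast construction replaces this inner search by FFTs but is beyond this sketch.

function e2 = wce2(z, n, gamma)
% squared shift-averaged worst case error (9) for product weights
x = mod((0:n-1)' * z(:)', n) / n;     % {k z_j / n}, [n-by-s]
B2 = x.^2 - x + 1/6;                  % Bernoulli polynomial of degree 2
e2 = -1 + mean(prod(1 + bsxfun(@times, gamma(1:numel(z)), B2), 2));
end

% naive component-by-component search for n a power of 2
n = 2^10; smax = 10; gamma = 0.9.^(1:smax);   % hypothetical weights
z = 1;                                        % z_1 = 1
for s = 2:smax
    cand = 1:2:n-1;                           % U_n = odd numbers for n a power of 2
    e2 = arrayfun(@(zs) wce2([z zs], n, gamma), cand);
    [~, i] = min(e2);
    z = [z cand(i)];                          % keep the minimizer
end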

To prove that this greedy algorithm gives a good lattice rule we use mathematical induction, combined with the so-called averaging argument. First we present a simple result to illustrate the proof technique.

Theorem 1. Let n ≥ \gamma_1/6 be prime. A lattice rule can be constructed by the component-by-component algorithm such that

    [e^{\rm sh}_\gamma(z_1, \ldots, z_{s-1}, z_s)]^2 \le \frac{1}{n} \prod_{j=1}^{s} \left( 1 + \frac{\gamma_j}{6} \right).

By taking the square root on both sides, we see that this simple result gives us only the O(1/\sqrt{n}) convergence rate, the same as the Monte Carlo method. On the other hand, since

    \prod_{j=1}^{s} \left( 1 + \frac{\gamma_j}{6} \right) = \exp\left( \sum_{j=1}^{s} \log\left( 1 + \frac{\gamma_j}{6} \right) \right) \le \exp\left( \frac{1}{6} \sum_{j=1}^{s} \gamma_j \right),

we see that our error bound can be independent of the dimension s provided that the sum of the infinite sequence of weights is finite, i.e.,

    \sum_{j=1}^{\infty} \gamma_j < \infty.

Here is a brief synopsis of the proof. From (9) we can write the error expression in s dimensions in terms of the error expression in s-1 dimensions as follows:

    [e^{\rm sh}_\gamma(z_1, \ldots, z_{s-1}, z_s)]^2 = [e^{\rm sh}_\gamma(z_1, \ldots, z_{s-1})]^2 + \frac{\gamma_s}{n} \sum_{k=0}^{n-1} \left[ B_2\left(\left\{\frac{k z_s}{n}\right\}\right) \prod_{j=1}^{s-1} \left( 1 + \gamma_j B_2\left(\left\{\frac{k z_j}{n}\right\}\right) \right) \right].

Then we take the average of this expression over all choices of z_s from U_n. When n is prime this means that we take

    A(z_1, \ldots, z_{s-1}) = \frac{1}{n-1} \sum_{z_s = 1}^{n-1} [e^{\rm sh}_\gamma(z_1, \ldots, z_{s-1}, z_s)]^2.

Since the only dependence on z_s is in the first B_2 factor, we end up having to compute

    \frac{1}{n-1} \sum_{z_s = 1}^{n-1} B_2\left(\left\{\frac{k z_s}{n}\right\}\right),

which equals 1/6 if k = 0 and -1/(6n) otherwise. We combine this with the induction hypothesis on [e^{\rm sh}_\gamma(z_1, \ldots, z_{s-1})]^2 to show that

    A(z_1, \ldots, z_{s-1}) \le \frac{1}{n} \prod_{j=1}^{s} \left( 1 + \frac{\gamma_j}{6} \right).

Now since we take z_s to be the value that minimizes [e^{\rm sh}_\gamma(z_1, \ldots, z_{s-1}, z_s)]^2, it must be bounded by the average A(z_1, \ldots, z_{s-1}) and in turn bounded by the required upper bound.

A more sophisticated averaging argument can be used to prove that we can get close to O(1/n) convergence. We state the result below for general weights \gamma_u and general n.

Theorem 2. A lattice rule can be constructed by the component-by-component algorithm such that

    e^{\rm sh}_\gamma(z) \le \left( \frac{1}{|U_n|} \sum_{\emptyset \ne u \subseteq \{1,\ldots,s\}} \gamma_u^\lambda \left( \frac{2\zeta(2\lambda)}{(2\pi^2)^\lambda} \right)^{|u|} \right)^{1/(2\lambda)}

for all λ ∈ (1/2, 1], where \zeta(x) = \sum_{k=1}^{\infty} k^{-x} is the Riemann zeta function. We have |U_n| = n-1 when n is prime, |U_n| = n/2 when n is a power of 2, and more generally |U_n| ≥ n/2 when n is a power of a prime.

The convergence rate close to O(1/n) is obtained by taking the parameter λ arbitrarily close to 1/2. This imposes stronger decay requirements on the weights \gamma_u if we want to end up with a bound that is independent of s. In particular, if we have product weights, then to have a convergence rate close to O(1/n) we need

    \sum_{j=1}^{\infty} \sqrt{\gamma_j} < \infty.

We concede that this theorem is way too technical for the purpose of this introduction, but we just want to provide a taste of the analysis, as the heading of this subsection foreshadowed.

3 Digital nets

Here we take a very informal approach to introduce the family of digital nets.

3.1 Digital net property

Loosely speaking, the general principle of digital nets is all about getting the same number of points in various allowable sub-divisions of the unit cube. This is similar in spirit to the Sudoku game! Figure 4 illustrates the digital net property in 2D with 16 points. We can partition the unit square into 16 rectangles of the same shape and size. There is exactly one point in each rectangle (points on the top and right boundaries count toward the next rectangle), and this must hold for all of the 5 possible ways to sub-divide the unit square.

Fig. 4 Illustration of a (0,4,2)-net in base 2: every elementary interval of volume 1/16 contains exactly one of the 16 points. A point that lies on a dividing line counts toward the interval above or to the right.

This property generalizes to base b ≥ 2: instead of halving each time, we subdivide into b equal partitions. The property also generalizes to include a quality parameter called the t-value: each allowable sub-division (formally called an elementary interval) contains exactly b^t points. The smaller t is, the finer we can sub-divide the unit cube, and the more uniformly distributed the points are. Such a point set with n = b^m points in s dimensions is called a (t,m,s)-net. Figure 4 is an example of a (0,4,2)-net in base 2. A (t,s)-sequence is a sequence of points in s dimensions such that if we chop the sequence into consecutive blocks of b^m points then every block is a (t,m,s)-net.

3.2 Digital construction

Needless to say, we cannot design digital nets in high dimensions by hand drawing rectangles or boxes. We construct digital nets by a digital construction scheme. Recall that to construct lattice rules we need a generating vector of integers, one integer per dimension. To construct a digital net we need a vector of generating matrices C_1, \ldots, C_s, one generating matrix per dimension. Here is how it works in base 2 (it easily generalizes to base b ≥ 2). Suppose we want n = 2^m points. To get the jth component of the kth point, we write k = (k_{m-1} \cdots k_1 k_0)_2 in binary representation, take the m × m binary matrix C_j for dimension j, and compute

    \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} = C_j \begin{pmatrix} k_0 \\ k_1 \\ \vdots \\ k_{m-1} \end{pmatrix},    (10)

where all additions and multiplications are carried out modulo 2. Then the jth component of the kth point is (0.y_1 y_2 \cdots y_m)_2.

Just as the choice of generating vector determines the quality of a lattice rule, here the choice of the generating matrices determines the quality of a digital net: the corresponding t-value of the net can be small (good) or large (bad). In the case of a lattice rule we need only one integer value z_j in dimension j, but here we need to specify m^2 entries for the binary matrix C_j. Finding good generating matrices can be a difficult task. Below we discuss two special cases of digital net construction. Before we proceed, note that we can think of C_j in (10) as the top-left hand corner of some bigger matrix, and it does not even have to be a square matrix.
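A minimal Matlab sketch of (10): given an m-by-m binary generating matrix C, compute the jth component of the kth point.

function x = digital_point(C, k)
% one component of the kth digital net point in base 2
m = size(C, 1);
kbits = mod(floor(k ./ 2.^(0:m-1)), 2)';   % bits (k_0, ..., k_{m-1}) as a column
y = mod(C * kbits, 2);                     % digits y_1, ..., y_m, modulo 2
x = y' * 2.^(-(1:m))';                     % assemble x = (0.y_1 y_2 ... y_m)_2
end

As a sanity check, taking C to be the identity matrix reproduces the radical inverse of §2.4: digital_point(eye(4), 6) returns 0.375.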

3.3 Sobol' sequences

Sobol' points are a popular example of digital nets in base 2, and they have been around since long before the general concept of digital nets took shape. (Sobol' is a Russian name; the apostrophe is not a typo! It denotes a soft pronunciation of the letter l.) To generate Sobol' points, we need one primitive polynomial and some initial direction numbers for every dimension. Primitive polynomials have specific properties which we will not go into here, but it is well known how many there are of a given degree and also what they are. Since we need a different primitive polynomial for each dimension, and since the quality of the Sobol' points deteriorates when the degree of the polynomial increases, we arrange all the primitive polynomials in order of increasing degree so that we use up all the lower degree polynomials first. The initial direction numbers are used to kick start a recurrence relation involving the coefficients of the primitive polynomial in each dimension. These eventually lead to the entries of the generating matrix C_j. Again we will not go into the technical details here. Many software packages include implementations of Sobol' generators, e.g., Matlab, NAG, QuantLib. We also provide our own Sobol' generators for more than 20,000 dimensions.

3.4 Polynomial lattice rules

Another good way to get digital nets is by the polynomial lattice rule construction. They are actually digital nets rather than lattice rules, but in some formulation they mimic lattice rules, hence the name. Instead of having one generating vector of integers z_1, \ldots, z_s, we need a generating vector of polynomials q_1(\chi), \ldots, q_s(\chi), one polynomial per dimension. Let p(\chi) be a polynomial of degree m with binary coefficients. In dimension j, we have a polynomial q_j(\chi) of degree at most m-1 with binary coefficients. We find the binary digits u_1, u_2, \ldots in

    \frac{q_j(\chi)}{p(\chi)} = \frac{u_1}{\chi} + \frac{u_2}{\chi^2} + \frac{u_3}{\chi^3} + \cdots

by equating coefficients in q_j(\chi) = (u_1/\chi + u_2/\chi^2 + u_3/\chi^3 + \cdots) \, p(\chi), noting that all additions and multiplications are to be done modulo 2. Then we set

    C_j = \begin{pmatrix}
    u_1 & u_2 & u_3 & \cdots & u_m \\
    u_2 & u_3 & & & u_{m+1} \\
    u_3 & & & & \vdots \\
    \vdots & & & & \\
    u_m & u_{m+1} & u_{m+2} & \cdots & u_{2m-1}
    \end{pmatrix}.
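A minimal Matlab sketch of this construction, assuming polynomials are given as binary coefficient vectors with the highest degree first (e.g., [1 0 0 1 1] for \chi^4 + \chi + 1, so p has length m+1 and q has length at most m): the digits u_1, u_2, \ldots come from long division of q_j(\chi) by p(\chi) over GF(2), and C_j is then the Hankel matrix above.

function C = pollat_genmat(q, p, m)
% generating matrix of a polynomial lattice rule in base 2
u = zeros(1, 2*m-1);                 % digits u_1, ..., u_{2m-1}
r = [zeros(1, m+1-numel(q)), q];     % remainder, aligned to degree m
for i = 1:2*m-1
    r = [r(2:end), 0];               % multiply the remainder by chi
    u(i) = r(1);                     % next digit = coefficient of chi^m
    if u(i)
        r = mod(r + p, 2);           % subtract p over GF(2) (an XOR)
    end
end
C = hankel(u(1:m), u(m:2*m-1));      % C(i,k) = u_{i+k-1}
end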

The polynomial p(\chi) is called the modulus, and it does not play a crucial role. The quality of a polynomial lattice rule is determined by the choice of the generating polynomials q_1(\chi), \ldots, q_s(\chi). Nowadays we have the theory and algorithms to find good polynomials using the fast component-by-component construction, analogously to lattice rules. All of this generalizes to base b ≥ 2.

3.5 Random digital shifting and scrambling

To preserve the digital net property, we need a different kind of randomization strategy than shifting, which preserves the lattice structure. One simple strategy is digital shifting. Instead of taking {t_k + Δ} for the kth point, we compute t_k ⊕ Δ, which means we carry out the exclusive-or operation on the binary bits of the vector components. For example, if x = 0.625 = (0.101)_2 and y = 0.125 = (0.001)_2, then x ⊕ y = (0.100)_2 = 0.5 (see the sketch at the end of this section).

Scrambling is a more sophisticated randomization technique which can improve the convergence rate of digital nets by an extra factor of 1/\sqrt{n} in some circumstances. Figure 5 illustrates the concept of scrambling as a sequence of pictures in 2D where slices are randomly swapped following some allowable conditions that preserve the digital net property. As for lattice rules, randomization of digital nets provides an unbiased estimate of the integral as well as a practical error estimate.

3.6 Higher order nets by interlacing

There is a strategy called interlacing which can turn a regular digital net into a higher order digital net. Higher order digital nets can achieve O(1/n^α) convergence if the integrand is roughly α times differentiable in each variable. We need a different function space setting to the one in §2.5, and the theory is quite challenging. Conceptually, to get a higher order digital net in s dimensions with interlacing factor α, we take a regular digital net in α·s dimensions, and then interlace every block of α dimensions. Interlacing works as follows: if we have x = (0.x_1 x_2 x_3)_2, y = (0.y_1 y_2 y_3)_2 and z = (0.z_1 z_2 z_3)_2, then the result of interlacing these three numbers is (0.x_1 y_1 z_1 x_2 y_2 z_2 x_3 y_3 z_3)_2. This corresponds to an interlacing factor of 3, and we end up with a number that has three times as many bits as the original numbers.
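Two minimal Matlab sketches of the bit manipulations just described. The first mirrors the digital shift example above, XORing numbers with m = 3 binary digits as integers; the second interlaces the digits of alpha numbers, as in the three-number example.

% digital shift via integer XOR: 0.625 xor 0.125 for m = 3 digits
m = 3;
bitxor(0.625 * 2^m, 0.125 * 2^m) / 2^m    % returns 0.5

function v = interlace_digits(xs, m)
% interlace the first m binary digits of the numbers in the vector xs
alpha = numel(xs);
digits = mod(floor(xs(:) * 2.^(1:m)), 2);  % [alpha-by-m] array of binary digits
v = digits(:)' * 2.^(-(1:alpha*m))';       % read the digits column by column
end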

Fig. 5 Owen's scrambling in base 2: (a) original; (b) swap left and right halves; (c) swap 3rd and 4th vertical quarters; (d) swap 3rd and 4th, 7th and last vertical eighths; (e) swap 3rd and 4th, 7th and 8th, 9th and 10th, 15th and last sixteenths; (f) swap 1st and 2nd horizontal quarters; (g) swap 1st and 2nd, 5th and 6th, 7th and last horizontal eighths; (h) swap 3rd and 4th, 7th and 8th, 9th and 10th, 15th and last horizontal sixteenths.

In practice an efficient way to implement higher order digital nets is by interlacing the rows of the generating matrices of the regular digital net, and then generating the points from these expanded matrices by allowing non-square matrices in (10). Note that precision is a practical issue for higher order digital nets: if we want n = 2^m points with interlacing factor α, then under standard double precision machine numbers we can only manage αm ≤ 53. For example, we can only get up to order α = 3 with 2^16 points.
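A minimal sketch of this row interlacing, assuming the α generating matrices of one block of dimensions are collected in a cell array with equal row counts; the rows of the result are taken from the input matrices in turn.

function Cint = interlace_rows(Cs)
% row-interlace the generating matrices of one block of alpha dimensions
alpha = numel(Cs);
Cint = zeros(alpha * size(Cs{1}, 1), size(Cs{1}, 2));
for a = 1:alpha
    Cint(a:alpha:end, :) = Cs{a};   % rows a, a+alpha, a+2*alpha, ...
end
end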

4 Toy applications

In this section we outline three integrals which arise from grossly simplified models of practical applications. We begin with a discussion of the transformation needed to bring the integral into the unit cube.

4.1 Transformation to the unit cube

Question 1. Given an integral \int_{-\infty}^{\infty} g(y) \phi(y) \, dy, where φ : R → R is some univariate probability density function, i.e., \phi(y) \ge 0 for all y ∈ R and \int_{-\infty}^{\infty} \phi(y) \, dy = 1, how do we transform the integral into [0,1]?

Answer. Let Φ : R → [0,1] denote the cumulative distribution function of φ, i.e.,

    \Phi(y) = \int_{-\infty}^{y} \phi(t) \, dt,

and let \Phi^{-1} : [0,1] → R denote its inverse. Then we use the substitution (or change of variables)

    x = \Phi(y), \qquad y = \Phi^{-1}(x),

to obtain

    \int_{-\infty}^{\infty} g(y) \phi(y) \, dy = \int_0^1 g(\Phi^{-1}(x)) \, dx = \int_0^1 f(x) \, dx,

with the transformed integrand f := g \circ \Phi^{-1}.
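A minimal Matlab sketch for the standard normal density: norminv (from the Statistics Toolbox) is the inverse cumulative distribution function \Phi^{-1}, and the integrand g below is just a hypothetical example. The mapping is applied componentwise, anticipating Question 3 below.

s = 4; n = 1000;
x = rand(s, n);                 % points in the unit cube (QMC points also work)
y = norminv(x);                 % y = Phi^{-1}(x), componentwise
g = @(y) exp(-sum(y.^2, 1));    % hypothetical vectorized integrand
Q = mean(g(y));                 % estimate of the original integral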

Question 2. Is this the only way?

Answer. No. We can divide and multiply by any other probability density function \tilde\phi, and then map to [0,1] using its inverse cumulative distribution function \tilde\Phi^{-1}:

    \int_{-\infty}^{\infty} g(y) \phi(y) \, dy = \int_{-\infty}^{\infty} \frac{g(y)\phi(y)}{\tilde\phi(y)} \, \tilde\phi(y) \, dy = \int_{-\infty}^{\infty} \tilde{g}(y) \, \tilde\phi(y) \, dy = \int_0^1 \tilde{g}(\tilde\Phi^{-1}(x)) \, dx = \int_0^1 \tilde{f}(x) \, dx, \qquad \tilde{g}(y) := \frac{g(y)\phi(y)}{\tilde\phi(y)},

giving a different transformed integrand \tilde{f} := \tilde{g} \circ \tilde\Phi^{-1}. Ideally we would like to use a density function which leads to an easy integrand in the unit cube. This is related to the concept of importance sampling for the Monte Carlo method.

Question 3. How does this transformation generalize to s dimensions?

Answer. If we have a product of univariate densities, then we can apply the mapping \Phi^{-1} componentwise,

    y = \Phi^{-1}(x) = (\Phi^{-1}(x_1), \ldots, \Phi^{-1}(x_s)),

to obtain

    \int_{\mathbb{R}^s} g(y) \prod_{j=1}^{s} \phi(y_j) \, dy = \int_{[0,1]^s} g(\Phi^{-1}(x)) \, dx = \int_{[0,1]^s} f(x) \, dx.

Remember that we can always divide and multiply to get such a product:

    \int_{\mathbb{R}^s} (\text{some ugly function of } y) \, dy = \int_{\mathbb{R}^s} \frac{(\text{some ugly function of } y)}{\prod_{j=1}^{s} \phi(y_j)} \prod_{j=1}^{s} \phi(y_j) \, dy.

Question 4. How do we tackle the multivariate normal density which occurs in many integrals from practical models?

Answer. If the multivariate normal density is the dominating part of the entire integrand, then factorize the covariance matrix Σ, i.e., find an s × s matrix A such that

    \Sigma = A A^\top,    (11)

and then use the substitution (treating all vectors as column vectors) y = Az followed by z = \Phi^{-1}(x) to obtain

    \int_{\mathbb{R}^s} g(y) \frac{\exp(-\frac{1}{2} y^\top \Sigma^{-1} y)}{\sqrt{(2\pi)^s \det(\Sigma)}} \, dy    (12)

    = \int_{\mathbb{R}^s} g(Az) \frac{\exp(-\frac{1}{2} z^\top z)}{\sqrt{(2\pi)^s}} \, dz = \int_{\mathbb{R}^s} g(Az) \prod_{j=1}^{s} \frac{\exp(-\frac{1}{2} z_j^2)}{\sqrt{2\pi}} \, dz = \int_{[0,1]^s} g(A \Phi^{-1}(x)) \, dx = \int_{[0,1]^s} f(x) \, dx.

The factorization (11) is not unique. Two obvious choices are

1. the Cholesky factorization with lower triangular matrix A, or
2. the principal components factorization, which is given by A = [\sqrt{\lambda_1}\,\eta_1 \; \cdots \; \sqrt{\lambda_s}\,\eta_s], where (\lambda_j, \eta_j)_{j=1}^{s} denote the eigenpairs of Σ, with ordered eigenvalues \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_s and unit-length column eigenvectors \eta_1, \ldots, \eta_s.

Other choices are possible.
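Both factorizations are one-liners in Matlab; below is a minimal sketch with a hypothetical covariance matrix (norminv again assumed from the Statistics Toolbox).

s = 4;
Sigma = min(repmat(1:s, s, 1), repmat((1:s)', 1, s)) / s;  % hypothetical covariance

A_chol = chol(Sigma, 'lower');            % Cholesky: lower triangular, A*A' = Sigma

[V, D] = eig(Sigma);                      % principal components factorization
[lambda, idx] = sort(diag(D), 'descend');
A_pca = V(:, idx) * diag(sqrt(lambda));   % columns are sqrt(lambda_j)*eta_j

x = rand(s, 1);                           % a point in the unit cube
y = A_pca * norminv(x);                   % normal sample with covariance Sigma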

Question 5. What if the multivariate normal density is not the dominating part of the entire integrand?

Answer. In that case, other transformation steps would be required to capture the main features of the entire integrand.

4.2 Option pricing

Following the Black-Scholes model, the integral arising from pricing an arithmetic average Asian call option takes the general form of (12), with

    g(y) = e^{-rT} \max\left( \frac{1}{s} \sum_{j=1}^{s} S_{t_j}(y_j) - K, \; 0 \right)

and

    S_{t_j}(y_j) = S_0 \exp\left( \left( r - \frac{\sigma^2}{2} \right) \frac{jT}{s} + \sigma y_j \right),

where r is the risk-free interest rate, σ is the volatility, and S_0 is the initial asset price. The variables y = (y_1, \ldots, y_s) correspond to a discretization of the underlying Brownian motion over a time interval [0,T], and the covariance matrix has entries \Sigma_{ij} = (T/s) \min(i,j). The payoff function g(y) compares the average of the asset prices S_{t_j} at the discrete times with the strike price K, and takes their difference if it is positive, or the value 0 if the difference is negative.

It is widely accepted that QMC methods work especially well for such problems if we take the principal components approach to factorize the covariance matrix Σ. The success of QMC for option pricing problems cannot be explained by the standard theory due to the kink in the integrand. Lots of new QMC theory has been developed with this problem in mind.

Parameters for numerical experiment: S_0 = 100 (dollars), K = 100 (dollars), T = 1 (year), r = 0.1, σ = 0.2, and s = 256.
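A minimal sketch of this payoff in vectorized Matlab, assuming the parameters above; each column of y is one discretized Brownian path, obtained, e.g., as y = A*norminv(x) with A a factorization of Σ from the previous subsection.

function val = asian_payoff(y, S0, K, T, r, sigma)
% discounted arithmetic average Asian call payoff, vectorized over columns
s = size(y, 1);
drift = (r - sigma^2/2) * (1:s)' * (T/s);       % (r - sigma^2/2) * t_j
S = S0 * exp(bsxfun(@plus, drift, sigma * y));  % asset prices, [s-by-n]
val = exp(-r*T) * max(mean(S, 1) - K, 0);       % payoff, [1-by-n]
end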

4.3 Maximum likelihood

One example of a time series Poisson likelihood model involves an integral of the form (12), with

    g(y) = \prod_{j=1}^{s} \frac{\exp(\tau_j(\beta + y_j) - e^{\beta + y_j})}{\tau_j!}.

Here β ∈ R is a model parameter, \tau_1, \ldots, \tau_s ∈ {0, 1, \ldots} are the count data, and Σ is a Toeplitz covariance matrix with \Sigma_{ij} = \sigma^2 \kappa^{|i-j|} / (1 - \kappa^2), where \sigma^2 is the variance and κ ∈ (-1, 1) is the autoregression coefficient.

The obvious way to transform this integral into the unit cube by factorizing Σ would yield a very spiky function f. Instead, it is better to consider g(y) and the multivariate normal density together and then perform some change of variables with the effect of recentering and rescaling the whole integrand, before mapping to the unit cube. We have some new QMC theory that can explain the success of this approach.

Parameters for numerical experiment: β = 0.7, \sigma^2 = 0.3, κ = 0.5, and τ = (2,0,3,2,0,2,1,4,2,1,8,5,2,3,6,2,2,0,0,1,0,7,2,5,1) for s = 25 (we have more data beyond 25 dimensions). This example came from Kuo, Dunsmuir, Sloan, Wand & Womersley (2008).
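A minimal sketch of this integrand in vectorized Matlab, assuming the parameters above; gammaln evaluates log(\tau_j!) stably, and the product over dimensions is computed as a sum of logarithms before exponentiating.

function val = poisson_lik(y, beta, tau)
% Poisson time series likelihood integrand, vectorized over columns of y
t = repmat(tau(:), 1, size(y, 2));                    % counts, [s-by-n]
logg = t .* (beta + y) - exp(beta + y) - gammaln(t + 1);
val = exp(sum(logg, 1));                              % product over j, [1-by-n]
end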

4.4 PDE with a random coefficient

We consider the following model of a parametric elliptic Dirichlet problem:

    -\nabla_x \cdot (a(x,y) \nabla_x u(x,y)) = f(x) \quad \text{for } x \in D \text{ and } y \in [-\tfrac{1}{2}, \tfrac{1}{2}]^s,    (13)
    u(x,y) = 0 \quad \text{for } x \in \partial D,

where D ⊂ R^d, d ∈ {1,2,3}, is a bounded spatial domain with a Lipschitz boundary ∂D, and where the parametric diffusion coefficient is (after truncating an infinite series to s terms)

    a(x,y) = \bar{a}(x) + \sum_{j=1}^{s} y_j \, \psi_j(x).    (14)

The parameter vector y is distributed on [-\frac{1}{2}, \frac{1}{2}]^s with the uniform probability measure. We assume that \bar{a} \in L^\infty(D) and \sum_{j=1}^{\infty} \|\psi_j\|_{L^\infty(D)} < \infty, and that 0 < a_min ≤ a(x,y) ≤ a_max < ∞ for all x and y. We refer to this as the uniform case. We are interested in the expected value of some bounded linear functional G of the solution u, which is an integral of the form

    \int_{[-\frac{1}{2},\frac{1}{2}]^s} G(u(\cdot, y)) \, dy.

Note that in this problem we are integrating with respect to the parameter vector y, and not the spatial variables x. A QMC Finite Element approximation to this integral is

    \frac{1}{n} \sum_{k=0}^{n-1} G\bigl(u_h(\cdot, t_k - \tfrac{1}{2})\bigr),

where u_h denotes the Finite Element weak solution. Essentially, we generate a number of QMC points (either deterministic or randomized) and translate them to the cube [-\frac{1}{2}, \frac{1}{2}]^s. Each such translated QMC point gives a different value of y; we solve the corresponding PDE and then apply the functional G. We finally take the average of all solutions.

By now there is a large body of literature on applying QMC methods to these and related problems, including the so-called lognormal case which gives rise to an integral over R^s with a normal density. QMC methods are relatively new to these applications and have proven to be very competitive with other well established methods.

Parameters for numerical experiment: f(x) = 100 x_1, G(u(\cdot, y)) = \int_D u(x,y) \, dx, d = 2, \bar{a}(x) = 1, s = 100 (or any other number), and

    \psi_j(x) = \lambda_j \sin(k_{1,j} \pi x_1) \sin(k_{2,j} \pi x_2),

where the sequence of pairs ((k_{1,j}, k_{2,j}))_{j \ge 1} is an ordering of the elements of \mathbb{Z}_+ \times \mathbb{Z}_+ such that \lambda_j = (k_{1,j}^2 + k_{2,j}^2)^{-2} is a non-increasing sequence. (In other words, we form the pairs of positive integers, order them according to the reciprocal of the sum of the squared components, and then keep the first s pairs.) This example came from Dick, Kuo, Le Gia & Schwab (2016).
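A minimal Matlab sketch of the ordering described in the parameters above, assuming a maximum index K chosen large enough that the first s pairs are found:

K = 32; s = 100;
[k1, k2] = meshgrid(1:K, 1:K);            % candidate pairs of positive integers
lambda = (k1(:).^2 + k2(:).^2).^(-2);
[lambda, idx] = sort(lambda, 'descend');  % non-increasing lambda_j
k1 = k1(idx(1:s)); k2 = k2(idx(1:s));     % keep the first s pairs
lambda = lambda(1:s);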

5 Software demo

We will demonstrate how to apply QMC methods to your favorite integrals/expectations. We will consider a simple test function throughout this section, instead of taking more complicated examples as in the previous section (e.g., where one function evaluation could involve solving a PDE). Matlab will be used as the lingua franca in the examples, but further down you can find Python and C++ code. If you know at least one of these languages (or any similar language) then you should be able to understand what is going on, especially after comparing the three different implementations in §5.7. From a computational point of view it is important that we vectorize our function evaluations; see §5.3 for an explanation. The exposition is such that the better code comes at the end.

All numerical tests in this chapter were run on the same old laptop with a 1.8 GHz Intel Core i7 (2 cores) and 4 GB of memory under Mac OS Sierra, with Matlab R2016a, Python with NumPy, and clang.

5.1 A simple test function

We consider the following example function taken from Gantner & Schwab (2016):

    g(x) := \exp\left( c \sum_{j=1}^{s} x_j \, j^{-b} \right) = \prod_{j=1}^{s} \exp(c \, x_j \, j^{-b}).    (15)

For testing purposes it is nice to know the exact value of the integral. Since g is a product of one-dimensional functions and we can write down the solution of the one-dimensional integrals, we find

    I(g) := \int_{[0,1]^s} g(x) \, dx = \prod_{j=1}^{s} \frac{\exp(c \, j^{-b}) - 1}{c \, j^{-b}}, \qquad c \ne 0.

Let us define g in Matlab as a vectorized inline function taking multiple vectors at once as an s-by-n array and returning a 1-by-n array of results:

% x is an [s-by-n] matrix, n vectors of size s; vectorized g(x):
g = @(x, c, b) exp(c * (1:size(x,1)).^(-b) * x);
% note that vectors are considered to be columns and so the
% product above is an inner product, summing over the dimensions

In fact we will define g slightly differently, as we will pass in not just multiple vectors at once, but also different shifted versions of these vectors as an s-by-n-by-m array:

g = @(x, c, b) reshape( ...
    exp(c * (1:size(x,1)).^(-b) * x(:,:)), ...  % 'as above'
    1, size(x,2), size(x,3) ...                 % 1-by-n-by-[whatever is left]
);

Of course more complicated functions would better be defined in a separate file. E.g., we could define g in a separate file with the same function signature (we call this file gfun.m, and thus this function's name is gfun, to distinguish it from the inline definition above):

function y = gfun(x, c, b)
% function y = gfun(x, c, b)
%
% Vectorized evaluation of the example function
%   \[ g(x) := \exp( c \sum_{j=1}^s x_j j^{-b} ) \]
%
% Inputs:
%   x    array of s-dimensional vectors, [s-by-n] array
%        or [s-by-n-by-m] array (or even deeper nesting,
%        but the first dimension should be the dimension s)
%   c    scaling parameter, scalar
%   b    dimension decay parameter to diminish influence of
%        higher dimensions, scalar, b >= 0 (b = 0 is no decay)
% Outputs:
%   y    function value for each input vector, [1-by-n] array
%        or [1-by-n-by-m] array (or deeper, but same as x)
%
% Note: the array x (and also the resulting array y) can have more
% than two dimensions, e.g., x could be [s-by-n-by-m] and then the
% resulting y will be [1-by-n-by-m]. This is to accommodate
% multiple versions of a point set (e.g., for shifted point sets).
y = reshape( ...
    exp(c * (1:size(x,1)).^(-b) * x(:,:)), ...
    max((1:ndims(x) ~= 1) .* size(x), 1) ...  % set the first dimension to 1
);

This version is even more general as it allows x to have any shape as long as the leading dimension is s (the dimensionality of the points); it will return an array of the same shape but with the first dimension set to 1 (mapping vectors into function values). We are now ready to fix some parameters:

% parameters of the g-function
s = 100;  % number of dimensions
c = 1;    % c-parameter
b = 2;    % decay-parameter

Then we can calculate its exact integral value:

a = c * (1:s).^(-b);
exact = prod(expm1(a) ./ a);
% or as a function (repeating the 'a' twice):
gexact = @(s, c, b) prod(expm1(c*(1:s).^(-b)) ./ (c*(1:s).^(-b)));
% notice we use expm1(a) and not exp(a) - 1

5.2 The difficulty of our test function

Fig. 6 Interpretation of the combined parameters c and b for the function g in a single coordinate, showing the curves exp(x), exp(x/5), exp(-x/5), exp(-x), and exp(-5x) on [0,1]. The effect is multiplied in multiple dimensions.

The parameters c and b specify how difficult the function g will be. We can use Figure 6 as a guideline. It is clear we have a product of such one-dimensional exponential functions. Looking at the figure, we see that if the argument to the exponential function is a small number then the function is nearly linear and approaching a constant function. Note that constant functions are ridiculously easy to integrate. The extreme case would be c = 0; in that case both MC and QMC methods will give the exact value of the integral already with one function value. The larger the value of c (positive or negative), the more we deviate from a linear function, and we need more and more samples to determine its integral. For negative c with very large magnitude, the function is essentially 0 except for g(0) = 1, so it is a peaky function and rather hard to integrate. In the multivariate case, the parameter b, with b ≥ 0, modifies how quickly we converge to a constant function as the dimension index j increases. When the number of dimensions increases, the deviation from the constant function is multiplied.

5.3 Some technical details*

Floating point precision

Note the usage of the function expm1 in calculating the exact integral of g. This is useful for small arguments, when exp(x) ≈ 1, since then it is more accurate to compute the right-most expression instead of the middle expression of

    \exp(x) - 1 = \left( 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \right) - 1 = x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,

because of floating point arithmetic, where 1 + ε is rounded to 1 for ε smaller than the floating point precision.

Vectorization of (interpreted) code

It is often a good idea to vectorize code in interpreted languages (e.g., Matlab and Python). Vectorized code often runs much faster than for-loops. Once you get used to vectorized code, in terms of matrices and vectors, it is less error prone and easier to read. In Matlab we vectorize by using matrix and vector operations, as well as array operations acting on each element, marked by a dot in front of the operator: .*, ./, .^. You can look up vectorization in the Matlab documentation.

A very useful function is reshape, which does not cost any computational or memory effort. It just takes the same block of data but interprets the data as if the elements have a different shape. Matlab uses column-major format, which means that for x an s-by-n-by-m array, the first s consecutive numbers in the data block are the part x(1:s,1,1) = x(:,1,1). This is the same as in Fortran. C and C++, however, use row-major format, in which the last dimension iterates over consecutive elements. In Python, using NumPy, the default is row-major, but it can be chosen on an array-by-array basis. The way we have chosen to lay out our collections of s-dimensional vectors in Matlab is such that the vectors are stored consecutively in memory. In this way the data needed to evaluate one function value is localized in memory. If we had to implement such a vectorization in Python with row-major format, then we would formulate x as an m-by-n-by-s array instead. That is what we will choose when making a Python implementation.

Parallelization

We note that MC and QMC methods are embarrassingly parallel. This is a technical term meaning that all computations are independent and so can be distributed (preferably in blocks) straightforwardly over multiple cores. The accumulation of the results can be done with a simple reduce operation.

5.4 Usage of random number generators

It is good practice to have reproducible results (for debugging, when testing optimizations, or for checking the results in this text). We will use the Mersenne Twister as the random number generator for our MC simulations and we will set

its initial state to a fixed value such that we can repeat our experiment and get exactly the same random numbers. Similarly we will use the combined recursive generator from L'Ecuyer for the random shifting in case of the QMC approximations.

rng_mc = RandStream('mt19937ar', 'Seed', 1234);
rng_shifts = RandStream('mrg32k3a', 'Seed', 1234);

In Matlab you can now draw random numbers from the Mersenne Twister by doing x = rand(rng_mc, s, n) to obtain an s-by-n array.

5.5 Monte Carlo approximation

We are now ready to do a first approximation of the integral using the MC method. We will use 10^5 samples to get a MC approximation. (Of course in our test case we do know the exact value of the integral.)

tic;
N = 1e5;  % number of samples
G = g(rand(rng_mc, s, N), c, b);  % evaluate all at once, memory hungry but easy
MC_Q = mean(G);
MC_std = std(G) / sqrt(N);
t = toc;
fprintf('MC_Q = %g (error=%g, std=%g, N=%d) in %f sec\n', ...
    MC_Q, abs(MC_Q - gexact(s, c, b)), MC_std, N, t);

This gives us

MC_Q = (error= , std= , N=100000) in  sec

Without resetting the seed, re-running the above code 9 more times gives the following output:

MC_Q = (error= , std= , N=100000) in  sec
MC_Q = (error= , std= , N=100000) in  sec
MC_Q = (error= , std= , N=100000) in  sec
MC_Q = (error= , std= , N=100000) in  sec
MC_Q = (error= , std= , N=100000) in  sec
MC_Q = (error= , std= , N=100000) in  sec
MC_Q = (error= , std= , N=100000) in  sec
MC_Q = (error= , std= , N=100000) in  sec
MC_Q = (error= , std= , N=100000) in  sec

In Figure 7 we plot the results of these 10 approximations to obtain estimates of I(g) as well as σ²(g). In Figure 8 we see the standard error plotted in terms of the number of samples used. We can clearly see from the figure that the convergence is O(1/\sqrt{N}), as expected. For our test function we can actually calculate σ²(g), and so we also plotted σ(g)/\sqrt{N} as a dashed reference line. Using 10^5 random samples we observe the estimated standard error printed above. If we wanted to divide this error by 10, then we would need to take 100 times more samples.


More information

Progress in high-dimensional numerical integration and its application to stochastic optimization

Progress in high-dimensional numerical integration and its application to stochastic optimization Progress in high-dimensional numerical integration and its application to stochastic optimization W. Römisch Humboldt-University Berlin Department of Mathematics www.math.hu-berlin.de/~romisch Short course,

More information

The amount of work to construct each new guess from the previous one should be a small multiple of the number of nonzeros in A.

The amount of work to construct each new guess from the previous one should be a small multiple of the number of nonzeros in A. AMSC/CMSC 661 Scientific Computing II Spring 2005 Solution of Sparse Linear Systems Part 2: Iterative methods Dianne P. O Leary c 2005 Solving Sparse Linear Systems: Iterative methods The plan: Iterative

More information

Number Systems III MA1S1. Tristan McLoughlin. December 4, 2013

Number Systems III MA1S1. Tristan McLoughlin. December 4, 2013 Number Systems III MA1S1 Tristan McLoughlin December 4, 2013 http://en.wikipedia.org/wiki/binary numeral system http://accu.org/index.php/articles/1558 http://www.binaryconvert.com http://en.wikipedia.org/wiki/ascii

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

Topic 15 Notes Jeremy Orloff

Topic 15 Notes Jeremy Orloff Topic 5 Notes Jeremy Orloff 5 Transpose, Inverse, Determinant 5. Goals. Know the definition and be able to compute the inverse of any square matrix using row operations. 2. Know the properties of inverses.

More information

HOW TO LOOK AT MINKOWSKI S THEOREM

HOW TO LOOK AT MINKOWSKI S THEOREM HOW TO LOOK AT MINKOWSKI S THEOREM COSMIN POHOATA Abstract. In this paper we will discuss a few ways of thinking about Minkowski s lattice point theorem. The Minkowski s Lattice Point Theorem essentially

More information

Divide and Conquer. Maximum/minimum. Median finding. CS125 Lecture 4 Fall 2016

Divide and Conquer. Maximum/minimum. Median finding. CS125 Lecture 4 Fall 2016 CS125 Lecture 4 Fall 2016 Divide and Conquer We have seen one general paradigm for finding algorithms: the greedy approach. We now consider another general paradigm, known as divide and conquer. We have

More information

1. Introductory Examples

1. Introductory Examples 1. Introductory Examples We introduce the concept of the deterministic and stochastic simulation methods. Two problems are provided to explain the methods: the percolation problem, providing an example

More information

Review of Statistical Terminology

Review of Statistical Terminology Review of Statistical Terminology An experiment is a process whose outcome is not known with certainty. The experiment s sample space S is the set of all possible outcomes. A random variable is a function

More information

Math Circle: Recursion and Induction

Math Circle: Recursion and Induction Math Circle: Recursion and Induction Prof. Wickerhauser 1 Recursion What can we compute, using only simple formulas and rules that everyone can understand? 1. Let us use N to denote the set of counting

More information

MAS114: Exercises. October 26, 2018

MAS114: Exercises. October 26, 2018 MAS114: Exercises October 26, 2018 Note that the challenge problems are intended to be difficult! Doing any of them is an achievement. Please hand them in on a separate piece of paper if you attempt them.

More information

Uniform random numbers generators

Uniform random numbers generators Uniform random numbers generators Lecturer: Dmitri A. Moltchanov E-mail: moltchan@cs.tut.fi http://www.cs.tut.fi/kurssit/tlt-2707/ OUTLINE: The need for random numbers; Basic steps in generation; Uniformly

More information

7.1 Indefinite Integrals Calculus

7.1 Indefinite Integrals Calculus 7.1 Indefinite Integrals Calculus Learning Objectives A student will be able to: Find antiderivatives of functions. Represent antiderivatives. Interpret the constant of integration graphically. Solve differential

More information

1 Basic Combinatorics

1 Basic Combinatorics 1 Basic Combinatorics 1.1 Sets and sequences Sets. A set is an unordered collection of distinct objects. The objects are called elements of the set. We use braces to denote a set, for example, the set

More information

8 STOCHASTIC SIMULATION

8 STOCHASTIC SIMULATION 8 STOCHASTIC SIMULATIO 59 8 STOCHASTIC SIMULATIO Whereas in optimization we seek a set of parameters x to minimize a cost, or to maximize a reward function J( x), here we pose a related but different question.

More information

THE N-VALUE GAME OVER Z AND R

THE N-VALUE GAME OVER Z AND R THE N-VALUE GAME OVER Z AND R YIDA GAO, MATT REDMOND, ZACH STEWARD Abstract. The n-value game is an easily described mathematical diversion with deep underpinnings in dynamical systems analysis. We examine

More information

Fundamentals of Mathematics I

Fundamentals of Mathematics I Fundamentals of Mathematics I Kent State Department of Mathematical Sciences Fall 2008 Available at: http://www.math.kent.edu/ebooks/10031/book.pdf August 4, 2008 Contents 1 Arithmetic 2 1.1 Real Numbers......................................................

More information

Elementary Linear Algebra

Elementary Linear Algebra Matrices J MUSCAT Elementary Linear Algebra Matrices Definition Dr J Muscat 2002 A matrix is a rectangular array of numbers, arranged in rows and columns a a 2 a 3 a n a 2 a 22 a 23 a 2n A = a m a mn We

More information

Gregory's quadrature method

Gregory's quadrature method Gregory's quadrature method Gregory's method is among the very first quadrature formulas ever described in the literature, dating back to James Gregory (638-675). It seems to have been highly regarded

More information

Math 396. Quotient spaces

Math 396. Quotient spaces Math 396. Quotient spaces. Definition Let F be a field, V a vector space over F and W V a subspace of V. For v, v V, we say that v v mod W if and only if v v W. One can readily verify that with this definition

More information

GALOIS GROUPS OF CUBICS AND QUARTICS (NOT IN CHARACTERISTIC 2)

GALOIS GROUPS OF CUBICS AND QUARTICS (NOT IN CHARACTERISTIC 2) GALOIS GROUPS OF CUBICS AND QUARTICS (NOT IN CHARACTERISTIC 2) KEITH CONRAD We will describe a procedure for figuring out the Galois groups of separable irreducible polynomials in degrees 3 and 4 over

More information

Review of matrices. Let m, n IN. A rectangle of numbers written like A =

Review of matrices. Let m, n IN. A rectangle of numbers written like A = Review of matrices Let m, n IN. A rectangle of numbers written like a 11 a 12... a 1n a 21 a 22... a 2n A =...... a m1 a m2... a mn where each a ij IR is called a matrix with m rows and n columns or an

More information

Basic counting techniques. Periklis A. Papakonstantinou Rutgers Business School

Basic counting techniques. Periklis A. Papakonstantinou Rutgers Business School Basic counting techniques Periklis A. Papakonstantinou Rutgers Business School i LECTURE NOTES IN Elementary counting methods Periklis A. Papakonstantinou MSIS, Rutgers Business School ALL RIGHTS RESERVED

More information

Chapter 3: Polynomial and Rational Functions

Chapter 3: Polynomial and Rational Functions Chapter 3: Polynomial and Rational Functions 3.1 Polynomial Functions A polynomial on degree n is a function of the form P(x) = a n x n + a n 1 x n 1 + + a 1 x 1 + a 0, where n is a nonnegative integer

More information

Abstract. 2. We construct several transcendental numbers.

Abstract. 2. We construct several transcendental numbers. Abstract. We prove Liouville s Theorem for the order of approximation by rationals of real algebraic numbers. 2. We construct several transcendental numbers. 3. We define Poissonian Behaviour, and study

More information

1 Introduction & Objective

1 Introduction & Objective Signal Processing First Lab 13: Numerical Evaluation of Fourier Series Pre-Lab and Warm-Up: You should read at least the Pre-Lab and Warm-up sections of this lab assignment and go over all exercises in

More information

CHAPTER 1. REVIEW: NUMBERS

CHAPTER 1. REVIEW: NUMBERS CHAPTER. REVIEW: NUMBERS Yes, mathematics deals with numbers. But doing math is not number crunching! Rather, it is a very complicated psychological process of learning and inventing. Just like listing

More information

Katholieke Universiteit Leuven Department of Computer Science

Katholieke Universiteit Leuven Department of Computer Science Extensions of Fibonacci lattice rules Ronald Cools Dirk Nuyens Report TW 545, August 2009 Katholieke Universiteit Leuven Department of Computer Science Celestijnenlaan 200A B-3001 Heverlee (Belgium Extensions

More information

[Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty.]

[Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty.] Math 43 Review Notes [Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty Dot Product If v (v, v, v 3 and w (w, w, w 3, then the

More information

Pseudo-Random Numbers Generators. Anne GILLE-GENEST. March 1, Premia Introduction Definitions Good generators...

Pseudo-Random Numbers Generators. Anne GILLE-GENEST. March 1, Premia Introduction Definitions Good generators... 14 pages 1 Pseudo-Random Numbers Generators Anne GILLE-GENEST March 1, 2012 Contents Premia 14 1 Introduction 2 1.1 Definitions............................. 2 1.2 Good generators..........................

More information

Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011)

Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011) Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011) Before we consider Gelfond s, and then Schneider s, complete solutions to Hilbert s seventh problem let s look back

More information

Linear algebra for MATH2601: Theory

Linear algebra for MATH2601: Theory Linear algebra for MATH2601: Theory László Erdős August 12, 2000 Contents 1 Introduction 4 1.1 List of crucial problems............................... 5 1.2 Importance of linear algebra............................

More information

Mini-project in scientific computing

Mini-project in scientific computing Mini-project in scientific computing Eran Treister Computer Science Department, Ben-Gurion University of the Negev, Israel. March 7, 2018 1 / 30 Scientific computing Involves the solution of large computational

More information

Introduction to Algorithms

Introduction to Algorithms Lecture 1 Introduction to Algorithms 1.1 Overview The purpose of this lecture is to give a brief overview of the topic of Algorithms and the kind of thinking it involves: why we focus on the subjects that

More information

Next topics: Solving systems of linear equations

Next topics: Solving systems of linear equations Next topics: Solving systems of linear equations 1 Gaussian elimination (today) 2 Gaussian elimination with partial pivoting (Week 9) 3 The method of LU-decomposition (Week 10) 4 Iterative techniques:

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

Linear Algebra II. 2 Matrices. Notes 2 21st October Matrix algebra

Linear Algebra II. 2 Matrices. Notes 2 21st October Matrix algebra MTH6140 Linear Algebra II Notes 2 21st October 2010 2 Matrices You have certainly seen matrices before; indeed, we met some in the first chapter of the notes Here we revise matrix algebra, consider row

More information

THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS

THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS THE MINIMAL POLYNOMIAL AND SOME APPLICATIONS KEITH CONRAD. Introduction The easiest matrices to compute with are the diagonal ones. The sum and product of diagonal matrices can be computed componentwise

More information

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 1

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 1 CS 70 Discrete Mathematics and Probability Theory Fall 013 Vazirani Note 1 Induction Induction is a basic, powerful and widely used proof technique. It is one of the most common techniques for analyzing

More information

Quasi- Monte Carlo Multiple Integration

Quasi- Monte Carlo Multiple Integration Chapter 6 Quasi- Monte Carlo Multiple Integration Introduction In some sense, this chapter fits within Chapter 4 on variance reduction; in some sense it is stratification run wild. Quasi-Monte Carlo methods

More information

Optimal Randomized Algorithms for Integration on Function Spaces with underlying ANOVA decomposition

Optimal Randomized Algorithms for Integration on Function Spaces with underlying ANOVA decomposition Optimal Randomized on Function Spaces with underlying ANOVA decomposition Michael Gnewuch 1 University of Kaiserslautern, Germany October 16, 2013 Based on Joint Work with Jan Baldeaux (UTS Sydney) & Josef

More information

Basic elements of number theory

Basic elements of number theory Cryptography Basic elements of number theory Marius Zimand By default all the variables, such as a, b, k, etc., denote integer numbers. Divisibility a 0 divides b if b = a k for some integer k. Notation

More information

Basic elements of number theory

Basic elements of number theory Cryptography Basic elements of number theory Marius Zimand 1 Divisibility, prime numbers By default all the variables, such as a, b, k, etc., denote integer numbers. Divisibility a 0 divides b if b = a

More information

Linear Algebra. Min Yan

Linear Algebra. Min Yan Linear Algebra Min Yan January 2, 2018 2 Contents 1 Vector Space 7 1.1 Definition................................. 7 1.1.1 Axioms of Vector Space..................... 7 1.1.2 Consequence of Axiom......................

More information

Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices

Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices Vahid Dehdari and Clayton V. Deutsch Geostatistical modeling involves many variables and many locations.

More information

CS 450 Numerical Analysis. Chapter 8: Numerical Integration and Differentiation

CS 450 Numerical Analysis. Chapter 8: Numerical Integration and Differentiation Lecture slides based on the textbook Scientific Computing: An Introductory Survey by Michael T. Heath, copyright c 2018 by the Society for Industrial and Applied Mathematics. http://www.siam.org/books/cl80

More information

Lecture 12 : Recurrences DRAFT

Lecture 12 : Recurrences DRAFT CS/Math 240: Introduction to Discrete Mathematics 3/1/2011 Lecture 12 : Recurrences Instructor: Dieter van Melkebeek Scribe: Dalibor Zelený DRAFT Last few classes we talked about program correctness. We

More information

THE DYNAMICS OF SUCCESSIVE DIFFERENCES OVER Z AND R

THE DYNAMICS OF SUCCESSIVE DIFFERENCES OVER Z AND R THE DYNAMICS OF SUCCESSIVE DIFFERENCES OVER Z AND R YIDA GAO, MATT REDMOND, ZACH STEWARD Abstract. The n-value game is a dynamical system defined by a method of iterated differences. In this paper, we

More information

Where do pseudo-random generators come from?

Where do pseudo-random generators come from? Computer Science 2426F Fall, 2018 St. George Campus University of Toronto Notes #6 (for Lecture 9) Where do pseudo-random generators come from? Later we will define One-way Functions: functions that are

More information

Sparse Linear Systems. Iterative Methods for Sparse Linear Systems. Motivation for Studying Sparse Linear Systems. Partial Differential Equations

Sparse Linear Systems. Iterative Methods for Sparse Linear Systems. Motivation for Studying Sparse Linear Systems. Partial Differential Equations Sparse Linear Systems Iterative Methods for Sparse Linear Systems Matrix Computations and Applications, Lecture C11 Fredrik Bengzon, Robert Söderlund We consider the problem of solving the linear system

More information

arxiv: v1 [math.na] 5 May 2011

arxiv: v1 [math.na] 5 May 2011 ITERATIVE METHODS FOR COMPUTING EIGENVALUES AND EIGENVECTORS MAYSUM PANJU arxiv:1105.1185v1 [math.na] 5 May 2011 Abstract. We examine some numerical iterative methods for computing the eigenvalues and

More information

arxiv: v1 [physics.comp-ph] 22 Jul 2010

arxiv: v1 [physics.comp-ph] 22 Jul 2010 Gaussian integration with rescaling of abscissas and weights arxiv:007.38v [physics.comp-ph] 22 Jul 200 A. Odrzywolek M. Smoluchowski Institute of Physics, Jagiellonian University, Cracov, Poland Abstract

More information

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved Fundamentals of Linear Algebra Marcel B. Finan Arkansas Tech University c All Rights Reserved 2 PREFACE Linear algebra has evolved as a branch of mathematics with wide range of applications to the natural

More information

Review Questions REVIEW QUESTIONS 71

Review Questions REVIEW QUESTIONS 71 REVIEW QUESTIONS 71 MATLAB, is [42]. For a comprehensive treatment of error analysis and perturbation theory for linear systems and many other problems in linear algebra, see [126, 241]. An overview of

More information

Lecture Notes in Mathematics. Arkansas Tech University Department of Mathematics. The Basics of Linear Algebra

Lecture Notes in Mathematics. Arkansas Tech University Department of Mathematics. The Basics of Linear Algebra Lecture Notes in Mathematics Arkansas Tech University Department of Mathematics The Basics of Linear Algebra Marcel B. Finan c All Rights Reserved Last Updated November 30, 2015 2 Preface Linear algebra

More information

Some Notes on Linear Algebra

Some Notes on Linear Algebra Some Notes on Linear Algebra prepared for a first course in differential equations Thomas L Scofield Department of Mathematics and Statistics Calvin College 1998 1 The purpose of these notes is to present

More information

Linear Algebra: Lecture Notes. Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway

Linear Algebra: Lecture Notes. Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway Linear Algebra: Lecture Notes Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway November 6, 23 Contents Systems of Linear Equations 2 Introduction 2 2 Elementary Row

More information

, p 1 < p 2 < < p l primes.

, p 1 < p 2 < < p l primes. Solutions Math 347 Homework 1 9/6/17 Exercise 1. When we take a composite number n and factor it into primes, that means we write it as a product of prime numbers, usually in increasing order, using exponents

More information

LU Factorization. Marco Chiarandini. DM559 Linear and Integer Programming. Department of Mathematics & Computer Science University of Southern Denmark

LU Factorization. Marco Chiarandini. DM559 Linear and Integer Programming. Department of Mathematics & Computer Science University of Southern Denmark DM559 Linear and Integer Programming LU Factorization Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark [Based on slides by Lieven Vandenberghe, UCLA] Outline

More information

CS 542G: Conditioning, BLAS, LU Factorization

CS 542G: Conditioning, BLAS, LU Factorization CS 542G: Conditioning, BLAS, LU Factorization Robert Bridson September 22, 2008 1 Why some RBF Kernel Functions Fail We derived some sensible RBF kernel functions, like φ(r) = r 2 log r, from basic principles

More information

SUBGROUPS OF CYCLIC GROUPS. 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by

SUBGROUPS OF CYCLIC GROUPS. 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by SUBGROUPS OF CYCLIC GROUPS KEITH CONRAD 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by g = {g k : k Z}. If G = g, then G itself is cyclic, with g as a generator. Examples

More information

Factoring. there exists some 1 i < j l such that x i x j (mod p). (1) p gcd(x i x j, n).

Factoring. there exists some 1 i < j l such that x i x j (mod p). (1) p gcd(x i x j, n). 18.310 lecture notes April 22, 2015 Factoring Lecturer: Michel Goemans We ve seen that it s possible to efficiently check whether an integer n is prime or not. What about factoring a number? If this could

More information

Mon Jan Improved acceleration models: linear and quadratic drag forces. Announcements: Warm-up Exercise:

Mon Jan Improved acceleration models: linear and quadratic drag forces. Announcements: Warm-up Exercise: Math 2250-004 Week 4 notes We will not necessarily finish the material from a given day's notes on that day. We may also add or subtract some material as the week progresses, but these notes represent

More information

Part II. Number Theory. Year

Part II. Number Theory. Year Part II Year 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2017 Paper 3, Section I 1G 70 Explain what is meant by an Euler pseudoprime and a strong pseudoprime. Show that 65 is an Euler

More information

Lecture 7: More Arithmetic and Fun With Primes

Lecture 7: More Arithmetic and Fun With Primes IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Advanced Course on Computational Complexity Lecture 7: More Arithmetic and Fun With Primes David Mix Barrington and Alexis Maciel July

More information

This appendix provides a very basic introduction to linear algebra concepts.

This appendix provides a very basic introduction to linear algebra concepts. APPENDIX Basic Linear Algebra Concepts This appendix provides a very basic introduction to linear algebra concepts. Some of these concepts are intentionally presented here in a somewhat simplified (not

More information

1 GSW Sets of Systems

1 GSW Sets of Systems 1 Often, we have to solve a whole series of sets of simultaneous equations of the form y Ax, all of which have the same matrix A, but each of which has a different known vector y, and a different unknown

More information

Linear Algebra. Preliminary Lecture Notes

Linear Algebra. Preliminary Lecture Notes Linear Algebra Preliminary Lecture Notes Adolfo J. Rumbos c Draft date May 9, 29 2 Contents 1 Motivation for the course 5 2 Euclidean n dimensional Space 7 2.1 Definition of n Dimensional Euclidean Space...........

More information