Monte Carlo Methods for Statistical Inference: Variance Reduction Techniques
Hung Chen
Department of Mathematics
National Taiwan University
3 March 2004
Meet at NS 104 on Wednesday from 9:10 to 12:00.
Outline

Numerical Integration
1. Introduction
2. Quadrature Integration
3. Composite Rules
4. Richardson's Improvement Formula
5. Improper Integrals

Monte Carlo Methods
1. Introduction
2. Variance Reduction Techniques
3. Importance Sampling

References:
- Lange, K. (1999). Numerical Analysis for Statisticians. Springer-Verlag, New York.
- Robert, C.P. and Casella, G. (1999). Monte Carlo Statistical Methods. Springer-Verlag, New York.
- Thisted, R.A. (1996). Elements of Statistical Computing: Numerical Computation. Chapman & Hall.
- Venables, W.N. and Smith, D.M. An Introduction to R.
Monte Carlo Integration

Integration is fundamental to statistical inference. Evaluation of probabilities, means, variances, and mean squared errors can all be thought of as integrals. Very often it is not feasible to evaluate the integral of a given function analytically, and alternative methods are adopted. Approximate answers are presented in this lecture.

Suppose we wish to evaluate I = ∫ f(x) dx.

Riemann integral: the definition starts as follows.
- Divide [a, b] into n disjoint intervals Δx_i such that ∪_i Δx_i = [a, b]; the collection {Δx_i} is called a partition P of [a, b].
- The mesh of this partition is defined to be the largest sub-interval, mesh(P) = max_i |Δx_i|.
- Define a finite sum S_n = Σ_{i=1}^n f(x_i) Δx_i, where x_i ∈ Δx_i is any point.
- If the limit lim_{mesh(P)→0} S_n exists, it is called the integral of f on [a, b] and is denoted by ∫_a^b f(x) dx.

This construction shows that any numerical approximation of ∫_a^b f(x) dx has two features:
(i) selection of sample points which partition the interval;
(ii) a finite number of function evaluations at these sample points.

Now consider the problem of evaluating θ = ∫ g(x) f(x) dx, where f(x) is a density function. Using the law of large numbers, we can evaluate θ easily. Sample X_1, ..., X_n independently from f and form

  θ̂ = (1/n) Σ_{i=1}^n g(X_i),    Var(θ̂) = (1/n) ∫ [g(x) − θ]² f(x) dx.
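The law-of-large-numbers estimator above can be sketched as follows. This is an illustrative Python example, not from the notes; the target E[g(X)] with g(x) = x² and X standard normal is chosen so the exact answer, Var(X) = 1, is known.

```python
import math
import random

# Estimate theta = E[g(X)] = integral of g(x) f(x) dx by a sample mean,
# with f the standard normal density and g(x) = x^2, so theta = 1 exactly.
random.seed(0)
n = 100_000
samples = [random.gauss(0.0, 1.0) for _ in range(n)]
theta_hat = sum(x * x for x in samples) / n

# Estimated standard error: sqrt(Var(g(X)) / n), the O(1/sqrt(n)) rate.
var_g = sum((x * x - theta_hat) ** 2 for x in samples) / n
std_err = math.sqrt(var_g / n)
```

Halving the error requires four times as many samples, which is the O(1/√n) Monte Carlo rate, but note that the rate does not depend on the dimension of the integral.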
The precision of θ̂ is proportional to 1/√n; in one-dimensional numerical integration, n points can achieve precision of order O(1/n⁴).

Question: we can use the Riemann integral to evaluate definite integrals, so why do we need Monte Carlo integration?
- As the number of dimensions d increases, the number of points required to achieve a fair estimate of the integral increases dramatically, i.e., proportionally to n^d.
- Even when d is small, if the function to be integrated is irregular, the regular methods of integration are inefficient.

Curse of dimensionality:
- Numerical methods are less efficient than simulation algorithms for dimension d larger than 4, since their error is then of order O(n^{−4/d}).
- The intuitive reason behind this phenomenon is that a numerical approach like the Riemann sum basically covers the whole space with a grid. When the dimension of the space increases, the number of grid points necessary to obtain a given precision increases as well.
Numerical Integration

Also called "quadrature," this concerns the evaluation of the integral

  I = ∫_a^b f(x) dx.

It is equivalent to solving for the value I = y(b) of the differential equation dy/dx = f(x) with the boundary condition y(a) = 0. When f is a simple function, I can be evaluated easily. The underlying idea is to approximate f by a simple function which can be integrated easily on [a, b] and which agrees with f at the sampled points. The technique of finding a smooth curve passing through a set of points is also called curve fitting.
- One implementation of this idea is to sample N + 1 points and find an order-N polynomial passing through those points.
- The integral of f over that region (containing the N + 1 points) can be approximated by the integral of the polynomial over the same region.
- Given N + 1 sample points there is a unique polynomial of degree at most N passing through them, though there are several methods to obtain it. We will use Lagrange's method to find this polynomial, which we will call Lagrange's interpolating polynomial.

If we fit a polynomial to the sample points over the whole interval [a, b], we may end up with a high-order polynomial whose coefficients are difficult to determine. Instead, focus on a smaller region of [a, b], say [x_k, x_{k+n}], containing the points x_k, x_{k+1}, ..., x_{k+n}. Let P_{k+i} = (x_{k+i}, f(x_{k+i})) be the pairs of sampled points and function values; they are called knots. Let p_{k,k+n}(x) denote the polynomial of degree less than or equal to n that interpolates P_k, P_{k+1}, ..., P_{k+n}. Now the question becomes: given P_k, ..., P_{k+n}, find the polynomial p_{k,k+n}(x) such
that p_{k,k+n}(x_{k+i}) = f(x_{k+i}) for 0 ≤ i ≤ n.

To understand the construction of p_{k,k+n}(x), we look at the case n = 0 first. It gives the so-called extended midpoint rule for finding I:
1. Pick N large.
2. Let x_i = a + (i − 1/2)h for i = 1, ..., N, where h = (b − a)/N.
3. Let f_i = f(x_i).
4. Then I ≈ h Σ_i f_i.

Sample code:

  emr <- function(f, a, b, n = 1000) {
    h <- (b - a) / n
    h * sum(f(seq(a + h/2, b, by = h)))
  }

This is the simplest thing to do. Extending these ideas to an arbitrary value of n, the polynomial takes the form

  p_{k,k+n}(x) = Σ_{i=0}^n f(x_{k+i}) L_{k+i}(x),

where

  L_{k+i}(x) = Π_{j=0, j≠i}^n (x − x_{k+j}) / (x_{k+i} − x_{k+j}).

Note that L_{k+i}(x) equals 1 at x = x_{k+i}, equals 0 at x = x_{k+j} for j ≠ i, and takes intermediate values elsewhere. Therefore p_{k,k+n}(x) satisfies the requirement that it pass through the n + 1 knots, and it is an order-n polynomial. This leads to

  f(x) ≈ p_{k,k+n}(x) = f(x_k)L_k(x) + f(x_{k+1})L_{k+1}(x) + ... + f(x_{k+n})L_{k+n}(x).

Approximate ∫_{x_k}^{x_{k+n}} f(x) dx by the quantity ∫_{x_k}^{x_{k+n}} p_{k,k+n}(x) dx:

  ∫_{x_k}^{x_{k+n}} f(x) dx ≈ w_k f(x_k) + w_{k+1} f(x_{k+1}) + ... + w_{k+n} f(x_{k+n}),
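The Lagrange form above can be transcribed directly. This is an illustrative Python sketch, not from the notes (the lecture's own sample code is R); the function names are mine.

```python
def lagrange_basis(xs, i, x):
    # L_i(x) = product over j != i of (x - x_j) / (x_i - x_j)
    prod = 1.0
    for j, xj in enumerate(xs):
        if j != i:
            prod *= (x - xj) / (xs[i] - xj)
    return prod

def interpolate(xs, ys, x):
    # p(x) = sum_i f(x_i) L_i(x): degree <= n, passes through every knot
    return sum(yi * lagrange_basis(xs, i, x) for i, yi in enumerate(ys))

# Knots sampled from f(x) = x^2 + x + 1; a quadratic reproduces f exactly.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 7.0]
p_half = interpolate(xs, ys, 0.5)   # f(0.5) = 1.75
```

Evaluating at a knot returns the knot value itself, which is the interpolation requirement p(x_{k+i}) = f(x_{k+i}).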
where

  w_{k+i} = ∫_{x_k}^{x_{k+n}} L_{k+i}(x) dx.

Calculating these weights lets us derive a few well-known numerical integration techniques. Assume that the sample points x_k, x_{k+1}, ... are uniformly spaced with spacing h > 0. Any point x ∈ [x_k, x_{k+n}] can now be represented by x = x_k + sh, where s takes the values 0, 1, ..., n at the sample points and other values in between. The basis polynomial becomes

  L_{k+i}(x_k + sh) = Π_{j=0, j≠i}^n (s − j) / (i − j),

or

  w_{k+i} = h ∫_0^n L_{k+i}(x_k + hs) ds.

- For n = 1, the weights are given by w_k = h ∫_0^1 (1 − s) ds = h/2 and w_{k+1} = h ∫_0^1 s ds = h/2. The integral value is given by

    ∫_{x_k}^{x_{k+1}} f(x) dx ≈ (h/2) [f(x_k) + f(x_{k+1})].

  This approximation is called the trapezoidal rule, because the integral equals the area of the trapezoid formed by the two knots.

- For n = 2, the weights are given by

    w_k = (h/2) ∫_0^2 (s² − 3s + 2) ds = h/3,
    w_{k+1} = −h ∫_0^2 (s² − 2s) ds = 4h/3,
    w_{k+2} = (h/2) ∫_0^2 (s² − s) ds = h/3.

  The integral value is given by

    ∫_{x_k}^{x_{k+2}} f(x) dx ≈ (h/3) [f(x_k) + 4f(x_{k+1}) + f(x_{k+2})].

  This rule is called Simpson's 1/3 rule.

- For n = 3, we obtain Simpson's 3/8 rule:

    ∫_{x_k}^{x_{k+3}} f(x) dx ≈ (3h/8) {f(x_k) + 3[f(x_{k+1}) + f(x_{k+2})] + f(x_{k+3})}.
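Both Simpson rules can be checked on a cubic, for which they are exact. An illustrative Python sketch, not from the notes; the helper names are mine.

```python
def simpson_13(f, x0, h):
    # Simpson's 1/3 rule over [x0, x0 + 2h] (three equally spaced knots).
    return (h / 3.0) * (f(x0) + 4.0 * f(x0 + h) + f(x0 + 2.0 * h))

def simpson_38(f, x0, h):
    # Simpson's 3/8 rule over [x0, x0 + 3h] (four equally spaced knots).
    return (3.0 * h / 8.0) * (f(x0) + 3.0 * (f(x0 + h) + f(x0 + 2.0 * h))
                              + f(x0 + 3.0 * h))

cubic = lambda x: x ** 3                  # integral over [0, 2] is exactly 4
i13 = simpson_13(cubic, 0.0, 1.0)         # h = 1 covers [0, 2]
i38 = simpson_38(cubic, 0.0, 2.0 / 3.0)   # h = 2/3 covers [0, 2]
```

Both values come out equal to 4 up to rounding, reflecting that the error terms of these rules involve the fourth derivative, which vanishes for a cubic.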
Composite Rules

Considering the whole interval, divide it into N sub-intervals of equal width, and slide a window across [a, b] that includes only a small number n of these sub-intervals at a time. That is,

  ∫_a^b f(x) dx = Σ_{j=0}^{N/n − 1} ∫_{x_{jn}}^{x_{jn+n}} f(x) dx,

where each of the integrals on the right side can be approximated using the basic rules derived earlier. Summing a basic rule over the sub-intervals to obtain an approximation over [a, b] gives rise to the composite rules.

Composite trapezoidal rule:

  I ≈ h [ (f(a) + f(b))/2 + Σ_{i=1}^{N−1} f(x_i) ].

The error in this approximation is −(1/12) h² f″(ξ)(b − a) for some ξ ∈ (a, b).

Composite Simpson's 1/3 rule: the number of samples N + 1 should be odd, i.e., the number of sub-intervals should be even. The integral approximation is given by

  I ≈ (h/3) [ f(a) + f(b) + 4 Σ_{i odd} f(x_i) + 2 Σ_{i even} f(x_i) ].

The error associated with this approximation is −(1/180) h⁴ f⁽⁴⁾(ξ)(b − a) for some ξ ∈ (a, b).
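The two composite rules can be sketched and compared on a smooth integrand. An illustrative Python example, not from the notes.

```python
import math

def comp_trapezoid(f, a, b, n):
    # Composite trapezoidal rule with n sub-intervals; error O(h^2).
    h = (b - a) / n
    return h * ((f(a) + f(b)) / 2.0 + sum(f(a + i * h) for i in range(1, n)))

def comp_simpson(f, a, b, n):
    # Composite Simpson's 1/3 rule; n must be even; error O(h^4).
    if n % 2:
        raise ValueError("number of sub-intervals must be even")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return (h / 3.0) * (f(a) + f(b) + 4.0 * odd + 2.0 * even)

exact = 1.0 - math.cos(1.0)      # integral of sin over [0, 1]
err_trap = abs(comp_trapezoid(math.sin, 0.0, 1.0, 64) - exact)
err_simp = abs(comp_simpson(math.sin, 0.0, 1.0, 64) - exact)
```

At h = 1/64 the Simpson error is several orders of magnitude below the trapezoidal one, matching the h² versus h⁴ error terms above.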
Richardson's Improvement Formula

Suppose that we use F[h] as an approximation of I computed using spacing h. Then

  I = F[h] + C hⁿ + O(h^m),

where C is a constant and m > n. An improvement of F[h] is possible when there is a separation between n and m: eliminate the error term C hⁿ by evaluating I for two different values of h and mixing the results appropriately. Assume that I is evaluated for two spacings h_1 and h_2, with h_2 > h_1 and h_2/h_1 = r, where r > 1.

For the sample spacing h_2 = r h_1, we have

  I = F[r h_1] + C rⁿ h_1ⁿ + O(h_1^m).

Multiplying the first equation by rⁿ and subtracting,

  rⁿ I − I = rⁿ F[h_1] − F[r h_1] + O(h_1^m),

and rearranging,

  I = ( rⁿ F[h_1] − F[r h_1] ) / (rⁿ − 1) + O(h_1^m).

The first term on the right can now be used as an approximation of I, with error term O(h^m). This removal of C hⁿ from the error using evaluations of F at two different spacings is called Richardson's improvement formula. This result, applied to numerical integration, is called Romberg integration.
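For the trapezoidal rule, n = 2 and m = 4, so with r = 2 the combination (4 F[h] − F[2h]) / 3 cancels the h² term. An illustrative Python sketch, not from the notes.

```python
import math

def trapezoid(f, a, b, n):
    # Composite trapezoidal rule: F[h] = I + C h^2 + O(h^4) for smooth f.
    h = (b - a) / n
    return h * ((f(a) + f(b)) / 2.0 + sum(f(a + i * h) for i in range(1, n)))

f, a, b = math.sin, 0.0, 1.0
exact = 1.0 - math.cos(1.0)

F_h = trapezoid(f, a, b, 64)     # spacing h
F_rh = trapezoid(f, a, b, 32)    # spacing rh with r = 2
# Richardson: I = (r^n F[h] - F[rh]) / (r^n - 1) + O(h^m) = (4 F_h - F_rh) / 3
improved = (4.0 * F_h - F_rh) / 3.0
```

Iterating this cancellation over a ladder of spacings is exactly Romberg integration; one Richardson step applied to trapezoidal estimates reproduces composite Simpson values.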
Improper Integrals

How do we revise the above methods to handle the following cases?
- The integrand goes to a finite limit at finite upper and lower limits, but cannot be evaluated right at one of the limits (e.g., sin x / x at x = 0).
- The upper limit of integration is ∞ or the lower limit is −∞.
- There is a singularity at a limit (e.g., x^{−1/2} at x = 0).

Commonly used techniques:
1. Change of variables. For example, if a > 0 and f(t) → 0 faster than 1/t² as t → ∞, then we can use u = 1/t as follows:

     ∫_a^∞ f(t) dt = ∫_0^{1/a} (1/u²) f(1/u) du.

   This also works if b < 0 and the lower limit is −∞. Refer to Lange for additional advice and examples.
2. Break up the integral into pieces.
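Technique 1 can be sketched on ∫_1^∞ dt/(1 + t²) = π/4, whose substituted integrand has a finite limit at u = 0. An illustrative Python example, not from the notes; the quadrature helper is mine.

```python
import math

def comp_simpson(f, a, b, n=200):
    # Composite Simpson's 1/3 rule (n even).
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return (h / 3.0) * (f(a) + f(b) + 4.0 * odd + 2.0 * even)

f = lambda t: 1.0 / (1.0 + t * t)        # decays like 1/t^2, as required
# u = 1/t maps [1, inf) onto (0, 1]; the new integrand is u^-2 f(1/u),
# which simplifies to 1/(1 + u^2) and extends continuously to u = 0.
g = lambda u: f(1.0 / u) / (u * u) if u > 0.0 else 1.0
approx = comp_simpson(g, 0.0, 1.0)       # exact value is pi/4
```

The infinite range becomes a finite one on which the standard composite rules apply without modification.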
Monte Carlo Method

The main goal of this technique is to estimate the quantity

  θ = ∫_R g(x) f(x) dx = E[g(X)],

for a random variable X distributed according to the density function f(x). Here g(x) is any function on R such that θ and E[g²(X)] are bounded.

Classical Monte Carlo approach: suppose that we have tools to simulate independent and identically distributed samples from f(x), call them X_1, X_2, ..., X_n. Then one can approximate θ by the quantity

  θ̂_n = (1/n) Σ_{i=1}^n g(X_i).

The variance of θ̂_n is given by n⁻¹ Var(g(X)). For this approach, the samples from f(x) are generated in an i.i.d. fashion. In order to get a good estimate, we need the variance to go to zero, and hence the number of samples to go to infinity. In practical situations, the sample size never goes to infinity. This raises an interesting question: can a better estimate be obtained under a given computing constraint? We now consider three widely used techniques to accomplish this task.
Using Antithetic Variables

This method depends on generating averages from samples which have negative covariance between them, causing the overall variance to go down. Let Y_1 and Y_2 be two identically distributed random variables with mean θ. Then

  Var( (Y_1 + Y_2)/2 ) = Var(Y_1)/2 + Cov(Y_1, Y_2)/2.

- If Y_1 and Y_2 are independent, the last term is zero.
- If Y_1 and Y_2 are positively correlated, then Var((Y_1 + Y_2)/2) > Var(Y_1)/2.
- If they are negatively correlated, the resulting variance is reduced.

Question: how do we obtain random variables Y_1 and Y_2 with identical distributions but negative correlation?

Illustration:
- Let X_1, X_2, ..., X_n be independent random variables with distribution functions F_1, F_2, ..., F_n.
- Let g be a monotone function.
- Using the inverse transform method, the X_i's can be generated according to X_i = F_i⁻¹(U_i), for U_i ~ UNIF[0, 1].
- Define

    Y_1 = g(F_1⁻¹(U_1), F_2⁻¹(U_2), ..., F_n⁻¹(U_n)).

  Since U and 1 − U are identically distributed and negatively correlated random variables, define

    Y_2 = g(F_1⁻¹(1 − U_1), F_2⁻¹(1 − U_2), ..., F_n⁻¹(1 − U_n)).

- For a monotone function g, Y_1 and Y_2 are negatively correlated.
- Utilizing negatively correlated pairs not only reduces the variance of the sample average but also reduces the computation time, as only half the samples need to be generated from UNIF[0, 1].

Estimate θ by

  θ̃ = (1/2n) Σ_{i=1}^n [g(U_i) + g(1 − U_i)].

Other possible implementations:
- If f is symmetric around μ, take Y_i = 2μ − X_i.
- See Geweke (1988) for an implementation of this idea.
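The antithetic estimator θ̃ can be sketched for a monotone g. An illustrative Python example, not from the notes; g(u) = e^u is chosen so θ = e − 1 is known.

```python
import math
import random

random.seed(1)
n = 20_000
us = [random.random() for _ in range(n)]

# Classical MC averages g(U); the antithetic version pairs g(U) with g(1-U).
plain = [math.exp(u) for u in us]
anti = [(math.exp(u) + math.exp(1.0 - u)) / 2.0 for u in us]

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

theta_tilde = sum(anti) / n   # estimates e - 1
```

Because e^u is increasing, Cov(g(U), g(1 − U)) < 0, and the per-observation variance drops from roughly 0.24 to roughly 0.004 here.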
Variance Reduction by Conditioning

Let Y and Z be two random variables. In general, we have

  Var[E(Y | Z)] = Var(Y) − E[Var(Y | Z)] ≤ Var(Y).

The two random variables Y and E(Y | Z) have the same mean. Therefore E(Y | Z) is a better random variable to simulate and average in order to estimate that mean. How do we find an appropriate Z such that E(Y | Z) has significantly lower variance than Y?

Example: estimate π.
- We can do it by drawing U_i ~ UNIF[0, 1], setting V_i = 2U_i − 1 for i = 1, 2, and defining

    I = 1 if V_1² + V_2² ≤ 1, and I = 0 otherwise,

  so that E(I) = π/4.
- Improve the estimate E(I) by using E(I | V_1). Note that

    E[I | V_1 = v] = P(V_1² + V_2² ≤ 1 | V_1 = v) = P(V_2² ≤ 1 − v²) = (1 − v²)^{1/2}.

- The conditional variance equals

    Var[(1 − V_1²)^{1/2}] ≈ 0.0498,

  which is smaller than Var(I) ≈ 0.1686.
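The π example can be sketched directly. An illustrative Python example, not from the notes.

```python
import math
import random

random.seed(2)
n = 50_000
raw, cond = [], []
for _ in range(n):
    v1 = 2.0 * random.random() - 1.0
    v2 = 2.0 * random.random() - 1.0
    raw.append(1.0 if v1 * v1 + v2 * v2 <= 1.0 else 0.0)   # I, E(I) = pi/4
    cond.append(math.sqrt(1.0 - v1 * v1))                  # E(I | V1)

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

pi_raw = 4.0 * sum(raw) / n
pi_cond = 4.0 * sum(cond) / n   # same mean, under a third of the variance
```

The conditioned version also needs only one uniform per observation instead of two.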
Variance Reduction Using Control Variates

Estimate θ, the expected value of a function g of random variables X = (X_1, X_2, ..., X_n). Assume that we know the expected value of another function h of these random variables; call it μ. For any constant a, define a random variable W_a according to

  W_a = g(X) + a [h(X) − μ].

We can utilize the sample averages of W_a to estimate θ, since E(W_a) = θ. Observe that

  Var(W_a) = Var(g(X)) + a² Var(h(X)) + 2a Cov(g(X), h(X)).

It follows easily that the minimizer of Var(W_a) as a function of a is

  a* = − Cov(g(X), h(X)) / Var(h(X)).

- Estimate θ by averaging observations of

    g(X) − [Cov(g(X), h(X)) / Var(h(X))] [h(X) − μ].

- The resulting variance of W is given by

    Var(W) = Var(g(X)) − [Cov(g(X), h(X))]² / Var(h(X)).

Example: use the sample mean to reduce the variance of an estimate of the sample median.
- Find the median of a Poisson random variable X with λ = 11.5 using a set of 100 observations.
- Note that μ = E(X̄) = 11.5.
- Modify the usual estimate to

    x̃ − [cov(x̃, x̄) / var(x̄)] (x̄ − 11.5),

  where x̃ and x̄ are the median and mean of the sampled data, and the covariance and variance are estimated from the data.
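The optimal-coefficient recipe above can be sketched on a simpler target than the Poisson median. An illustrative Python example, not from the notes; g(u) = e^u with control h(u) = u, whose mean μ = 1/2 is known exactly, and a* is estimated from the same sample.

```python
import math
import random

random.seed(3)
n = 20_000
us = [random.random() for _ in range(n)]
g_vals = [math.exp(u) for u in us]   # target: theta = E[e^U] = e - 1
h_vals = us                          # control: E[U] = 0.5 known exactly

# Estimate a* = -Cov(g, h) / Var(h) from the sample.
mg = sum(g_vals) / n
mh = sum(h_vals) / n
cov_gh = sum((g - mg) * (h - mh) for g, h in zip(g_vals, h_vals)) / n
var_h = sum((h - mh) ** 2 for h in h_vals) / n
a_star = -cov_gh / var_h

w_vals = [g + a_star * (h - 0.5) for g, h in zip(g_vals, h_vals)]
theta_cv = sum(w_vals) / n

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)
```

Since e^u and u are nearly linearly related on [0, 1], the correlation is high and Var(W) is a small fraction of Var(g(U)).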
Example on the Control Variate Method

In general, suppose there are p control variates W_1, ..., W_p, and Z generally varies with each W_i, i.e.,

  θ̂ = Z − Σ_{i=1}^p a_i [W_i − E(W_i)].

This amounts to a multiple regression of Z on W_1, ..., W_p. How do we find estimates of the regression coefficients between Z and the W's?
Importance Sampling

Another technique commonly used for reducing variance in Monte Carlo methods is importance sampling. Importance sampling differs from the classical Monte Carlo method in that, instead of sampling from f(x), one samples from another density h(x) and computes the estimate of θ using averages of g(x) f(x) / h(x), rather than g(x), evaluated at those samples. Rearrange the definition of θ as follows:

  θ = ∫ g(x) f(x) dx = ∫ [g(x) f(x) / h(x)] h(x) dx.

Here h(x) can be any density function, as long as the support of h(x) contains the support of f(x).

Generate samples X_1, ..., X_n from the density h(x) and compute the estimate

  θ̂_n = (1/n) Σ_{i=1}^n g(X_i) f(X_i) / h(X_i).

It can be seen that the mean of θ̂_n is θ, and its variance is

  Var(θ̂_n) = (1/n) { E_h[ (g(X) f(X) / h(X))² ] − θ² }.

Recall that the variance associated with the classical Monte Carlo estimator differs in the first term: in that case, the first term is E_f[g(X)²]. It is possible that a suitable choice of h can reduce the estimator variance below that of the classical Monte Carlo estimator.

By Jensen's inequality, we have a lower bound on the first term:

  E_h[ g²(X) f²(X) / h²(X) ] ≥ ( E_h[ g(X) f(X) / h(X) ] )² = ( ∫ g(x) f(x) dx )².

In practice, for importance sampling, we generally seek a probability density h that is nearly proportional to |g(x)| f(x).

Example taken from Robert and Casella:
- Let X be a Cauchy random variable with parameters (0, 1), i.e., X is distributed according to the
  density function

    f(x) = 1 / [π(1 + x²)],

  and let g(x) = 1(x > 2) be an indicator function.
- Estimate θ = P(X > 2) = 1/2 − (1/π) arctan 2 ≈ 0.1476.
- Method 1: generate X_1, X_2, ..., X_n as a random sample from f(x). Then θ̂_n is just the relative frequency of sampled values larger than 2:

    θ̂_n = (1/n) Σ_{i=1}^n 1(X_i > 2).

  The variance of this estimator is simply θ(1 − θ)/n ≈ 0.126/n.
- Method 2: utilize the fact that the density f(x) is symmetric around 0, so θ is just half of the probability P(|X| > 2). Generating the X_i's as i.i.d. Cauchy, one can estimate θ by

    θ̂_n = (1/2n) Σ_{i=1}^n 1(|X_i| > 2).

  The variance of this estimator is θ(1 − 2θ)/(2n) ≈ 0.052/n.
- Method 3: write θ as the integral

    θ = 1/2 − ∫_0^2 1 / [π(1 + x²)] dx.

  Generate X_1, X_2, ..., X_n as a random sample from UNIF(0, 2) and define

    θ̂_n = 1/2 − (1/n) Σ_{i=1}^n 2 / [π(1 + X_i²)].

  Its variance is given by 0.0092/n.
- Alternatively, let y = 1/x and write θ as the integral

    θ = ∫_0^{1/2} x⁻² / [π(1 + x⁻²)] dx = ∫_0^{1/2} 1 / [π(1 + x²)] dx.

  Using i.i.d. samples from UNIF[0, 1/2] and averaging the function g(x) = 1/[2π(1 + x²)], one can further reduce the estimator variance.
- Importance sampling:
  - Select h so that its support is {x > 2}.
  - For x > 2, the density f(x) = 1/[π(1 + x²)] is closely matched by h(x) = 2/x².
  - Note that the cdf associated with h is 1 − 2/x.
  - Sample X = 2/U with U ~ UNIF(0, 1), and let

      w(x) = 1(x > 2) f(x) / h(x) = x² / [2π(1 + x²)].

  - With θ̂_h = (1/n) Σ_i w(x_i), this is equivalent to the previous variant of Method 3.
  - Var(θ̂_h) ≈ 9.5 × 10⁻⁵ / n.
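Method 1 and the importance sampler can be compared directly. An illustrative Python sketch, not from the notes; the Cauchy draws use the inverse cdf tan(π(u − 1/2)).

```python
import math
import random

random.seed(4)
n = 10_000
theta = 0.5 - math.atan(2.0) / math.pi    # exact tail probability, ~0.1476

# Method 1: indicator of standard Cauchy draws.
naive = [1.0 if math.tan(math.pi * (random.random() - 0.5)) > 2.0 else 0.0
         for _ in range(n)]

# Importance sampling: X = 2/U has density h(x) = 2/x^2 on (2, inf);
# the weighted observations are w(X) = f(X)/h(X) = X^2 / (2 pi (1 + X^2)).
imp = []
for _ in range(n):
    x = 2.0 / random.random()
    imp.append(x * x / (2.0 * math.pi * (1.0 + x * x)))

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

theta_is = sum(imp) / n
```

The importance-sampling weights are confined to the narrow band from 2/(5π) at x = 2 up to 1/(2π) as x → ∞, which is why the variance falls from order 10⁻¹ to order 10⁻⁴ per observation.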
More informationChapter 4. Continuous Random Variables 4.1 PDF
Chapter 4 Continuous Random Variables In this chapter we study continuous random variables. The linkage between continuous and discrete random variables is the cumulative distribution (CDF) which we will
More informationExperimental mathematics and integration
Experimental mathematics and integration David H. Bailey http://www.davidhbailey.com Lawrence Berkeley National Laboratory (retired) Computer Science Department, University of California, Davis October
More informationAnalysis Qualifying Exam
Analysis Qualifying Exam Spring 2017 Problem 1: Let f be differentiable on R. Suppose that there exists M > 0 such that f(k) M for each integer k, and f (x) M for all x R. Show that f is bounded, i.e.,
More informationNon-parametric Inference and Resampling
Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing
More informationOn the convergence of interpolatory-type quadrature rules for evaluating Cauchy integrals
Journal of Computational and Applied Mathematics 49 () 38 395 www.elsevier.com/locate/cam On the convergence of interpolatory-type quadrature rules for evaluating Cauchy integrals Philsu Kim a;, Beong
More informationNumerical Integration
Numerical Integration Sanzheng Qiao Department of Computing and Software McMaster University February, 2014 Outline 1 Introduction 2 Rectangle Rule 3 Trapezoid Rule 4 Error Estimates 5 Simpson s Rule 6
More informationNumerical Methods. King Saud University
Numerical Methods King Saud University Aims In this lecture, we will... find the approximate solutions of derivative (first- and second-order) and antiderivative (definite integral only). Numerical Differentiation
More information41903: Introduction to Nonparametrics
41903: Notes 5 Introduction Nonparametrics fundamentally about fitting flexible models: want model that is flexible enough to accommodate important patterns but not so flexible it overspecializes to specific
More informationNumerical Integra/on
Numerical Integra/on Applica/ons The Trapezoidal Rule is a technique to approximate the definite integral where For 1 st order: f(a) f(b) a b Error Es/mate of Trapezoidal Rule Truncation error: From Newton-Gregory
More informationMaximum Likelihood Estimation
Connexions module: m11446 1 Maximum Likelihood Estimation Clayton Scott Robert Nowak This work is produced by The Connexions Project and licensed under the Creative Commons Attribution License Abstract
More informationChapter 4: Interpolation and Approximation. October 28, 2005
Chapter 4: Interpolation and Approximation October 28, 2005 Outline 1 2.4 Linear Interpolation 2 4.1 Lagrange Interpolation 3 4.2 Newton Interpolation and Divided Differences 4 4.3 Interpolation Error
More informationQMI Lesson 19: Integration by Substitution, Definite Integral, and Area Under Curve
QMI Lesson 19: Integration by Substitution, Definite Integral, and Area Under Curve C C Moxley Samford University Brock School of Business Substitution Rule The following rules arise from the chain rule
More informationLecture 2. Derivative. 1 / 26
Lecture 2. Derivative. 1 / 26 Basic Concepts Suppose we wish to nd the rate at which a given function f (x) is changing with respect to x when x = c. The simplest idea is to nd the average rate of change
More informationReview of Probability Theory
Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving
More informationMULTIVARIATE PROBABILITY DISTRIBUTIONS
MULTIVARIATE PROBABILITY DISTRIBUTIONS. PRELIMINARIES.. Example. Consider an experiment that consists of tossing a die and a coin at the same time. We can consider a number of random variables defined
More informationCOURSE Numerical integration of functions (continuation) 3.3. The Romberg s iterative generation method
COURSE 7 3. Numerical integration of functions (continuation) 3.3. The Romberg s iterative generation method The presence of derivatives in the remainder difficulties in applicability to practical problems
More informationA scale-free goodness-of-t statistic for the exponential distribution based on maximum correlations
Journal of Statistical Planning and Inference 18 (22) 85 97 www.elsevier.com/locate/jspi A scale-free goodness-of-t statistic for the exponential distribution based on maximum correlations J. Fortiana,
More information. (a) Express [ ] as a non-trivial linear combination of u = [ ], v = [ ] and w =[ ], if possible. Otherwise, give your comments. (b) Express +8x+9x a
TE Linear Algebra and Numerical Methods Tutorial Set : Two Hours. (a) Show that the product AA T is a symmetric matrix. (b) Show that any square matrix A can be written as the sum of a symmetric matrix
More informationDecidability of Existence and Construction of a Complement of a given function
Decidability of Existence and Construction of a Complement of a given function Ka.Shrinivaasan, Chennai Mathematical Institute (CMI) (shrinivas@cmi.ac.in) April 28, 2011 Abstract This article denes a complement
More informationx n cos 2x dx. dx = nx n 1 and v = 1 2 sin(2x). Andreas Fring (City University London) AS1051 Lecture Autumn / 36
We saw in Example 5.4. that we sometimes need to apply integration by parts several times in the course of a single calculation. Example 5.4.4: For n let S n = x n cos x dx. Find an expression for S n
More information1 Review of di erential calculus
Review of di erential calculus This chapter presents the main elements of di erential calculus needed in probability theory. Often, students taking a course on probability theory have problems with concepts
More informationMath123 Lecture 1. Dr. Robert C. Busby. Lecturer: Office: Korman 266 Phone :
Lecturer: Math1 Lecture 1 Dr. Robert C. Busby Office: Korman 66 Phone : 15-895-1957 Email: rbusby@mcs.drexel.edu Course Web Site: http://www.mcs.drexel.edu/classes/calculus/math1_spring0/ (Links are case
More informationBounds on expectation of order statistics from a nite population
Journal of Statistical Planning and Inference 113 (2003) 569 588 www.elsevier.com/locate/jspi Bounds on expectation of order statistics from a nite population N. Balakrishnan a;, C. Charalambides b, N.
More informationIntroduction to Probability. Ariel Yadin. Lecture 10. Proposition Let X be a discrete random variable, with range R and density f X.
Introduction to Probability Ariel Yadin Lecture 1 1. Expectation - Discrete Case Proposition 1.1. Let X be a discrete random variable, with range R and density f X. Then, *** Jan. 3 *** Expectation for
More informationNew concepts: Span of a vector set, matrix column space (range) Linearly dependent set of vectors Matrix null space
Lesson 6: Linear independence, matrix column space and null space New concepts: Span of a vector set, matrix column space (range) Linearly dependent set of vectors Matrix null space Two linear systems:
More informationThe number of message symbols encoded into a
L.R.Welch THE ORIGINAL VIEW OF REED-SOLOMON CODES THE ORIGINAL VIEW [Polynomial Codes over Certain Finite Fields, I.S.Reed and G. Solomon, Journal of SIAM, June 1960] Parameters: Let GF(2 n ) be the eld
More informationSTAT 430/510: Lecture 10
STAT 430/510: Lecture 10 James Piette June 9, 2010 Updates HW2 is due today! Pick up your HW1 s up in stat dept. There is a box located right when you enter that is labeled "Stat 430 HW1". It ll be out
More informationMath 122 Fall Unit Test 1 Review Problems Set A
Math Fall 8 Unit Test Review Problems Set A We have chosen these problems because we think that they are representative of many of the mathematical concepts that we have studied. There is no guarantee
More informationA Brief Analysis of Central Limit Theorem. SIAM Chapter Florida State University
1 / 36 A Brief Analysis of Central Limit Theorem Omid Khanmohamadi (okhanmoh@math.fsu.edu) Diego Hernán Díaz Martínez (ddiazmar@math.fsu.edu) Tony Wills (twills@math.fsu.edu) Kouadio David Yao (kyao@math.fsu.edu)
More informationNumerical integration and differentiation. Unit IV. Numerical Integration and Differentiation. Plan of attack. Numerical integration.
Unit IV Numerical Integration and Differentiation Numerical integration and differentiation quadrature classical formulas for equally spaced nodes improper integrals Gaussian quadrature and orthogonal
More informationStat 451 Lecture Notes Simulating Random Variables
Stat 451 Lecture Notes 05 12 Simulating Random Variables Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 6 in Givens & Hoeting, Chapter 22 in Lange, and Chapter 2 in Robert & Casella 2 Updated:
More informationEntrance Exam, Real Analysis September 1, 2017 Solve exactly 6 out of the 8 problems
September, 27 Solve exactly 6 out of the 8 problems. Prove by denition (in ɛ δ language) that f(x) = + x 2 is uniformly continuous in (, ). Is f(x) uniformly continuous in (, )? Prove your conclusion.
More informationIn this chapter we study elliptical PDEs. That is, PDEs of the form. 2 u = lots,
Chapter 8 Elliptic PDEs In this chapter we study elliptical PDEs. That is, PDEs of the form 2 u = lots, where lots means lower-order terms (u x, u y,..., u, f). Here are some ways to think about the physical
More informationMATH 202B - Problem Set 5
MATH 202B - Problem Set 5 Walid Krichene (23265217) March 6, 2013 (5.1) Show that there exists a continuous function F : [0, 1] R which is monotonic on no interval of positive length. proof We know there
More informationSTA205 Probability: Week 8 R. Wolpert
INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and
More informationAP Calculus Chapter 9: Infinite Series
AP Calculus Chapter 9: Infinite Series 9. Sequences a, a 2, a 3, a 4, a 5,... Sequence: A function whose domain is the set of positive integers n = 2 3 4 a n = a a 2 a 3 a 4 terms of the sequence Begin
More informationp. 6-1 Continuous Random Variables p. 6-2
Continuous Random Variables Recall: For discrete random variables, only a finite or countably infinite number of possible values with positive probability (>). Often, there is interest in random variables
More informationMath 141: Lecture 19
Math 141: Lecture 19 Convergence of infinite series Bob Hough November 16, 2016 Bob Hough Math 141: Lecture 19 November 16, 2016 1 / 44 Series of positive terms Recall that, given a sequence {a n } n=1,
More informationPower Series Solutions to the Legendre Equation
Department of Mathematics IIT Guwahati The Legendre equation The equation (1 x 2 )y 2xy + α(α + 1)y = 0, (1) where α is any real constant, is called Legendre s equation. When α Z +, the equation has polynomial
More information1. Introduction Over the last three decades a number of model selection criteria have been proposed, including AIC (Akaike, 1973), AICC (Hurvich & Tsa
On the Use of Marginal Likelihood in Model Selection Peide Shi Department of Probability and Statistics Peking University, Beijing 100871 P. R. China Chih-Ling Tsai Graduate School of Management University
More informationConstrained Leja points and the numerical solution of the constrained energy problem
Journal of Computational and Applied Mathematics 131 (2001) 427 444 www.elsevier.nl/locate/cam Constrained Leja points and the numerical solution of the constrained energy problem Dan I. Coroian, Peter
More informationOutline. 1 Numerical Integration. 2 Numerical Differentiation. 3 Richardson Extrapolation
Outline Numerical Integration Numerical Differentiation Numerical Integration Numerical Differentiation 3 Michael T. Heath Scientific Computing / 6 Main Ideas Quadrature based on polynomial interpolation:
More information