6. Advanced Numerical Methods. Monte Carlo Methods


6. Advanced Numerical Methods. Part 1: Monte Carlo Methods. Part 2: Fourier Methods

Part 1: Monte Carlo Methods

6.1. Uniform random numbers

Generating uniform random numbers, drawn from the pdf U[0,1], is fairly easy: any scientific calculator will have a RAN function. Better examples of U[0,1] random number generators can be found in Numerical Recipes.

In what sense are they better? Algorithms only generate pseudo-random numbers: very long (deterministic) sequences of numbers which are approximately random (i.e. have no discernible pattern). The better the RNG, the better it approximates U[0,1]. [Figure: the pdf p(x) of U[0,1], equal to 1 for 0 < x < 1.]

We can test pseudo-random numbers for randomness in several ways:

(a) Histogram of sampled values. We can use hypothesis tests to see if the sample is consistent with the pdf we are trying to model, e.g. a chi-squared test applied to the numbers in each histogram bin:

$$\chi^2 = \sum_{i=1}^{n_{\rm bin}} \frac{\left(n_i^{\rm obs} - n_i^{\rm pred}\right)^2}{\sigma_i^2}$$

Assume the bin number counts are subject to Poisson fluctuations, so that $\sigma_i^2 = n_i^{\rm pred}$. Note: the number of degrees of freedom is $n_{\rm bin} - 1$, since we know the total sample size.
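As an illustration (not from the original slides), here is a minimal Python sketch of test (a): histogram a large pseudo-random sample and compute the chi-squared statistic defined above, assuming Poisson errors $\sigma_i^2 = n_i^{\rm pred}$.

```python
# A minimal sketch of test (a): histogram a sample of pseudo-random
# numbers and apply the chi-squared test described above.
import numpy as np

rng = np.random.default_rng(42)
n_samp, n_bin = 100_000, 20

x = rng.random(n_samp)                      # pseudo-random draws from U[0,1]
n_obs, _ = np.histogram(x, bins=n_bin, range=(0.0, 1.0))
n_pred = n_samp / n_bin                     # U[0,1] predicts equal counts per bin

# Poisson fluctuations: sigma_i^2 = n_i^pred
chi2 = np.sum((n_obs - n_pred)**2 / n_pred)
dof = n_bin - 1                             # total sample size is fixed

print(f"chi^2 = {chi2:.1f} for {dof} degrees of freedom")
# For a good RNG, chi^2 should be ~ dof, with scatter ~ sqrt(2 * dof).
```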

(b) Correlations between neighbouring pseudo-random numbers. Sequential patterns in the sampled values would show up as structure in the phase portraits: scatterplots of the i-th value against the (i+1)-th value, etc. (see Gregory, Chapter 5).

We can also compute the auto-correlation function

$$\rho(j) = \frac{\left\langle (x_i - \bar{x})(x_{i+j} - \bar{x}) \right\rangle}{\sqrt{\left\langle (x_i - \bar{x})^2 \right\rangle \left\langle (x_{i+j} - \bar{x})^2 \right\rangle}}$$

where $j$ is known as the lag. If the sequence is uniformly random, we expect $\rho(j) = 1$ for $j = 0$ and $\rho(j) = 0$ otherwise.
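A minimal sketch of test (b), estimating the sample autocorrelation $\rho(j)$ at a few small lags (the estimator below normalises by the full-sample variance, which is adequate for $j \ll N$):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random(100_000)
xc = x - x.mean()
var = np.mean(xc**2)

def rho(j):
    """Sample autocorrelation at lag j."""
    if j == 0:
        return 1.0
    return np.mean(xc[:-j] * xc[j:]) / var

for j in range(4):
    print(j, round(rho(j), 4))
# Expect rho(0) = 1 and rho(j > 0) ~ 0, with scatter ~ 1/sqrt(N), for a good RNG.
```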

6.2. Variable transformations

Generating random numbers from other pdfs can be done by transforming random numbers drawn from simpler pdfs. The procedure is similar to changing variables in integration. Suppose, e.g., $x \sim p(x)$, and let $y = y(x)$ be monotonic. Then the probability of a number between $y$ and $y+dy$ must equal the probability of a number between $x$ and $x+dx$:

$$p(y)\,dy = p(x)\,dx \quad\Longrightarrow\quad p(y) = p(x(y)) \left| \frac{dx}{dy} \right|$$

where the modulus appears because probability must be positive.

We can extend the expression given previously to the case where $y(x)$ is not monotonic, by calculating

$$p(y)\,dy = \sum_i p(x_i)\,dx_i \quad\text{so that}\quad p(y) = \sum_i p(x_i(y)) \left| \frac{dx_i}{dy} \right|$$

Example 1. Suppose we have $x \sim U[0,1]$, i.e. $p(x) = 1$ for $0 < x < 1$ and 0 otherwise. Define $y = a + (b-a)x$, so $dy/dx = b - a$, i.e.

$$p(y) = \frac{1}{b-a} \ \text{ for } a < y < b, \qquad 0 \text{ otherwise}$$

or $y \sim U[a,b]$.
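A quick numerical check of Example 1 (a sketch; the values a = 2, b = 5 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 5.0

x = rng.random(100_000)      # x ~ U[0,1]
y = a + (b - a) * x          # transformed variable: y ~ U[a,b]

# Sample mean and variance should match (a+b)/2 and (b-a)^2/12
print(y.mean(), (a + b) / 2)
print(y.var(), (b - a)**2 / 12)
```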

Example 2. Numerical Recipes provides a program to turn $x \sim U[0,1]$ into $y \sim N[0,1]$, the normal pdf with mean zero and standard deviation unity. Suppose we want $z \sim N[\mu, \sigma]$. We define $z = \mu + \sigma y$, so that $dz/dy = \sigma$. Now

$$p(y) = \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y^2\right)$$

so

$$p(z) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\tfrac{1}{2}\left(\frac{z-\mu}{\sigma}\right)^2\right]$$

The variable transformation formula is also the basis for the error propagation formulae we use in data analysis (see also the SUPAIDA course).

Question 13: If $x \sim U[0,1]$ and $y = -\ln x$, the pdf of $y$ is

A: $p(y) = e^{y}$
B: $p(y) = e^{-y}$
C: $p(y) = -\ln y$
D: $p(y) = \ln y$

Answer: B. With $x = e^{-y}$ we have $p(y) = p(x)\,|dx/dy| = e^{-y}$ for $y > 0$.

6.3. Probability integral transform

One particular variable transformation merits special attention. Suppose we can compute the CDF, $P(x)$, of some desired random variable $x$:

1) Sample a random variable $y \sim U[0,1]$.
2) Compute $x$ such that $y = P(x)$, i.e. $x = P^{-1}(y)$.
3) Then $x \sim p(x)$, i.e. $x$ is drawn from the pdf corresponding to the CDF $P(x)$.
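A minimal sketch of the probability integral transform in Python, using the exponential pdf $p(x) = \lambda e^{-\lambda x}$ as an example (its CDF $P(x) = 1 - e^{-\lambda x}$ inverts analytically; this choice of pdf is illustrative, not from the slides):

```python
# Probability integral transform for the exponential distribution:
# P(x) = 1 - exp(-lam * x), so x = P^{-1}(y) = -ln(1 - y) / lam.
import numpy as np

rng = np.random.default_rng(3)
lam = 2.0

y = rng.random(100_000)          # step 1: y ~ U[0,1]
x = -np.log(1.0 - y) / lam       # step 2: x = P^{-1}(y)

# step 3: x should now be drawn from p(x); check the mean, which is 1/lam
print(x.mean(), 1.0 / lam)
```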

Example (from Gregory). [Figure.]

6.4. Rejection sampling (following Mackay)

Suppose we want to sample from some pdf $p_1(x)$, and we know a pdf $p_2(x)$ which we have an easy way to sample from, satisfying $p_1(x) < p_2(x)$ for all $x$:

1) Sample $x_1$ from $p_2(x)$.
2) Sample $y \sim U[0,\, p_2(x_1)]$.
3) If $y < p_1(x_1)$, ACCEPT; otherwise REJECT.

The set of accepted values $\{x_i\}$ are a sample from $p_1(x)$.

The method can be very slow if the shaded region between $p_2(x)$ and $p_1(x)$ is too large. Ideally we want to find a pdf $p_2(x)$ that is: (a) easy to sample from, and (b) close to $p_1(x)$.
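A minimal sketch of rejection sampling: here the target $p_1(x)$ is an (unnormalised) Gaussian restricted to $[-3, 3]$, and the easy-to-sample envelope $p_2(x)$ is a constant that bounds it from above; both choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def p1(x):
    return np.exp(-0.5 * x**2)          # unnormalised target; max value 1

p2_height = 1.1                         # envelope: p2(x) = 1.1 > p1(x) on [-3, 3]
samples = []
while len(samples) < 10_000:
    x1 = rng.uniform(-3.0, 3.0)         # 1) sample x1 from the envelope
    y = rng.uniform(0.0, p2_height)     # 2) sample y ~ U[0, p2(x1)]
    if y < p1(x1):                      # 3) accept or reject
        samples.append(x1)

samples = np.array(samples)
print(samples.mean(), samples.std())    # ~0 and ~1 for a (truncated) Gaussian
```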

6.5. Genetic Algorithms (Charbonneau 1995) [Figure-only slides.]

6.6. Markov Chain Monte Carlo

This is a very powerful, new (at least in astronomy!) method for sampling from pdfs, which can be complicated and/or of high dimension. MCMC is widely used, e.g. in cosmology, to determine the maximum likelihood model fit to CMBR data. [Figure: angular power spectrum of CMBR temperature fluctuations, with the ML cosmological model, depending on 7 different parameters (Hinshaw et al. 2006).]

Consider a 2-D example (e.g. a bivariate normal distribution), where the likelihood function $L(a,b)$ depends on two parameters $a$ and $b$. Suppose we are trying to find the maximum of $L(a,b)$:

1) Start off at some randomly chosen value $(a_1, b_1)$.
2) Compute $L(a_1, b_1)$ and the gradient $\left(\frac{\partial L}{\partial a}, \frac{\partial L}{\partial b}\right)$ at $(a_1, b_1)$.
3) Move in the direction of steepest positive gradient, i.e. the direction in which $L$ is increasing fastest.
4) Repeat from step 2 until $(a_n, b_n)$ converges on the maximum of the likelihood.

This is OK for finding the maximum, but not for generating a sample from $L(a,b)$, or for determining errors on the ML parameter estimates.

MCMC provides a simple Metropolis algorithm for generating random samples of points from $L(a,b)$:

1. Sample a random initial point $P_1 = (a_1, b_1)$.
2. Centre a new pdf, $Q$, called the proposal density, on $P_1$.
3. Sample a tentative new point $P' = (a', b')$ from $Q$.
4. Compute the ratio

$$R = \frac{L(a', b')}{L(a_1, b_1)}$$

5. If $R > 1$, this means $P'$ is uphill from $P_1$: we accept $P'$ as the next point in our chain, i.e. $P_2 = P'$.
6. If $R < 1$, this means $P'$ is downhill from $P_1$. In this case we may reject $P'$ as our next point; in fact, we accept $P'$ with probability $R$. How do we do this?
(a) Generate a random number $x \sim U[0,1]$.
(b) If $x < R$ then accept $P'$ and set $P_2 = P'$.
(c) If $x > R$ then reject $P'$ and set $P_2 = P_1$.

The acceptance probability depends only on the previous point: this is what makes the sequence a Markov Chain.

So the Metropolis algorithm generally (but not always) moves uphill, towards the peak of the likelihood function. Remarkable facts:

The sequence of points $\{P_1, P_2, P_3, P_4, P_5, \dots\}$ represents a sample from the likelihood function $L(a,b)$ (see notes on website).

The sequence for each coordinate, e.g. $\{a_1, a_2, a_3, a_4, a_5, \dots\}$, samples the marginalised likelihood of $a$. We can make a histogram of $\{a_1, a_2, \dots, a_n\}$ and use it to compute the mean and variance of $a$ (i.e. to attach an error bar to $a$).
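A minimal sketch of the Metropolis algorithm for a 2-D example, with an (unnormalised) bivariate normal likelihood and a symmetric Gaussian proposal density $Q$ centred on the current point; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)

def L(a, b):
    return np.exp(-0.5 * (a**2 + b**2 / 4.0))    # sigma_a = 1, sigma_b = 2

n_steps, width = 50_000, 1.0
chain = np.empty((n_steps, 2))
p = np.array([3.0, -3.0])                        # 1. initial point

for i in range(n_steps):
    p_new = p + width * rng.normal(size=2)       # 2-3. propose from Q
    R = L(*p_new) / L(*p)                        # 4. likelihood ratio
    if rng.random() < R:                         # 5-6. accept w/ prob min(R, 1)
        p = p_new
    chain[i] = p

# Histogram of each coordinate samples the marginalised likelihood:
print(chain[:, 0].std(), chain[:, 1].std())      # ~1 and ~2 after burn-in
```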

Why is this so useful? Suppose our likelihood function were a 1-D Gaussian: we could estimate its mean and variance quite well from a histogram of, e.g., 1000 samples. But what if our problem is, e.g., 7-dimensional? Exhaustive sampling could require $(1000)^7$ samples! MCMC provides a short-cut. To compute each new point in our Markov Chain we need to compute the likelihood function, but the computational cost does not grow so dramatically as we increase the number of dimensions of our problem. This lets us tackle problems that would be impossible by normal sampling.

Example: CMBR constraints from WMAP 3-year data (+ 1-year data). [Figure: angular power spectrum of CMBR temperature fluctuations, with the ML cosmological model, depending on 7 different parameters (Hinshaw et al. 2006).]

Question 14: When applying the Metropolis algorithm, if the width of the proposal density is very small:

A: the Markov Chain will move around the parameter space very slowly
B: the Markov Chain will converge very quickly to the true pdf
C: the acceptance rate of proposed steps in the Markov Chain will be very small
D: most steps in the Markov Chain will explore regions of very low probability

Answer: A. With a very narrow proposal density almost every step is accepted, but each step is tiny, so the chain explores the parameter space very slowly.

A number of factors can improve the performance of the Metropolis algorithm, including:

- using parameters in the likelihood function which are (close to) independent (i.e. their Fisher matrix is approximately diagonal);
- adopting a judicious choice of proposal density, well matched to the shape of the likelihood function;
- using a simulated annealing approach, i.e. sampling from a modified posterior of the form

$$p_T(\theta \mid D, I) \propto \exp\left[ \frac{\ln p(\theta \mid D, I)}{T} \right]$$

For large $T$ the modified likelihood is a flatter version of the true likelihood.

The temperature parameter $T$ starts out large, so that the acceptance rate for downhill steps is high and the search is essentially random (this helps to avoid getting stuck in local maxima). $T$ is then gradually reduced as the chain evolves, so that downhill steps become increasingly disfavoured. In some versions the evolution of $T$ is carried out automatically; this is known as adaptive simulated annealing. See, for example, Numerical Recipes Section 10.9, or Gregory Chapter 11, for more details.
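A minimal sketch of simulated annealing built on the Metropolis step: the acceptance test uses the tempered (log-)likelihood divided by $T$, and $T$ is lowered according to a simple geometric cooling schedule. The toy likelihood surface and the schedule here are illustrative, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(5)

def logL(a, b):
    # A bumpy toy surface with local maxima; global maximum at (0, 0)
    return -0.5 * (a**2 + b**2) + 2.0 * np.cos(3 * a) * np.cos(3 * b)

p, T = np.array([4.0, 4.0]), 10.0
for i in range(20_000):
    p_new = p + 0.5 * rng.normal(size=2)
    # Tempered acceptance ratio: R = [L(new)/L(old)]^(1/T)
    if np.log(rng.random()) < (logL(*p_new) - logL(*p)) / T:
        p = p_new
    T = max(1e-2, T * 0.9997)       # gradually reduce the temperature

print(p)   # should end near the global maximum at (0, 0)
```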

A related idea is parallel tempering (see e.g. Gregory, Chapter 12). A series of MCMC chains, with different values of $\beta = 1/T$, are set off in parallel, with a certain probability of swapping parameter states between chains. High temperature chains are effective at mapping out the global structure of the likelihood surface; low temperature chains are effective at mapping out the shape of local likelihood maxima.

Example: spectral line fitting, from Section 3. [Figures: conventional MCMC vs. MCMC with parallel tempering.]

Approximating the Evidence

$$\text{Evidence} = \int p(\text{data} \mid \theta, M)\, p(\theta \mid M)\, d\theta$$

i.e. the average likelihood, weighted by the prior. Calculating the evidence can be computationally very costly (e.g. fitting the CMBR $C_\ell$ spectrum in cosmology). How to proceed?

1. Information criteria (Liddle 2004, 2007).
2. Laplace and Savage-Dickey approximations (Trotta 2005).
3. Nested sampling (Skilling 2004, 2006; Mukherjee et al. 2005, 2007; Sivia 2006).

Nested Sampling (Skilling 2004, 2006; Mukherjee et al. 2005, 2007)

$$\text{Evidence} = \int p(\text{data} \mid \theta, M)\, p(\theta \mid M)\, d\theta$$

Key idea: we can rewrite the Evidence as an integral over a 1-D variable $X$, known as the prior mass, which is uniformly distributed on [0,1]:

$$Z = \int_0^1 L(X)\, dX$$

Skilling (2006): $Z = \int_0^1 L(X)\, dX$. Example: a 2-D likelihood function.

Example: a 2-D likelihood function (from Mackay 2005). Contours of constant likelihood $L$: each contour encloses a different fraction, $X$, of the area of the square, so each point in the plane has an associated value of $L$ and of $X$. However, mapping the relationship between $L$ and $X$ systematically everywhere may be computationally very costly.

Skilling (2006): the approximation procedure. [Figure-only slides.]

7. Advanced Numerical Methods. Part 1: Monte Carlo Methods. Part 2: Fourier Methods

Part 2: Fourier Methods

In many diverse fields physical data are collected or analysed as Fourier components. In this section we briefly discuss the mathematics of Fourier series and Fourier transforms.

1. Fourier Series

Any well-behaved function $f(x)$ can be expanded in terms of an infinite sum of sines and cosines. The expansion takes the form

$$f(x) = \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx$$

(Joseph Fourier, 1768-1830)

The Fourier coefficients are given by the formulae

$$a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\, dx, \qquad a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\, dx, \qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\, dx$$

These formulae follow from the orthogonality properties of sin and cos:

$$\int_{-\pi}^{\pi} \sin mx \sin nx\, dx = \pi \delta_{mn}, \qquad \int_{-\pi}^{\pi} \cos mx \cos nx\, dx = \pi \delta_{mn}, \qquad \int_{-\pi}^{\pi} \sin mx \cos nx\, dx = 0$$

60 Some examples from Mathworld, approximating functions with a finite number of Fourier series terms
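A minimal sketch of such an approximation: the partial sums of the Fourier series for a square wave of period $2\pi$, whose only non-zero coefficients are $b_n = 4/(n\pi)$ for odd $n$ (a standard result, not derived on the slides).

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 2001)
f = np.sign(np.sin(x))                    # square wave, period 2*pi

def partial_sum(x, n_max):
    """Sum the Fourier series of the square wave up to harmonic n_max."""
    s = np.zeros_like(x)
    for n in range(1, n_max + 1, 2):      # only odd-n sine terms survive
        s += (4.0 / (n * np.pi)) * np.sin(n * x)
    return s

for n_max in (1, 9, 99):
    rms = np.sqrt(np.mean((f - partial_sum(x, n_max))**2))
    print(n_max, round(rms, 3))
# The rms error shrinks as terms are added, though a ~9% overshoot
# persists near the jumps (the Gibbs phenomenon).
```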

The Fourier series can also be written in complex form:

$$f(x) = \sum_{n=-\infty}^{\infty} A_n e^{inx}, \qquad A_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-inx}\, dx$$

and recall that $e^{\pm inx} = \cos nx \pm i \sin nx$.

"Fourier's Theorem is not only one of the most beautiful results of modern analysis, but it is said to furnish an indispensable instrument in the treatment of nearly every recondite question in modern physics."

Fourier Transform: Basic Definition

The Fourier transform can be thought of simply as extending the idea of a Fourier series from an infinite sum over discrete, integer Fourier modes to an infinite integral over continuous Fourier modes. Consider, for example, a physical process that is varying in the time domain, i.e. it is described by some function of time $h(t)$. Alternatively, we can describe the physical process in the frequency domain by defining the Fourier transform function $H(f)$. It is useful to think of $h(t)$ and $H(f)$ as two different representations of the same function; the information they convey about the underlying physical process should be equivalent.

We define the Fourier transform as

$$H(f) = \int_{-\infty}^{\infty} h(t)\, e^{2\pi i f t}\, dt$$

and the corresponding inverse Fourier transform as

$$h(t) = \int_{-\infty}^{\infty} H(f)\, e^{-2\pi i f t}\, df$$

If time is measured in seconds then frequency is measured in cycles per second, or Hertz.

In many physical applications it is common to define the frequency domain behaviour of the function in terms of the angular frequency $\omega = 2\pi f$. This changes the previous relations accordingly:

$$H(\omega) = \int_{-\infty}^{\infty} h(t)\, e^{i \omega t}\, dt, \qquad h(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} H(\omega)\, e^{-i \omega t}\, d\omega$$

Thus the symmetry of the previous expressions is broken.

Fourier Transform: Further Properties

The FT is a linear operation: (1) the FT of the sum of two functions is equal to the sum of their FTs; (2) the FT of a constant times a function is equal to the constant times the FT of the function.

If the time domain function $h(t)$ is a real function, then its FT is complex. More generally, however, we can consider the case where $h(t)$ is itself a complex function, i.e. we can write

$$h(t) = h_R(t) + i\, h_I(t)$$

with real part $h_R(t)$ and imaginary part $h_I(t)$. $h(t)$ may also possess certain symmetries: even, $h(-t) = h(t)$; odd, $h(-t) = -h(t)$.

The following properties then hold (see Numerical Recipes, Section 12.0). Note that in the table there a star (*) denotes the complex conjugate, i.e. if $z = x + iy$ then $z^* = x - iy$.

For convenience we will denote the FT pair by $h(t) \Leftrightarrow H(f)$. It is then straightforward to show that

$$h(at) \Leftrightarrow \frac{1}{|a|}\, H(f/a) \qquad \text{time scaling}$$
$$\frac{1}{|b|}\, h(t/b) \Leftrightarrow H(bf) \qquad \text{frequency scaling}$$
$$h(t - t_0) \Leftrightarrow H(f)\, e^{2\pi i f t_0} \qquad \text{time shifting}$$
$$h(t)\, e^{-2\pi i f_0 t} \Leftrightarrow H(f - f_0) \qquad \text{frequency shifting}$$

Suppose we have two functions $g(t)$ and $h(t)$. Their convolution is defined as

$$(g * h)(t) = \int_{-\infty}^{\infty} g(s)\, h(t - s)\, ds$$

We can prove the Convolution Theorem

$$(g * h)(t) \Leftrightarrow G(f)\, H(f)$$

i.e. the FT of the convolution of the two functions is equal to the product of their individual FTs. Their correlation, which is also a function of the lag $t$, is defined as

$$\mathrm{Corr}(g, h) = \int_{-\infty}^{\infty} g(s + t)\, h(s)\, ds$$

We can prove the Correlation Theorem

$$\mathrm{Corr}(g, h) \Leftrightarrow G(f)\, H^*(f)$$

i.e. the FT of the first time domain function, multiplied by the complex conjugate of the FT of the second time domain function, is equal to the FT of their correlation. The correlation of a function with itself is called the auto-correlation; in this case

$$\mathrm{Corr}(g, g) \Leftrightarrow G(f)\, G^*(f) = |G(f)|^2$$

The function $|G(f)|^2$ is known as the power spectral density, or (more loosely) as the power spectrum. Hence, the power spectrum is equal to the Fourier transform of the auto-correlation function of the time domain function $g(t)$.
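A minimal numerical check of the convolution and correlation theorems, using numpy's FFT and circular (periodic) versions of the convolution and correlation; the discrete analogues behave the same way as the continuous theorems above.

```python
import numpy as np

rng = np.random.default_rng(13)
N = 256
g, h = rng.normal(size=N), rng.normal(size=N)

# Circular convolution computed directly in the time domain ...
conv = np.array([sum(g[s] * h[(t - s) % N] for s in range(N)) for t in range(N)])
# ... and via the frequency domain (convolution theorem)
conv_ft = np.fft.ifft(np.fft.fft(g) * np.fft.fft(h)).real
print(np.allclose(conv, conv_ft))   # True

# Correlation theorem: FT of Corr(g, h) equals G(f) * conj(H(f))
corr = np.array([sum(g[(s + t) % N] * h[s] for s in range(N)) for t in range(N)])
corr_ft = np.fft.ifft(np.fft.fft(g) * np.conj(np.fft.fft(h))).real
print(np.allclose(corr, corr_ft))   # True
```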

The power spectral density is analogous to the pdf we defined in previous sections: in order to know how much power is contained in a given interval of frequency, we need to integrate the power spectral density over that interval. The total power in a signal is the same, regardless of whether we measure it in the time domain or the frequency domain:

$$\text{Total Power} = \int_{-\infty}^{\infty} |h(t)|^2\, dt = \int_{-\infty}^{\infty} |H(f)|^2\, df \qquad \text{(Parseval's Theorem)}$$

Often we will want to know how much power is contained in a frequency interval without distinguishing between positive and negative frequencies. In this case we define the one-sided power spectral density

$$P_h(f) = |H(f)|^2 + |H(-f)|^2, \qquad 0 \le f < \infty$$

and

$$\text{Total Power} = \int_0^{\infty} P_h(f)\, df$$

When $h(t)$ is a real function, $P_h(f) = 2\,|H(f)|^2$. With the proper normalisation, the total power (i.e. the integrated area under the relevant curve) is the same regardless of whether we are working with the time domain signal, the power spectral density or the one-sided power spectral density.

[Figure, from Numerical Recipes, Chapter 12.0: a time domain signal, its one-sided PSD, and its two-sided PSD.]

Examples.

(1) $h(t) = \text{const.} \Leftrightarrow H(f) = \delta_D(f)$, the Dirac delta function centred at $f = 0$.

(2) $h(t) = e^{2\pi i f_0 t} \Leftrightarrow H(f) = \delta_D(f - f_0)$.

(3) The top-hat function

$$h(t) = \begin{cases} 1 & \text{for } -\tfrac{1}{2} \le t \le \tfrac{1}{2} \\ 0 & \text{otherwise} \end{cases} \quad\Leftrightarrow\quad H(f) = \mathrm{sinc}(\pi f)$$

The sinc function, $\mathrm{sinc}\, x = \sin x / x$, occurs frequently in many areas of physics. It has a maximum at $x = 0$, and its zeros occur at $x = \pm m\pi$ for positive integer $m$.

(4) The triangle function $h(t) = \Lambda(t) \Leftrightarrow H(f) = \mathrm{sinc}^2(\pi f)$.

(5) The one-sided exponential $h(t) = e^{-at}$ (for $t \ge 0$):

$$\mathrm{Re}[H(f)] = \frac{a}{a^2 + 4\pi^2 f^2}, \qquad \mathrm{Im}[H(f)] = \frac{2\pi f}{a^2 + 4\pi^2 f^2}$$

(6) The Gaussian $h(t) = \exp(-a t^2)$:

$$H(f) = \sqrt{\pi/a}\; \exp(-\pi^2 f^2 / a)$$

i.e. the FT of a Gaussian function in the time domain is also a Gaussian in the frequency domain.

Question 15: If the variance of a Gaussian is doubled in the time domain:

A: the variance of its Fourier transform will be doubled in the frequency domain
B: the variance of its Fourier transform will be halved in the frequency domain
C: the standard deviation of its Fourier transform will be doubled in the frequency domain
D: the standard deviation of its Fourier transform will be halved in the frequency domain

Answer: B. From example (6), the frequency domain variance is inversely proportional to the time domain variance, so doubling one halves the other.

Returning to example (6): the FT of a Gaussian in the time domain is also a Gaussian in the frequency domain. The broader the Gaussian is in the time domain, the narrower its Gaussian FT is in the frequency domain, and vice versa.

Discrete Fourier Transforms

Although we have discussed FTs so far in the context of a continuous, analytic function $h(t)$, in many practical situations we must work instead with observational data which are sampled at a discrete set of times. Suppose that we sample $h(t)$ a total of $N+1$ times at evenly spaced time intervals $\Delta$, i.e. (for $N$ even)

$$h_k = h(t_k), \qquad t_k = k\Delta, \qquad k = -N/2, \dots, 0, \dots, N/2$$

[If $h(t)$ is non-zero over only a finite interval of time, then we suppose that the $N+1$ sampled points contain this interval. Or, if $h(t)$ has an infinite range, then we at least suppose that the sampled points cover a sufficient range to be representative of the behaviour of $h(t)$.]

We therefore approximate the FT as

$$H(f) = \int_{-\infty}^{\infty} h(t)\, e^{2\pi i f t}\, dt \;\approx\; \Delta \sum_{k=-N/2}^{N/2} h_k\, e^{2\pi i f t_k}$$

Since we are sampling $h(t)$ at $N+1$ discrete timesteps, in view of the symmetry of the FT and inverse FT it makes sense also to compute $H(f)$ only at a set of $N+1$ discrete frequencies:

$$f_n = \frac{n}{N\Delta}, \qquad n = -N/2, \dots, 0, \dots, N/2$$

(The frequency $f_c = \frac{1}{2\Delta}$ is known as the Nyquist (critical) frequency and it is a very important value. We discuss its significance later.)

Then

$$H(f_n) \approx \Delta \sum_{k=-N/2}^{N/2} h_k\, e^{2\pi i f_n t_k} = \Delta \sum_{k=-N/2}^{N/2} h_k\, e^{2\pi i k n / N}$$

Note that $e^{2\pi i k (n+N)/N} = e^{2\pi i k n / N}\, e^{2\pi i k} = e^{2\pi i k n / N}$, so $H(f_{n+N}) = H(f_n)$: there are only $N$ independent values. We can therefore re-define the Discrete Fourier Transform of the $h_k$ as

$$H_n = \sum_{k=0}^{N-1} h_k\, e^{2\pi i k n / N}$$

The discrete inverse FT, which recovers the set of $h_k$'s from the set of $H_n$'s, is

$$h_k = \frac{1}{N} \sum_{n=0}^{N-1} H_n\, e^{-2\pi i k n / N}$$

Parseval's theorem for discrete FTs takes the form

$$\sum_{k=0}^{N-1} |h_k|^2 = \frac{1}{N} \sum_{n=0}^{N-1} |H_n|^2$$

There are also discrete analogues to the convolution and correlation theorems.
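A minimal sketch of the discrete FT, inverse FT, and the discrete Parseval theorem, checked against numpy's FFT. Note that the sign convention above ($e^{+2\pi i k n/N}$) is the complex conjugate of numpy's, which the check accounts for.

```python
import numpy as np

rng = np.random.default_rng(21)
N = 64
h = rng.normal(size=N)

n = np.arange(N)
W = np.exp(2j * np.pi * np.outer(n, n) / N)    # W[n, k] = e^{2 pi i k n / N}
H = W @ h                                      # H_n = sum_k h_k e^{2 pi i k n / N}
h_back = (np.conj(W) @ H) / N                  # inverse DFT

print(np.allclose(h, h_back.real))             # True
print(np.allclose(H, np.conj(np.fft.fft(h)))) # matches numpy up to conjugation
print(np.allclose(np.sum(np.abs(h)**2),
                  np.sum(np.abs(H)**2) / N))   # discrete Parseval theorem
```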

Fast Fourier Transforms

Consider again the formula for the discrete FT. We can write it as

$$H_n = \sum_{k=0}^{N-1} e^{2\pi i k n / N}\, h_k = \sum_{k=0}^{N-1} W^{nk}\, h_k, \qquad W \equiv e^{2\pi i / N}$$

This is a matrix equation: we compute the $(N \times 1)$ vector of $H_n$'s by multiplying the $(N \times N)$ matrix $[W^{nk}]$ by the $(N \times 1)$ vector of $h_k$'s. In general, this requires of order $N^2$ multiplications (and the $h_k$'s may be complex numbers). Suppose, e.g., $N = 10^8$: even if a computer can perform (say) 1 billion multiplications per second, it would still require more than 115 days to calculate the FT.

Fortunately, there is a way around this problem. Suppose (as we assumed before) that $N$ is an even number. Then, splitting the sum into even values of $k$ ($k = 2j$) and odd values of $k$ ($k = 2j+1$), we can write

$$H_n = \sum_{k=0}^{N-1} e^{2\pi i k n / N}\, h_k = \sum_{j=0}^{M-1} e^{2\pi i j n / M}\, h_{2j} \;+\; W^n \sum_{j=0}^{M-1} e^{2\pi i j n / M}\, h_{2j+1}$$

where $M = N/2$ and $W = e^{2\pi i / N}$. So we have turned an FT with $N$ points into the weighted sum of two FTs with $N/2$ points. This would reduce our computing time by a factor of two.

Why stop there, however? If $M$ is also even, we can repeat the process and re-write each FT of length $M$ as the weighted sum of two FTs of length $M/2$. If $N$ is a power of two (e.g. 1024, 2048, etc.) then we can repeat iteratively the process of splitting each longer FT into two FTs half as long. The final step in this iteration consists of computing FTs of length unity:

$$H_0 = \sum_{k=0}^{0} e^{2\pi i k \cdot 0}\, h_k = h_0$$

i.e. the FT of each discretely sampled data value is just the data value itself.

This iterative process converts $O(N^2)$ multiplications into $O(N \log_2 N)$ operations (this notation means "of the order of"). For $N = 10^8$, our $10^{16}$ operations are reduced to about $3 \times 10^9$ operations: instead of more than 100 days of CPU time, we can perform the FT in less than 3 seconds. The Fast Fourier Transform (FFT) has revolutionised our ability to tackle problems in Fourier analysis on a desktop PC which would otherwise be impractical, even on the largest supercomputers.
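A minimal recursive radix-2 FFT implementing the even/odd splitting described above, using the slides' $e^{+2\pi i k n/N}$ sign convention (numpy's `fft` uses the opposite sign, hence the conjugate in the check); this sketch works for $N$ a power of two only.

```python
import numpy as np

def fft_recursive(h):
    N = len(h)
    if N == 1:
        return h.astype(complex)      # FT of a single value is itself
    even = fft_recursive(h[0::2])     # length-N/2 FT of the even-k values
    odd = fft_recursive(h[1::2])      # length-N/2 FT of the odd-k values
    n = np.arange(N // 2)
    W = np.exp(2j * np.pi * n / N)    # the 'twiddle' weights W^n
    # H_n = E_n + W^n O_n, and H_{n+N/2} = E_n - W^n O_n since W^{N/2} = -1
    return np.concatenate([even + W * odd, even - W * odd])

h = np.random.default_rng(2).normal(size=1024)
print(np.allclose(fft_recursive(h), np.conj(np.fft.fft(h))))   # True
```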

Data Acquisition

Earlier we approximated the continuous function $h(t)$ and its FT $H(f)$ by a finite set of $N+1$ discretely sampled values. How good is this approximation? The answer depends on the form of $h(t)$ and $H(f)$. In this short section we will consider:

1. under what conditions we can reconstruct $h(t)$ and $H(f)$ exactly from a set of discretely sampled points;
2. what minimum sampling rate (or density, if $h$ is a spatially varying function) is required to achieve this exact reconstruction;
3. what the effect is on our reconstructed $h(t)$ and $H(f)$ if our data acquisition does not achieve this minimum sampling rate.

The Nyquist-Shannon Sampling Theorem

Suppose the function $h(t)$ is bandwidth limited. This means that the FT of $h(t)$ is non-zero over only a finite range of frequencies, i.e. there exists a critical frequency $f_c$ such that $H(f) = 0$ for all $|f| \ge f_c$. The Nyquist-Shannon Sampling Theorem (NSST) is a very important result from information theory. It concerns the representation of $h(t)$ by a set of discretely sampled values

$$h_k = h(t_k), \qquad t_k = k\Delta, \qquad k = \dots, -2, -1, 0, 1, 2, \dots$$

The NSST states that, provided the sampling interval $\Delta$ satisfies $\Delta \le \frac{1}{2 f_c}$, we can exactly reconstruct the function $h(t)$ from the discrete samples $\{h_k\}$. It can be shown that

$$h(t) = \Delta \sum_{k=-\infty}^{+\infty} h_k\, \frac{\sin[2\pi f_c (t - k\Delta)]}{\pi (t - k\Delta)}$$

$f_c$ is also known as the Nyquist frequency, and $2 f_c$ is known as the Nyquist rate.

Taking $\Delta = 1/(2 f_c)$, we can re-write this as

$$h(t) = \sum_{k=-\infty}^{+\infty} h_k\, \frac{\sin[\pi (t - k\Delta)/\Delta]}{\pi (t - k\Delta)/\Delta}$$

So $h(t)$ is the sum of the sampled values $\{h_k\}$, each weighted by a normalised sinc function scaled so that its zeroes lie at the other sampled points (compare with the previous section). Note that when $t = t_k$ then $h(t) = h_k$, since $\mathrm{sinc}(0) = 1$.

The NSST is a very powerful result. We can think of the interpolating sinc functions, centred on each sampled point, as filling in the gaps in our data. The remarkable fact is that they do this job perfectly, provided $h(t)$ is bandwidth limited: the discrete sampling incurs no loss of information about $h(t)$ and $H(f)$. Suppose, for example, that $h(t) = \sin(2\pi f_c t)$: then we need only sample twice every period in order to be able to reconstruct the entire function exactly. (Note that formally we do need to sample an infinite number of discretely spaced values $\{h_k\}$. If we only sample $h(t)$ over a finite time interval, then our interpolated $h(t)$ will be approximate.)

[Figure: $y = \sin(2\pi f_c t)$ plotted against $f_c t$, with sample points at spacing $\Delta = 1/(2 f_c)$. Sampling at (infinitely many of) the red points is sufficient to reconstruct the function for all values of $t$, with no loss of information.]
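A minimal sketch of the sinc reconstruction formula: a pure tone sampled a little faster than the Nyquist rate is interpolated back onto a fine grid. In practice the sum over $k$ must be truncated, so the reconstruction is approximate, as noted above.

```python
import numpy as np

f = 1.0                      # tone frequency; implies a band limit f_c >= 1 Hz
delta = 0.4                  # sampling interval, < 1/(2*f) = 0.5 s

k = np.arange(-200, 201)     # (truncated) set of sample indices
h_k = np.sin(2 * np.pi * f * k * delta)

t = np.linspace(-2.0, 2.0, 501)
# h(t) = sum_k h_k sinc[pi (t - k*delta)/delta]; np.sinc(x) = sin(pi x)/(pi x)
arg = (t[:, None] - k[None, :] * delta) / delta
h_rec = np.sum(h_k * np.sinc(arg), axis=1)

# Residual is small, limited only by truncating the sum over k
print(np.max(np.abs(h_rec - np.sin(2 * np.pi * f * t))))
```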

Aliasing

There is a downside, however. If $h(t)$ is not bandwidth limited (or, equivalently, if we don't sample frequently enough, i.e. if $\Delta > \frac{1}{2 f_c}$), then our reconstruction of $h(t)$ and $H(f)$ is badly affected by aliasing: all of the power spectral density which lies outside the range $-f_c < f < f_c$ is spuriously moved inside that range, so that the FT of $h(t)$ will be computed incorrectly from the discretely sampled data. Any frequency component outside the range $(-f_c, f_c)$ is falsely translated ("aliased") into that range.

Consider $h(t)$ as shown, sampled at regular intervals $\Delta$, and suppose $h(t)$ is zero outside the range $0 < t < T$. This means that its FT $H(f)$ extends to $\pm\infty$ in frequency. The contribution to the true FT from outside the range $\left(-\frac{1}{2\Delta}, \frac{1}{2\Delta}\right)$ gets aliased into this range, appearing as a mirror image. Thus, at $f = \pm\frac{1}{2\Delta}$ our computed value of $H(f)$ is equal to twice the true value. [From Numerical Recipes, Chapter 12.1.]

How do we combat aliasing?

o Enforce some chosen $f_c$, e.g. by filtering $h(t)$ to remove the high frequency components (also known as anti-aliasing).
o Sample $h(t)$ at a high enough rate so that $\Delta \le \frac{1}{2 f_c}$, i.e. take at least two samples per cycle of the highest frequency present.

To check for / eliminate aliasing without pre-filtering (see the sketch below):

o Given a sampling interval $\Delta$, compute $f_{\rm lim} = \frac{1}{2\Delta}$.
o Check whether the discrete FT of $h(t)$ is approaching zero as $f \to f_{\rm lim}$.
o If not, then frequencies outside the range $(-f_{\rm lim}, f_{\rm lim})$ are probably being folded back into this range. Try increasing the sampling rate, and repeat.
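A minimal demonstration of aliasing (an illustrative example, not from the slides): a 7 Hz tone sampled at 10 Hz, so the Nyquist frequency is 5 Hz, shows a spurious spectral peak at 3 Hz.

```python
import numpy as np

delta = 0.1                       # sampling interval -> f_lim = 1/(2*delta) = 5 Hz
t = np.arange(0, 100, delta)
f_true = 7.0                      # above the Nyquist frequency
h = np.sin(2 * np.pi * f_true * t)

H = np.fft.rfft(h)                          # FT of the sampled data
freqs = np.fft.rfftfreq(len(h), d=delta)    # frequencies from 0 to 5 Hz
print(freqs[np.argmax(np.abs(H))])          # ~3 Hz: the 7 Hz tone is aliased
```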

And finally... Mock Data Challenge

1000 (x,y) data pairs, generated from an unknown model plus Gaussian noise. Data posted on my.supa. Four-stage challenge:

1. Fit a linear model to these data, using ordinary least squares.
2. Compute Bayesian credible regions for the model parameters, using a bivariate normal model for the likelihood function.
3. Write an MCMC code to sample from the posterior pdf of the model parameters, and compare their sample estimates with the LS fits.
4. Carry out a Bayesian model comparison, calculating the posterior odds ratio of a constant, linear and quadratic model.


More information

BAYESIAN DECISION THEORY

BAYESIAN DECISION THEORY Last updated: September 17, 2012 BAYESIAN DECISION THEORY Problems 2 The following problems from the textbook are relevant: 2.1 2.9, 2.11, 2.17 For this week, please at least solve Problem 2.3. We will

More information

26. The Fourier Transform in optics

26. The Fourier Transform in optics 26. The Fourier Transform in optics What is the Fourier Transform? Anharmonic waves The spectrum of a light wave Fourier transform of an exponential The Dirac delta function The Fourier transform of e

More information

Other Optimization Techniques. Similar to steepest descent, but slightly different way of choosing direction of next step: x r +1 = x.

Other Optimization Techniques. Similar to steepest descent, but slightly different way of choosing direction of next step: x r +1 = x. Conjugate Gradient Other Optimization Techniques Similar to steepest descent, but slightly different way of choosing direction of next step: x r +1 = x r + r s r s 0 = g 0 s r +1 = g r +1 + r +1 s r {

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 254 Part V

More information

Monte Carlo Methods. Leon Gu CSD, CMU

Monte Carlo Methods. Leon Gu CSD, CMU Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte

More information

Cosmology & CMB. Set5: Data Analysis. Davide Maino

Cosmology & CMB. Set5: Data Analysis. Davide Maino Cosmology & CMB Set5: Data Analysis Davide Maino Gaussian Statistics Statistical isotropy states only two-point correlation function is needed and it is related to power spectrum Θ(ˆn) = lm Θ lm Y lm (ˆn)

More information

SAMSI Astrostatistics Tutorial. More Markov chain Monte Carlo & Demo of Mathematica software

SAMSI Astrostatistics Tutorial. More Markov chain Monte Carlo & Demo of Mathematica software SAMSI Astrostatistics Tutorial More Markov chain Monte Carlo & Demo of Mathematica software Phil Gregory University of British Columbia 26 Bayesian Logical Data Analysis for the Physical Sciences Contents:

More information

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

More information

Bayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems

Bayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems Bayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems John Bardsley, University of Montana Collaborators: H. Haario, J. Kaipio, M. Laine, Y. Marzouk, A. Seppänen, A. Solonen, Z.

More information

FOURIER TRANSFORMS. At, is sometimes taken as 0.5 or it may not have any specific value. Shifting at

FOURIER TRANSFORMS. At, is sometimes taken as 0.5 or it may not have any specific value. Shifting at Chapter 2 FOURIER TRANSFORMS 2.1 Introduction The Fourier series expresses any periodic function into a sum of sinusoids. The Fourier transform is the extension of this idea to non-periodic functions by

More information

Physics 6720 Introduction to Statistics April 4, 2017

Physics 6720 Introduction to Statistics April 4, 2017 Physics 6720 Introduction to Statistics April 4, 2017 1 Statistics of Counting Often an experiment yields a result that can be classified according to a set of discrete events, giving rise to an integer

More information