The relation between memory and power-law exponent


Fangjian Guo and Tao Zhou

December 2, 2012

Abstract

The inter-event times of many real systems are empirically found to be power-law distributed, and memory (the first-order autocorrelation), a quantity ranging from -1 to +1, has been proposed to characterize the short-range correlation of time series. While series from real systems are usually found to be either positively or negatively correlated, the memories of empirical power-law series are predominantly positive. In this paper, we study the mechanism behind such memory bias. We analyze the bounds of memory for a family of random series with a specified marginal, generated by rearranging the order of i.i.d. samples drawn from the marginal while preserving their values. While such rearrangement can produce uniform or Gaussian distributed series with arbitrary memory from -1 to +1, we find that the power-law distribution imposes a special constraint on the possible memory of such series, which depends on the power-law exponent. Probabilistic and approximation methods are developed to analyze the bounds when the corresponding population moments converge and diverge, respectively, accompanied by validation from experimental results.

1 Introduction

Power-law distributions are used to model the inter-event time series of many real systems [1], including communication, web browsing and heart beats. Whereas the power law captures the marginal distribution of empirical data, memory has been introduced by Goh and Barabási as a quantity to characterize their interdependence, and has even been claimed to be a measure orthogonal to the marginal distribution [2]. Given a positive-valued series {t_i} of length n, memory is defined as its first-order autocorrelation, i.e.

    M = \frac{1}{n-1} \sum_{i=1}^{n-1} \frac{(t_i - m_1)(t_{i+1} - m_2)}{\sigma_1 \sigma_2},    (1)

where m_1, σ_1 and m_2, σ_2 refer to the mean and standard deviation of the series {t_1, t_2, ..., t_{n-1}} and {t_2, t_3, ..., t_n}, respectively.
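As a concrete illustration of definition (1), here is a minimal NumPy sketch (the function name memory and the use of population standard deviations are our choices, not part of the paper):

    import numpy as np

    def memory(t):
        # Memory (first-order autocorrelation) of a positive series, eq. (1):
        # correlate (t_1, ..., t_{n-1}) with (t_2, ..., t_n).
        t = np.asarray(t, dtype=float)
        a, b = t[:-1], t[1:]
        return np.mean((a - a.mean()) * (b - b.mean())) / (a.std() * b.std())

For example, memory(np.array([1.0, 2.0, 1.0, 2.0])) is strongly negative, while a monotonically increasing series gives a value close to +1.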

M lies in the range [-1, +1] due to the Cauchy-Schwarz inequality, from which we have

    |M| \le \frac{1}{\sigma_1 \sigma_2} \left[ \frac{1}{n-1} \sum_{i=1}^{n-1} (t_i - m_1)^2 \right]^{1/2} \left[ \frac{1}{n-1} \sum_{i=1}^{n-1} (t_{i+1} - m_2)^2 \right]^{1/2} = 1.    (2)

It should be noted that, when n → ∞, the two series above differ in only one element. Therefore, we have

    m_1 = m_2 = m, \quad \sigma_1 = \sigma_2 = \sigma \quad (n \to \infty),    (3)

where m and σ are simply the mean and standard deviation of the whole series. The memory can thus be rewritten as

    M = \frac{1}{\sigma^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} t_i t_{i+1} - m^2 \right) \quad (n \to \infty),    (4)

where a rearrangement of the order of the series only affects the sum of products \sum_i t_i t_{i+1}, leaving m and σ unchanged. When the series {t_i} is independently sampled from a distribution, its memory approaches zero as n → ∞, since

    E[M] = \frac{1}{\sigma^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} E[t_i t_{i+1}] - m^2 \right) = 0,    (5)

with E[t_i t_{i+1}] = m^2. However, by rearranging the order of the elements in the series while preserving their values, the memory can be adjusted without changing the marginal distribution. In fact, in the limit of large n, such permutations affect only the term \sum_i t_i t_{i+1}.
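A quick numerical check of (5) under the simplified form (4), using i.i.d. power-law samples (a sketch; the seed, sample size and exponent are illustrative choices of ours):

    import numpy as np

    def memory_large_n(t):
        # Simplified memory of eq. (4): (mean of t_i * t_{i+1} - m^2) / sigma^2.
        return (np.mean(t[:-1] * t[1:]) - t.mean() ** 2) / t.var()

    rng = np.random.default_rng(0)
    t = rng.pareto(2.5, 100_000) + 1.0    # power-law samples with alpha = 3.5 and t_min = 1
    print(memory_large_n(t))              # fluctuates around 0 for an i.i.d. series
    print(memory_large_n(np.sort(t)))     # same values, rearranged: the memory becomes strongly positive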

2 Rearrangement with maximum or minimum memory

In the following sections, we assume n → ∞ and therefore adopt the simpler form of memory in equation (4). A rearrangement θ is a one-to-one mapping from the set {1, 2, ..., n} to itself, and the i-th element of the rearranged sequence is denoted t_{θ_i}. Denoting the rearrangement with maximum memory and the one with minimum memory as θ_max and θ_min respectively, and denoting the sum of the products of adjacent elements as

    S_\theta = \sum_{i=1}^{n-1} t_{\theta_i} t_{\theta_{i+1}},    (6)

we have

    \theta_{max} = \arg\max_{\theta} S_\theta, \qquad \theta_{min} = \arg\min_{\theta} S_\theta.    (7)

We also define the shorthand s_\theta = S_\theta / (n-1). It has been shown that there exist fixed θ_max and θ_min that hold for all possible positive series {t_1, t_2, ..., t_n} [3]. (Whereas [3] uses an objective function that sums the products around a circle, i.e. S_\theta = \sum_{i=1}^{n-1} t_{\theta_i} t_{\theta_{i+1}} + t_{\theta_1} t_{\theta_n}, the results can be reduced to our case by appending an additional element equal to 0 to the series, which makes zero contribution to the sum.) To illustrate how they are arranged, we first introduce the order statistics. The order statistics {t_(i)} are formed by simply arranging the series in increasing order, i.e.

    t_{(1)} \le t_{(2)} \le \cdots \le t_{(n-1)} \le t_{(n)},    (8)

with t_(i) being the i-th smallest element of the series. θ_max attains the maximum memory by first arranging the odd-indexed order statistics in increasing order, followed by the even-indexed ones in decreasing order, which is

    t_{(1)}, t_{(3)}, \ldots, t_{(2l-1)}, t_{(2l)}, t_{(2l-2)}, \ldots, t_{(4)}, t_{(2)} \quad (n = 2l),
    t_{(1)}, t_{(3)}, \ldots, t_{(2l-1)}, t_{(2l+1)}, t_{(2l)}, \ldots, t_{(4)}, t_{(2)} \quad (n = 2l+1).    (9)

For simplicity, we only address the case n = 2l, for which the sum over order statistics is expressed as

    S_{\theta_{max}} = \sum_{i=1}^{2l-2} t_{(i)} t_{(i+2)} + t_{(2l)} t_{(2l-1)} \quad (n = 2l),    (10)

while the case of odd n can be handled similarly. On the contrary, quite interestingly, θ_min arranges the order statistics by interlacing the even- and odd-indexed terms when n = 2l, or the small and big terms when n = 2l + 1, i.e.

    t_{(2l)}, t_{(1)}, t_{(2l-2)}, t_{(3)}, \ldots, t_{(4)}, t_{(2l-3)}, t_{(2)}, t_{(2l-1)} \quad (n = 2l),
    t_{(2l)}, t_{(2)}, \ldots, t_{(l)}, \ldots, t_{(1)}, t_{(2l+1)} \quad (n = 2l+1).    (11)

Again, for even n, we have

    S_{\theta_{min}} = \sum_{i=1}^{l-1} \left( t_{(i)} t_{(2l+1-i)} + t_{(i)} t_{(2l-1-i)} \right) + t_{(l)} t_{(l+1)} \quad (n = 2l).    (12)
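The two extremal rearrangements can be constructed directly from the order statistics. Below is a sketch of our reading of (9) and (11) for even n, in 0-based NumPy indexing (function names are ours):

    import numpy as np

    def rearrange_max(t):
        # theta_max, eq. (9): odd-indexed order statistics ascending,
        # followed by even-indexed order statistics descending.
        s = np.sort(t)
        return np.concatenate([s[0::2], s[1::2][::-1]])

    def rearrange_min(t):
        # theta_min, eq. (11), for even n: even-indexed order statistics in decreasing
        # order interlaced with odd-indexed order statistics in increasing order,
        # i.e. t_(2l), t_(1), t_(2l-2), t_(3), ..., t_(2), t_(2l-1).
        s = np.sort(t)
        assert len(s) % 2 == 0, "this sketch only handles even n"
        out = np.empty_like(s)
        out[0::2] = s[1::2][::-1]   # t_(2l), t_(2l-2), ..., t_(2)
        out[1::2] = s[0::2]         # t_(1), t_(3), ..., t_(2l-1)
        return out

Applying rearrange_max and rearrange_min to sampled series and measuring the resulting memory reproduces the bounds studied below.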

3 Adjusting memory by iterative rearrangement

We first explore the bounds for uniform, Gaussian and power-law samples by iteratively rearranging them towards θ_max or θ_min. To do this, we construct an approximation of the order statistics iteratively. The process contains n steps, the same as the length of the series. In each step, the series is rearranged with one pass of bubble sort over the series produced in the previous step: stepping through the series from the first element to the last, comparing each pair of adjacent elements and swapping them if they are not in increasing order. The series obtained after i such steps is denoted {t̃^(i)}, an approximation of the order statistics, with {t̃^(n)} guaranteed to equal the order statistics, because a bubble sort always finishes within n passes. In each step, treating {t̃^(i)} as approximate order statistics, series with positive and negative memory are obtained by rearranging {t̃^(i)} according to (9) and (11), respectively.

In our experiment, series of length n are independently drawn from the uniform distribution on [0, 1], the standard Gaussian distribution (the same positive constant is added to all samples afterwards to ensure positivity), and the power-law distribution with probability density function

    f(t) = \frac{\alpha - 1}{t_{min}} \left( \frac{t}{t_{min}} \right)^{-\alpha},    (13)

with t_min = 1 and α = 3.5. Results are obtained by averaging 5 independent runs and are reported in Figure 1 for positive memory and Figure 2 for negative memory. As can be seen, by rearranging the series towards θ_max, the memory goes to +1 for all three distributions. As noted, the maximum memory of the power-law series after the last iteration is slightly below +1, due to the effect of finite n, which we will return to later. However, while the memory can be tuned down to -1 by rearranging towards θ_min for the uniform and Gaussian series, it stays above about -0.2 for the power-law series. In the following sections, we further analyze how this non-trivial lower bound depends on the exponent α.

Figure 1: Positive memory by iteratively rearranging independently sampled series.

Figure 2: Negative memory by iteratively rearranging independently sampled series.
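A compact sketch of the iterative procedure just described (series length, seed and α are illustrative choices of ours; only the θ_max branch is shown):

    import numpy as np

    def bubble_pass(x):
        # One pass of bubble sort: swap each adjacent out-of-order pair, left to right.
        x = x.copy()
        for i in range(len(x) - 1):
            if x[i] > x[i + 1]:
                x[i], x[i + 1] = x[i + 1], x[i]
        return x

    def memory(t):
        a, b = t[:-1], t[1:]
        return np.mean((a - a.mean()) * (b - b.mean())) / (a.std() * b.std())

    rng = np.random.default_rng(0)
    t = rng.pareto(2.5, 1000) + 1.0       # power-law samples with alpha = 3.5 and t_min = 1
    approx = t.copy()
    trajectory = []
    for _ in range(len(t)):               # n passes guarantee a fully sorted series
        approx = bubble_pass(approx)
        arranged = np.concatenate([approx[0::2], approx[1::2][::-1]])   # eq. (9) applied to the approximation
        trajectory.append(memory(arranged))
    print(trajectory[0], trajectory[-1])  # the memory climbs towards +1 as the sort completes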

4 The bounds of memory for rearranged independent power-law series

In the following analysis, without loss of generality, we assume the original series is independently sampled from the power-law distribution with t_min = 1, hence the probability density function

    f(t) = (\alpha - 1)\, t^{-\alpha},    (14)

where α > 1 is required for a well-defined distribution. A series with a different lower bound can be recovered by multiplying by t_min, and its memory

    M' = \frac{1}{(t_{min}\sigma)^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} (t_{min} t_i)(t_{min} t_{i+1}) - (t_{min} m)^2 \right) = M    (15)

is the same as that of the series with t_min = 1.

Therefore, the bounds obtained with t_min = 1 hold for any positive t_min.

M is always well-defined for empirical samples, as samples are of course finite. However, the population variance Var[t] exists only when α > 3, and the population mean E[t] exists only when α > 2. Therefore, we adopt different treatments for α > 3 and 1 < α ≤ 3. When α > 3, we equate the sample moments involved in the definition of memory with population moments and then derive the expected value E[M] in the limit n → ∞. When 1 < α ≤ 3, we instead use a method that approximates the random samples with deterministic values. Such an approximation greatly simplifies the analysis when population moments diverge, while still reproducing the behavior of the bounds.

4.1 Probabilistic method for the α > 3 case

When α > 3, we denote the population standard deviation and the population mean as σ(α) and m(α), respectively, when there is no ambiguity. By equating sample moments with the corresponding population moments, we have

    E[M] = \frac{1}{\sigma(\alpha)^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} E[t_i t_{i+1}] - m(\alpha)^2 \right),    (16)

where

    m(\alpha) = \frac{\alpha - 1}{\alpha - 2},    (17)

    \sigma(\alpha)^2 = \frac{\alpha - 1}{\alpha - 3} - m(\alpha)^2.    (18)
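For reference, a small sketch evaluating (17) and (18) and checking them against sample moments (sampling via NumPy's Pareto generator shifted so that t_min = 1; function names are ours):

    import numpy as np

    def pop_mean(alpha):
        # Population mean m(alpha) of the power law with t_min = 1, eq. (17).
        return (alpha - 1.0) / (alpha - 2.0)

    def pop_var(alpha):
        # Population variance sigma(alpha)^2, eq. (18), using E[t^2] = (alpha-1)/(alpha-3).
        return (alpha - 1.0) / (alpha - 3.0) - pop_mean(alpha) ** 2

    alpha = 3.5
    rng = np.random.default_rng(0)
    t = rng.pareto(alpha - 1.0, 1_000_000) + 1.0   # power-law samples with exponent alpha and t_min = 1
    print(pop_mean(alpha), t.mean())               # both close to 5/3
    print(pop_var(alpha), t.var())                 # sample variance converges slowly when alpha is near 3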

4.1.1 The upper bound of memory

The upper bound of memory M_max is defined as the expected value of the memory under the rearrangement θ_max in the limit n → ∞, namely

    M_{max} = \lim_{n\to\infty} E[M_{\theta_{max}}] = \frac{1}{\sigma(\alpha)^2} \left( \lim_{n\to\infty} \frac{1}{n-1} E[S_{\theta_{max}}] - m(\alpha)^2 \right),    (19)

where the expected value of S_{θ_max} is composed of the expected values of products of adjacent items according to equation (10), i.e.

    \frac{1}{n-1} E[S_{\theta_{max}}] = \frac{1}{2l-1} \sum_{i=1}^{2l-2} E[t_{(i)} t_{(i+2)}] + \frac{1}{2l-1} E[t_{(2l)} t_{(2l-1)}],    (20)

assuming n = 2l. The case n = 2l + 1 can be worked out in a similar fashion and arrives at the same results. The expected value of each term can be obtained by using the joint distribution of order statistics. The probability density function of the joint distribution of two order statistics t_(j), t_(k) (j < k) is given by [4]

    f_{t_{(j)}, t_{(k)}}(x, y) = \frac{n!}{(j-1)!\,(k-j-1)!\,(n-k)!}\, [F(x)]^{j-1}\, [F(y) - F(x)]^{k-j-1}\, [1 - F(y)]^{n-k}\, f(x) f(y) \quad (x \le y),    (21)

where F(x) is the cumulative distribution function of the power law, i.e.

    F(x) = 1 - x^{1-\alpha},    (22)

and f is given by (14). Therefore, we have

    E[t_{(i)} t_{(i+2)}] = \iint_{1 \le x \le y < \infty} x\,y\, f_{t_{(i)}, t_{(i+2)}}(x, y)\, dx\, dy
    = \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)} \, \frac{\Gamma(2l-i+1-2c)}{\Gamma(2l-i-1)\,[(2l-i)-c]\,[(2l-i)-\alpha c]} \quad (1 \le i \le 2l-2),

and

    E[t_{(2l-1)} t_{(2l)}] = \iint_{1 \le x \le y < \infty} x\,y\, f_{t_{(2l-1)}, t_{(2l)}}(x, y)\, dx\, dy
    = \frac{(\alpha-1)^2}{2(\alpha-2)^2}\, \Gamma(3-2c)\, \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)},    (23)

where we use the shorthand c = 1/(\alpha - 1) \in (0, 1/2).
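These closed forms can be checked by direct Monte Carlo simulation of the order statistics. The sketch below implements our reading of the first expression in (23) via log-Gamma functions (the chosen i, l and number of runs are illustrative):

    import numpy as np
    from scipy.special import gammaln

    def expected_adjacent_product(i, l, alpha):
        # Closed-form E[t_(i) t_(i+2)] for n = 2l samples (our reading of eq. (23)).
        c = 1.0 / (alpha - 1.0)
        n = 2 * l
        log_term = (gammaln(n + 1) - gammaln(n + 1 - 2 * c)
                    + gammaln(n - i + 1 - 2 * c) - gammaln(n - i - 1))
        return np.exp(log_term) / ((n - i - c) * (n - i - alpha * c))

    alpha, l, i = 3.5, 50, 10
    rng = np.random.default_rng(0)
    runs = np.sort(rng.pareto(alpha - 1.0, (100_000, 2 * l)) + 1.0, axis=1)
    monte_carlo = np.mean(runs[:, i - 1] * runs[:, i + 1])      # t_(i) t_(i+2), converted to 0-based indices
    print(expected_adjacent_product(i, l, alpha), monte_carlo)  # the two values should agree closely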

In the limit of n → ∞, the first term in (20) becomes

    \lim_{l\to\infty} \frac{1}{2l-1} \sum_{i=1}^{2l-2} E[t_{(i)} t_{(i+2)}]
    = \lim_{l\to\infty} \frac{1}{2l-1}\, \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)} \sum_{i=1}^{2l-2} \frac{\Gamma(2l-i+1-2c)}{\Gamma(2l-i-1)\,[(2l-i)-c]\,[(2l-i)-\alpha c]}.

Reducing the prefactor by a property of the Gamma function, namely \lim_{x\to\infty} \Gamma(x) / [\Gamma(x+\gamma)\, x^{-\gamma}] = 1 (\gamma \in \mathbb{R}), and rewriting the summation with the index k = 2l - i, we have

    \lim_{l\to\infty} \frac{1}{2l-1} \sum_{i=1}^{2l-2} E[t_{(i)} t_{(i+2)}]
    = \lim_{l\to\infty} (2l+1)^{-(1-2c)} \sum_{k=2}^{2l-1} \frac{\Gamma(k+1-2c)}{(k-c)(k-\alpha c)\, \Gamma(k-1)}.

As c < 1/2, the prefactor approaches zero when n → ∞, making the result depend only on the limiting behavior of the summed terms. Now we have

    \lim_{l\to\infty} \frac{1}{2l-1} \sum_{i=1}^{2l-2} E[t_{(i)} t_{(i+2)}]
    = \lim_{l\to\infty} \sum_{k=2}^{2l-1} \frac{1}{2l+1} \left( \frac{k}{2l+1} \right)^{-2c}
    = \int_0^1 t^{-2c}\, dt = \frac{\alpha-1}{\alpha-3}.    (24)

Meanwhile, the second term on the RHS of (20) vanishes in the limit of large n, as

    \lim_{l\to\infty} \frac{1}{2l-1} E[t_{(2l)} t_{(2l-1)}] = 0.    (25)

Substituting (24) and (25) into (16), we arrive at

    M_{max} = \lim_{n\to\infty} E[M_{\theta_{max}}] = 1 \quad (\alpha > 3),    (26)

proving that the maximum memory of rearranged power-law series is +1, the same as for uniform and Gaussian distributed series.

4.1.2 The lower bound of memory

The lower bound of memory is formulated similarly as

    M_{min} = \lim_{n\to\infty} E[M_{\theta_{min}}] = \frac{1}{\sigma(\alpha)^2} \left( \lim_{n\to\infty} \frac{1}{n-1} E[S_{\theta_{min}}] - m(\alpha)^2 \right),    (27)

where

    \frac{1}{n-1} E[S_{\theta_{min}}] = \frac{1}{2l-1} \sum_{i=1}^{l-1} \left( E[t_{(i)} t_{(2l+1-i)}] + E[t_{(i)} t_{(2l-1-i)}] \right) + \frac{1}{2l-1} E[t_{(l)} t_{(l+1)}],    (28)

again assuming n = 2l for convenience. The expected values of these products can also be obtained from the distribution of order statistics:

    E[t_{(i)} t_{(2l+1-i)}] = \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)}\, \frac{\Gamma(i-c)}{\Gamma(i)}\, \frac{\Gamma(2l-i+1-2c)}{\Gamma(2l-i+1-c)},    (29)

    E[t_{(i)} t_{(2l-1-i)}] = \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)}\, \frac{\Gamma(i+2-c)}{\Gamma(i+2)}\, \frac{\Gamma(2l-i+1-2c)}{\Gamma(2l-i+1-c)}.    (30)

Taking the large-n limit, we have

    \lim_{l\to\infty} \frac{1}{2l-1} \sum_{i=1}^{l-1} E[t_{(i)} t_{(2l-1-i)}]
    = \lim_{l\to\infty} \frac{1}{2l-1}\, \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)} \sum_{i=1}^{l-1} \frac{\Gamma(i+2-c)\, \Gamma(2l-i+1-2c)}{\Gamma(i+2)\, \Gamma(2l-i+1-c)}
    = \lim_{l\to\infty} (2l+1)^{-(1-2c)} \sum_{i=1}^{l-1} (i+2)^{-c}\, (2l-i+1)^{-c}
    = \lim_{l\to\infty} \sum_{i=1}^{l-1} \left( \frac{i+2}{2l+1} \right)^{-c} \left( 1 - \frac{i}{2l+1} \right)^{-c} \frac{1}{2l+1}
    = \int_0^{1/2} u^{-c} (1-u)^{-c}\, du
    = B\!\left( \frac{1}{2};\, \frac{\alpha-2}{\alpha-1},\, \frac{\alpha-2}{\alpha-1} \right),    (31)

where B is the incomplete beta function, defined as

    B(x; a, b) = \int_0^x t^{a-1} (1-t)^{b-1}\, dt.    (32)

The other term has the same limit, namely

    \lim_{l\to\infty} \frac{1}{2l-1} \sum_{i=1}^{l-1} E[t_{(i)} t_{(2l+1-i)}] = B\!\left( \frac{1}{2};\, \frac{\alpha-2}{\alpha-1},\, \frac{\alpha-2}{\alpha-1} \right),    (33)

and again the remaining term vanishes,

    \lim_{l\to\infty} \frac{1}{2l-1} E[t_{(l)} t_{(l+1)}] = 0.    (34)

Therefore, the minimum memory is derived as

    M_{min} = \lim_{n\to\infty} E[M_{\theta_{min}}] = \frac{1}{\sigma(\alpha)^2} \left[ 2 B\!\left( \frac{1}{2};\, \frac{1}{m(\alpha)},\, \frac{1}{m(\alpha)} \right) - m(\alpha)^2 \right] \quad (\alpha > 3).    (35)

4.1.3 Experimental results

Figure 3 reports the comparison between theoretical and experimental results for the upper and lower bounds of memory of rearranged, independently sampled power-law series in the case α > 3. The solid lines are drawn according to (26) and (35), while the circles are produced by rearranging the independently sampled series with θ_max and θ_min. The effect of the finite system size n is observed when α is near 3, resulting in minor differences between theoretical and experimental results. When α > 3, while M_max is a constant, M_min is a decreasing function of α with the limit M_min → -0.64 (α → ∞).
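The bound (35) is easy to evaluate numerically. The sketch below uses SciPy's regularized incomplete beta function, multiplied by the complete beta function to recover B(x; a, b), and shows the decrease towards roughly -0.64 as α grows (the function name is ours):

    import numpy as np
    from scipy.special import beta, betainc

    def memory_min(alpha):
        # Theoretical lower bound of eq. (35) for alpha > 3.
        a = (alpha - 2.0) / (alpha - 1.0)            # equals 1 / m(alpha)
        m = (alpha - 1.0) / (alpha - 2.0)
        var = (alpha - 1.0) / (alpha - 3.0) - m ** 2
        B_half = betainc(a, a, 0.5) * beta(a, a)     # incomplete beta B(1/2; a, a)
        return (2.0 * B_half - m ** 2) / var

    for alpha in (3.5, 5.0, 10.0, 100.0):
        print(alpha, memory_min(alpha))              # decreases from about -0.16 towards about -0.64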

Figure 3: Comparison between theoretical and experimental bounds for memory in the case α > 3. The series are of length n = 5,000, and experimental results are obtained by averaging over independent runs.

4.2 Approximation method for the α ≤ 3 case

When α ≤ 3, the population variance diverges, and even the population mean diverges when α goes below 2, making the method of equating sample moments with population moments invalid in such circumstances. Moreover, a direct calculation of the expected value of the maximum or minimum memory involves the distribution of M itself, which is intractable. Therefore, we adopt an approximation method for the α ≤ 3 case, which substitutes the random samples with deterministic values in the limit of large n.

4.2.1 Approximating random samples with equiprobable slices

As illustrated by Figure 4, the points t̂_1, t̂_2, ..., t̂_n cut the area under the probability density function f(t) into slices of equal area 1/n, with t̂_1 = t_min = 1 and the area from t̂_n extending to infinity also being 1/n. We then approximate the random samples {t_1, t_2, ..., t_n} with these deterministic points {t̂_1, t̂_2, ..., t̂_n}. It should be noted that such an approximation imposes a cut-off on the maximum value of t_i, and the probability of drawing a sample exceeding the cut-off is 1/n, which diminishes to zero as n → ∞.

By setting F(t̂_i) = (i-1)/n, we have

    \hat{t}_i = \left( 1 - \frac{i-1}{n} \right)^{-c} \quad (i = 1, 2, \ldots, n),    (36)

where c = 1/(\alpha - 1) \ge 1/2 in this case. Obviously the order statistics satisfy \hat{t}_{(i)} = \hat{t}_i.

Figure 4: Cutting the area under f(t) (in logarithmic scale) into slices of equal area 1/n, thus equal probability. The area under the curve from the last point to infinity is also 1/n.

Recall the expression for the memory,

    M = \frac{1}{\sigma^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} t_i t_{i+1} - m^2 \right)
    = \frac{ \frac{1}{n-1} \sum_{i=1}^{n-1} t_i t_{i+1} - \bar{m}^2 }{ \frac{1}{n} \sum_{i=1}^{n} t_i^2 - \bar{m}^2 },    (37)

where three sample quantities are involved, denoted as

    \bar{s} = \frac{S}{n-1} = \frac{1}{n-1} \sum_{i=1}^{n-1} t_i t_{i+1},    (38)

    \overline{t^2} = \frac{1}{n} \sum_{i=1}^{n} t_i^2,    (39)

and

    \bar{m}^2 = \left( \frac{1}{n} \sum_{i=1}^{n} t_i \right)^2.    (40)
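The slice points of (36) and the three sample quantities (38)-(40) are straightforward to generate; below is a sketch under the same conventions, c = 1/(α - 1) and t_min = 1 (all names are ours):

    import numpy as np

    def slice_points(n, alpha):
        # Equiprobable-slice approximation of eq. (36): t_hat_i = (1 - (i-1)/n)^(-c),
        # so that t_hat_1 = 1 and the tail beyond the last point carries probability 1/n.
        c = 1.0 / (alpha - 1.0)
        i = np.arange(1, n + 1)
        return (1.0 - (i - 1) / n) ** (-c)

    def s_bar(t):        # eq. (38)
        return np.mean(t[:-1] * t[1:])

    def t2_bar(t):       # eq. (39)
        return np.mean(t ** 2)

    def m2_bar(t):       # eq. (40)
        return np.mean(t) ** 2

    t_hat = slice_points(10_000, 2.5)    # alpha = 2.5: the regime of diverging variance
    print(t_hat[0], t_hat[-1])           # 1.0 and the implied cut-off value n^c
    print(s_bar(t_hat), t2_bar(t_hat), m2_bar(t_hat))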

When α ≤ 3, \bar{s} and \overline{t^2} diverge as n → ∞, and \bar{m}^2 no longer converges if α ≤ 2. We can then use the approximated samples {t̂_1, t̂_2, ..., t̂_n} to analyze the asymptotic behavior of these quantities as n → ∞.

4.2.2 The upper bound of memory

Substituting t_(i) with t̂_i, we have

    \hat{s}_{\theta_{max}} = \frac{1}{n-1} \left( \sum_{i=1}^{2l-2} \hat{t}_i \hat{t}_{i+2} + \hat{t}_{2l} \hat{t}_{2l-1} \right) \quad (n = 2l)
    = \frac{1}{n-1}\, n^{2c} \left[ \sum_{k=2}^{n-1} (k^2 - 1)^{-c} + 2^{-c} \right] \quad (\alpha < 3),    (41)

    \hat{\overline{t^2}}_{\theta_{max}} = \frac{1}{n} \sum_{i=1}^{n} \hat{t}_i^2 = \frac{1}{n}\, n^{2c} \sum_{k=0}^{n-1} (k+1)^{-2c} \quad (\alpha < 3).    (42)

As n → ∞, both of them are of the same order n^{2c-1}, while it is found that \bar{m}^2 either converges when 2 < α ≤ 3 or diverges with the lower order n^{2c-2} when 1 < α ≤ 2. Hence, in the limit of large n, M_{θ_max} can be approximated by neglecting \bar{m}^2, i.e.

    \hat{M}_{\theta_{max}} \approx \frac{ \hat{s}_{\theta_{max}} }{ \hat{\overline{t^2}}_{\theta_{max}} }
    = \frac{ \sum_{k=2}^{n-1} (k^2 - 1)^{-c} + 2^{-c} }{ \sum_{k=0}^{n-1} (k+1)^{-2c} },    (43)

where both the numerator and the denominator converge as n → ∞. Thus we have

    M_{max} = \lim_{n\to\infty} E[M_{\theta_{max}}] \approx \lim_{n\to\infty} \hat{M}_{\theta_{max}}
    = \frac{ \sum_{k=2}^{\infty} (k^2 - 1)^{-c} + 2^{-c} }{ \sum_{k=0}^{\infty} (k+1)^{-2c} } \quad (1 < \alpha < 3).    (44)

4.2.3 The lower bound of memory

By analyzing the memory produced by θ_min with the approximated samples, it is found that \hat{\overline{t^2}}_{\theta_{min}} is the term diverging with the largest order of n, while the orders of \bar{m}^2 and \hat{s}_{\theta_{min}} are smaller if they diverge at all. As the term of the largest order appears only in the denominator, we have

    M_{min} = \lim_{n\to\infty} E[M_{\theta_{min}}] \approx \lim_{n\to\infty} \hat{M}_{\theta_{min}} = 0.    (45)
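Equation (44) can be evaluated by truncating the two convergent series. A sketch (the truncation length is an arbitrary choice of ours, and convergence is slow as α approaches 3):

    import numpy as np

    def memory_max_approx(alpha, n_terms=1_000_000):
        # Slice-approximation upper bound of eq. (44) for 1 < alpha < 3,
        # truncating both sums at n_terms.
        c = 1.0 / (alpha - 1.0)
        k = np.arange(2.0, n_terms)
        numerator = np.sum((k ** 2 - 1.0) ** (-c)) + 2.0 ** (-c)
        denominator = np.sum(np.arange(1.0, n_terms + 1) ** (-2.0 * c))
        return numerator / denominator

    for alpha in (1.5, 2.0, 2.5):
        print(alpha, memory_max_approx(alpha))   # rises towards +1 as alpha approaches 3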

Figure 5: Comparison between experimental results and the approximate theoretical results for 1 < α ≤ 3. Each experimental data point is obtained by averaging sampled series of length n = 5,000 over independent runs.

4.2.4 Experimental results

Figure 5 compares the upper and lower bounds of memory obtained by the equiprobable-slice approximation with experimental results, where data points are averaged over independent runs of drawing power-law samples of length n = 5,000. Although the approximation method cannot precisely recover the behavior of the maximum memory, it still captures the trend that the maximum memory goes up from 0 to +1 as α increases from 1 to 3. Meanwhile, the predicted minimum memory remains zero when α ≤ 3, and larger gaps between experimental and theoretical values are noticed when α is near 3.

5 Conclusion and discussion

Combining the results obtained by the probabilistic method for α > 3 and the approximation method for α ≤ 3, in Figure 6 we plot theoretical values of M_max and M_min for 1 < α < 7, where the slice approximation is used for 1 < α ≤ 3 and the probabilistic method is used for α > 3. Clearly, in the sense of expected values and in the limit of large n, the memory of rearranged power-law series is confined to the positive region for α ≤ 3, with the upper bound going up from 0 to +1 as α increases.

Figure 6: Theoretical values for the maximum and minimum memory of rearranged independent power-law series in the range 1 < α < 7. The slice approximation method is used for 1 < α ≤ 3, while the probabilistic method is used for 3 < α < 7.

When α > 3, however, the upper bound remains +1, while the lower bound slides down into the negative region as a decreasing function of α, with the limit M_min → -0.64 (α → ∞). Therefore, the minimum memory of rearranged power-law series is constrained to stay above -0.64. Admittedly, the rearrangement of independently sampled power-law series cannot cover all possible correlated series with a power-law marginal, and memory (first-order autocorrelation) is not the only measure for describing temporal correlation. Still, our results may point to a hidden mechanism of the power law, i.e., the power-law distribution may naturally constrain the possible interdependence among elements, especially preventing the correlation from being too negative. In addition, our approach, both probabilistic and approximate, may be generalized to the analysis of a wider spectrum of distributions common in empirical data, such as the log-normal and power laws with exponential tails.

References

[1] A. Clauset, C. Shalizi, and M. Newman. Power-law distributions in empirical data. SIAM Review, 51(4):661-703, 2009.

[2] K. I. Goh and A. L. Barabási. Burstiness and memory in complex systems. EPL (Europhysics Letters), February 2008.

[3] Marc Hallin, Guy Mélard, and Xavier Milhaud. Permutational extreme values of autocorrelation coefficients and a Pitman test against serial dependence. ULB Institutional Repository 23/237, ULB Université Libre de Bruxelles, 1992.

[4] H. A. David and H. N. Nagaraja. Order Statistics. Wiley, New Jersey, 3rd edition, 2003.
