The relation between memory and power-law exponent
Fangjian Guo and Tao Zhou

December 2, 2012

Abstract

The inter-event times of many real systems are empirically found to be power-law distributed, and memory (the first-order autocorrelation), as a quantity ranging from $-1$ to $+1$, has been proposed to characterize the short-range correlation of time series. While series from real systems are usually found to be positively or negatively correlated, the memories of empirical power-law series are predominantly positive. In this paper, we study the mechanism behind such memory bias. We analyze the bounds of memory for a family of random series with a specified marginal, generated by rearranging the order of i.i.d. samples drawn from the marginal while preserving their values. While such rearrangement can produce uniformly or Gaussian distributed series with arbitrary memory from $-1$ to $+1$, we find that the power-law distribution imposes a special constraint on the possible memory of such series, which depends on the power-law exponent. Probabilistic and approximation methods are developed to analyze the bounds when the corresponding population moments converge and diverge, respectively, accompanied by validation from experimental results.

1 Introduction

Power-law distributions are used to model the inter-event time series of many real systems [1], including communication, web browsing and heart beats. Whereas the power law captures the marginal distribution of empirical data, memory has been introduced by Goh and Barabási as a quantity to characterize their interdependence, and has even been claimed to be a measure orthogonal to the marginal distribution [2]. Given a positive-valued series $\{t_i\}$ of length $n$, memory is defined as its first-order autocorrelation, i.e.

$$M = \frac{1}{n-1} \sum_{i=1}^{n-1} \frac{(t_i - m_1)(t_{i+1} - m_2)}{\sigma_1 \sigma_2},$$  (1)
where $m_1, \sigma_1$ and $m_2, \sigma_2$ refer to the means and standard deviations of the series $\{t_1, t_2, \ldots, t_{n-1}\}$ and $\{t_2, t_3, \ldots, t_n\}$ respectively. $M$ lies in the range $[-1, +1]$ due to the Cauchy–Schwarz inequality, by which we have

$$M \le \frac{1}{\sigma_1 \sigma_2} \left[ \frac{1}{n-1} \sum_{i=1}^{n-1} (t_i - m_1)^2 \right]^{1/2} \left[ \frac{1}{n-1} \sum_{i=1}^{n-1} (t_{i+1} - m_2)^2 \right]^{1/2} = 1.$$  (2)

It should be noted that, when $n \to \infty$, the two series above differ in only one element. Therefore we have

$$m_1 = m_2 = m, \qquad \sigma_1 = \sigma_2 = \sigma \qquad (n \to \infty),$$  (3)

where $m$ and $\sigma$ are simply the mean and standard deviation of the whole series. The memory can thus be rewritten as

$$M = \frac{1}{\sigma^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} t_i t_{i+1} - m^2 \right) \qquad (n \to \infty),$$  (4)

where rearranging the order of the series affects only the sum of products $\sum_{i=1}^{n-1} t_i t_{i+1}$, leaving $m$ and $\sigma$ unchanged. When the series $\{t_i\}$ is independently sampled from a distribution, its memory approaches zero as $n \to \infty$, since

$$E[M] = \frac{1}{\sigma^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} E[t_i t_{i+1}] - m^2 \right) = 0,$$  (5)

with $E[t_i t_{i+1}] = m^2$. However, by rearranging the order of the elements in the series while preserving their values, the memory can be adjusted without changing the marginal distribution. In fact, in the limit of large $n$, such permutations affect only the term $\frac{1}{n-1}\sum_i t_i t_{i+1}$.

2 Rearrangement with maximum or minimum memory

In the following sections, we assume $n \to \infty$ and therefore adopt the simpler form of memory in equation (4). A rearrangement $\theta$ is a one-to-one mapping from the set $\{1, 2, \ldots, n\}$ to itself, and the $i$-th element of the rearranged sequence is denoted $t_{\theta_i}$. Denoting the rearrangement with maximum memory and the one with minimum memory as $\theta_{\max}$ and $\theta_{\min}$ respectively, and denoting the sum of the products of adjacent elements as

$$S_\theta = \sum_{i=1}^{n-1} t_{\theta_i} t_{\theta_{i+1}},$$  (6)
we have

$$\theta_{\max} = \operatorname*{argmax}_\theta S_\theta, \qquad \theta_{\min} = \operatorname*{argmin}_\theta S_\theta.$$  (7)

We also define the shorthand $s_\theta = \frac{1}{n-1} S_\theta$. It has been shown that there exist fixed $\theta_{\max}$ and $\theta_{\min}$ that hold for all possible positive series $\{t_1, t_2, \ldots, t_n\}$ [3]. (Whereas [3] uses an objective function that sums the products in a circle, i.e. $S'_\theta = \sum_{i=1}^{n-1} t_{\theta_i} t_{\theta_{i+1}} + t_{\theta_1} t_{\theta_n}$, the results reduce to our case by appending an additional zero element to the series, which makes zero contribution to the sum.) To illustrate how they are arranged, we first introduce the order statistics. The order statistics $\{t_{(i)}\}$ are formed by arranging the series in increasing order, i.e.

$$t_{(1)} \le t_{(2)} \le \cdots \le t_{(n-1)} \le t_{(n)},$$  (8)

with $t_{(i)}$ being the $i$-th smallest element of the series. $\theta_{\max}$ attains the maximum memory by first arranging the odd-indexed order statistics in increasing order, followed by the even-indexed ones in decreasing order, i.e.

$$t_{(1)}, t_{(3)}, \ldots, t_{(2l-1)}, t_{(2l)}, t_{(2l-2)}, \ldots, t_{(4)}, t_{(2)} \quad (n = 2l),$$
$$t_{(1)}, t_{(3)}, \ldots, t_{(2l-1)}, t_{(2l+1)}, t_{(2l)}, \ldots, t_{(4)}, t_{(2)} \quad (n = 2l+1).$$  (9)

For simplicity, we only address the case $n = 2l$, for which the sum in terms of order statistics is

$$S_{\theta_{\max}} = \sum_{i=1}^{2l-2} t_{(i)} t_{(i+2)} + t_{(2l)} t_{(2l-1)} \quad (n = 2l),$$  (10)

while the case of odd $n$ can be handled similarly. On the contrary, quite interestingly, $\theta_{\min}$ arranges the order statistics by interlacing the even and odd terms when $n = 2l$, or small and big terms when $n = 2l+1$, i.e.

$$t_{(2l)}, t_{(1)}, t_{(2l-2)}, \ldots, t_{(2l-3)}, t_{(2)}, t_{(2l-1)} \quad (n = 2l),$$
$$t_{(2l)}, t_{(2)}, \ldots, t_{(l)}, \ldots, t_{(1)}, t_{(2l+1)} \quad (n = 2l+1).$$  (11)

Again, for even $n$, we have

$$S_{\theta_{\min}} = \sum_{i=1}^{l-1} \left( t_{(i)} t_{(2l+1-i)} + t_{(i)} t_{(2l-i)} \right) + t_{(l)} t_{(l+1)} \quad (n = 2l).$$  (12)
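The quantities above are easy to experiment with numerically. Below is a minimal Python sketch (function names are ours; `rearrange_min` follows the interlacing idea of (11), alternating the largest and smallest remaining order statistics so that adjacent products pair $t_{(i)}$ with $t_{(2l+1-i)}$ and $t_{(2l-i)}$ as in (12)):

```python
import random

def memory(t):
    """Large-n memory of Eq. (4): M = (<t_i t_{i+1}> - m^2) / sigma^2."""
    n = len(t)
    m = sum(t) / n
    var = sum((x - m) ** 2 for x in t) / n
    s = sum(t[i] * t[i + 1] for i in range(n - 1)) / (n - 1)
    return (s - m * m) / var

def rearrange_max(t):
    """Pattern (9): odd-indexed order statistics ascending,
    then even-indexed ones descending."""
    s = sorted(t)
    return s[0::2] + s[1::2][::-1]

def rearrange_min(t):
    """Interlacing in the spirit of (11): alternate the largest and
    smallest remaining order statistics, producing the products of (12)."""
    s = sorted(t)
    lo, hi, out, take_hi = 0, len(s) - 1, [], True
    while lo <= hi:
        if take_hi:
            out.append(s[hi])
            hi -= 1
        else:
            out.append(s[lo])
            lo += 1
        take_hi = not take_hi
    return out

# i.i.d. power-law samples: random.paretovariate(a) has density a * t^(-a-1)
# on t >= 1, i.e. the power law of the text with alpha = a + 1 (here 3.5).
random.seed(0)
t = [random.paretovariate(2.5) for _ in range(100_000)]
print(memory(t))                 # near 0 for the i.i.d. order
print(memory(rearrange_max(t)))  # driven towards +1
print(memory(rearrange_min(t)))  # negative, but bounded away from -1
```

For a deterministic ramp $t_i = i$ of length $n$, the formula gives exactly $M = (n-3)/(n-1)$, which approaches $+1$ as $n$ grows.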
3 Adjusting memory by iterative rearrangement

We first explore the bounds for uniform, Gaussian and power-law samples by iteratively rearranging them towards $\theta_{\max}$ or $\theta_{\min}$. To do this, we construct an approximation of the order statistics iteratively. The process contains $n$ steps, the same as the length of the series. In each step, the series is rearranged with one pass of bubble sort over the series produced by the previous step: stepping through the series from the first element to the last, comparing each pair of adjacent elements and swapping them if they are not in increasing order. The series obtained after $i$ such steps is denoted $\{\tilde t^{(i)}\}$, an approximation to the order statistics, with $\{\tilde t^{(n)}\}$ guaranteed to equal the order statistics, because bubble sort always finishes within $n$ passes. In each step, treating $\{\tilde t^{(i)}\}$ as approximate order statistics, series with positive and negative memory are obtained by rearranging $\{\tilde t^{(i)}\}$ according to (9) and (11) respectively.

In our experiment, series of length $n = 1{,}000{,}000$ are independently drawn from the uniform distribution on $[0, 1]$, the standard Gaussian distribution (with the same positive constant added to all samples afterwards to ensure positivity), and the power-law distribution whose probability density function is

$$f(t) = \frac{\alpha - 1}{t_{\min}} \left( \frac{t}{t_{\min}} \right)^{-\alpha},$$  (13)

with $t_{\min} = 1$ and $\alpha = 3.5$. Results are obtained by averaging 5 independent runs and are reported in Figure 1 for positive memory and in Figure 2 for negative memory. As can be seen, by rearranging the series towards $\theta_{\max}$, the memory goes to $+1$ for all three distributions. (The maximum memory of the power-law series after the last iteration is slightly below $+1$, due to the effect of finite $n$, as we will see later.) However, while the memory can be tuned down to $-1$ by rearrangement towards $\theta_{\min}$ for uniformly and Gaussian distributed series, it cannot go below about $-0.2$ for the power-law series.
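The iterative procedure above can be sketched as follows (self-contained, with a small $n$ and a uniform marginal for illustration; the `memory` function implements the large-$n$ form (4), and names are ours):

```python
import random

def memory(t):
    """Large-n memory of Eq. (4)."""
    n = len(t)
    m = sum(t) / n
    var = sum((x - m) ** 2 for x in t) / n
    s = sum(t[i] * t[i + 1] for i in range(n - 1)) / (n - 1)
    return (s - m * m) / var

def bubble_pass(t):
    """One pass of bubble sort: swap adjacent out-of-order pairs."""
    t = list(t)
    for i in range(len(t) - 1):
        if t[i] > t[i + 1]:
            t[i], t[i + 1] = t[i + 1], t[i]
    return t

def theta_max(t):
    """Pattern (9) applied to (approximate) order statistics."""
    return t[0::2] + t[1::2][::-1]

random.seed(1)
series = [random.random() for _ in range(1000)]   # uniform on [0, 1]
approx = list(series)
trace = []
for _ in range(len(series)):   # n passes: fully sorted at the end
    approx = bubble_pass(approx)
    trace.append(memory(theta_max(approx)))
print(trace[0], trace[-1])     # the positive memory grows towards +1
```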
In the following sections, we further analyze how this non-trivial lower bound depends on the exponent $\alpha$.

4 The bounds of memory for rearranged independent power-law series

In the following analysis, without loss of generality, we assume the original series is independently sampled from the power-law distribution with $t_{\min} = 1$, hence the probability density function

$$f(t) = (\alpha - 1) t^{-\alpha},$$  (14)
Figure 1: Positive memory by iteratively rearranging independently sampled series.

Figure 2: Negative memory by iteratively rearranging independently sampled series.

where $\alpha > 1$ is required to form a well-defined distribution. A series with a different lower bound $t_{\min}$ can be recovered by multiplying by $t_{\min}$, and its
memory

$$M' = \frac{1}{(t_{\min}\sigma)^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} (t_{\min} t_i)(t_{\min} t_{i+1}) - (t_{\min} m)^2 \right) = M$$  (15)

is the same as that of the series with $t_{\min} = 1$. Therefore, the bounds obtained with $t_{\min} = 1$ hold for any positive $t_{\min}$.

$M$ is always well-defined for empirical samples, as samples are of course finite. However, the population variance $\mathrm{Var}[t]$ exists only when $\alpha > 3$, and the population mean $E[t]$ exists only when $\alpha > 2$. Therefore, we adopt different treatments for $\alpha > 3$ and $1 < \alpha \le 3$. When $\alpha > 3$, we equate the sample moments involved in the definition of memory with population moments and then derive the expected value $E[M]$ in the limit $n \to \infty$. When $1 < \alpha \le 3$, we instead use a method that approximates the random samples with deterministic values. Such approximation greatly simplifies the analysis when population moments diverge, while still reproducing the behavior of the bounds.

4.1 Probabilistic method for the α > 3 case

When $\alpha > 3$, we denote the population standard deviation and the population mean by $\sigma(\alpha)$ and $m(\alpha)$ respectively when there is no ambiguity. By equating sample moments with the corresponding population moments, we have

$$E[M] = \frac{1}{\sigma(\alpha)^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} E[t_i t_{i+1}] - m(\alpha)^2 \right),$$  (16)

where

$$m(\alpha) = \frac{\alpha - 1}{\alpha - 2},$$  (17)

$$\sigma(\alpha)^2 = \frac{\alpha - 1}{\alpha - 3} - m(\alpha)^2.$$  (18)

4.1.1 The upper bound of memory

The upper bound of memory $M_{\max}$ is defined as the expected value of the memory under the rearrangement $\theta_{\max}$ in the limit $n \to \infty$, namely

$$M_{\max} = \lim_{n \to \infty} E[M_{\theta_{\max}}] = \lim_{n \to \infty} \frac{1}{\sigma(\alpha)^2} \left( \frac{1}{n-1} E[S_{\theta_{\max}}] - m(\alpha)^2 \right),$$  (19)

where the expected value of $S_{\theta_{\max}}$ is composed of the expected values of products of adjacent items according to equation (10), i.e.

$$\frac{1}{n-1} E[S_{\theta_{\max}}] = \frac{1}{2l-1} \sum_{i=1}^{2l-2} E[t_{(i)} t_{(i+2)}] + \frac{1}{2l-1} E[t_{(2l)} t_{(2l-1)}],$$  (20)
assuming $n = 2l$. The case $n = 2l+1$ can be worked out in a similar fashion and leads to the same results. The expected value of each term can be obtained from the joint distribution of order statistics. The joint probability density function of two order statistics $t_{(j)}, t_{(k)}$ ($j < k$) is given by [4]

$$f_{t_{(j)}, t_{(k)}}(x, y) = \frac{n!}{(j-1)!\,(k-j-1)!\,(n-k)!}\, [F(x)]^{j-1}\, [F(y) - F(x)]^{k-j-1}\, [1 - F(y)]^{n-k}\, f(x) f(y) \quad (x \le y),$$  (21)

where $F(x)$ is the cumulative distribution function of the power law, i.e.

$$F(x) = 1 - x^{1-\alpha},$$  (22)

and $f$ is given by (14). Therefore, we have

$$E[t_{(i)} t_{(i+2)}] = \iint_{1 \le x \le y < \infty} x y\, f_{t_{(i)}, t_{(i+2)}}(x, y)\, dx\, dy = \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)}\, \frac{\Gamma(2l-i+1-2c)}{\Gamma(2l-i-1)}\, \frac{1}{[(2l-i)-c][(2l-i)-\alpha c]} \quad (1 \le i \le 2l-2),$$

and

$$E[t_{(2l-1)} t_{(2l)}] = \iint_{1 \le x \le y < \infty} x y\, f_{t_{(2l-1)}, t_{(2l)}}(x, y)\, dx\, dy = \frac{(\alpha-1)^2}{2(\alpha-2)^2}\, \Gamma(3-2c)\, \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)},$$  (23)

where we use the shorthand $c = \frac{1}{\alpha-1} \in (0, \tfrac12)$. In the limit $n \to \infty$, the first term of (20) is

$$\lim_{l \to \infty} \frac{1}{2l-1} \sum_{i=1}^{2l-2} E[t_{(i)} t_{(i+2)}] = \lim_{l \to \infty} \frac{1}{2l-1}\, \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)} \sum_{i=1}^{2l-2} \frac{\Gamma(2l-i+1-2c)}{[(2l-i)-c][(2l-i)-\alpha c]\, \Gamma(2l-i-1)}.$$

Reducing the prefactor by a property of the Gamma function, namely

$$\lim_{x \to \infty} \frac{\Gamma(x+\gamma)}{\Gamma(x)\, x^\gamma} = 1 \quad (\gamma \in \mathbb{R}),$$

and rewriting the summation with the index $k = 2l - i$, we have

$$\lim_{l \to \infty} \frac{1}{2l-1} \sum_{i=1}^{2l-2} E[t_{(i)} t_{(i+2)}] = \lim_{l \to \infty} (2l+1)^{-(1-2c)} \sum_{k=2}^{2l-1} \frac{\Gamma(k+1-2c)}{(k-c)(k-1-c)\, \Gamma(k-1)}.$$
As $c < \tfrac12$, the prefactor approaches zero as $n \to \infty$, making the result depend only on the limiting behavior of the summed terms, each of which behaves as $k^{-2c}$ for large $k$. We now have

$$\lim_{l \to \infty} \frac{1}{2l-1} \sum_{i=1}^{2l-2} E[t_{(i)} t_{(i+2)}] = \lim_{l \to \infty} \sum_{k=2}^{2l-1} \left( \frac{k}{2l+1} \right)^{-2c} \frac{1}{2l+1} = \int_0^1 u^{-2c}\, du = \frac{\alpha-1}{\alpha-3}.$$  (24)

Meanwhile, the second term on the RHS of (20) vanishes in the limit of large $n$, as

$$\lim_{l \to \infty} \frac{1}{2l-1} E[t_{(2l)} t_{(2l-1)}] = 0.$$  (25)

Substituting (24) and (25) into (19), and noting that $\frac{\alpha-1}{\alpha-3} = \sigma(\alpha)^2 + m(\alpha)^2$, we arrive at

$$M_{\max} = \lim_{n \to \infty} E[M_{\theta_{\max}}] = 1 \quad (\alpha > 3),$$  (26)

proving that the maximum memory for rearranged power-law series is $+1$, the same as for uniformly and Gaussian distributed series.

4.1.2 The lower bound of memory

The lower bound of memory is formulated similarly as

$$M_{\min} = \lim_{n \to \infty} E[M_{\theta_{\min}}] = \lim_{n \to \infty} \frac{1}{\sigma(\alpha)^2} \left( \frac{1}{n-1} E[S_{\theta_{\min}}] - m(\alpha)^2 \right),$$  (27)

where

$$\frac{1}{n-1} E[S_{\theta_{\min}}] = \frac{1}{2l-1} \sum_{i=1}^{l-1} \left( E[t_{(i)} t_{(2l+1-i)}] + E[t_{(i)} t_{(2l-i)}] \right) + \frac{1}{2l-1} E[t_{(l)} t_{(l+1)}],$$  (28)

again assuming $n = 2l$ for convenience. The expected values of these products can also be obtained from the distribution of order statistics:

$$E[t_{(i)} t_{(2l+1-i)}] = \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)}\, \frac{\Gamma(i-c)}{\Gamma(i)}\, \frac{\Gamma(2l-i+1-2c)}{\Gamma(2l-i+1-c)},$$  (29)

$$E[t_{(i)} t_{(2l-i)}] = \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)}\, \frac{\Gamma(i+1-c)}{\Gamma(i+1)}\, \frac{\Gamma(2l-i+1-2c)}{\Gamma(2l-i+1-c)}.$$  (30)
Taking the large-$n$ limit, we have

$$\lim_{l \to \infty} \frac{1}{2l-1} \sum_{i=1}^{l-1} E[t_{(i)} t_{(2l+1-i)}] = \lim_{l \to \infty} \frac{1}{2l-1}\, \frac{\Gamma(2l+1)}{\Gamma(2l+1-2c)} \sum_{i=1}^{l-1} \frac{\Gamma(i-c)}{\Gamma(i)}\, \frac{\Gamma(2l-i+1-2c)}{\Gamma(2l-i+1-c)}$$
$$= \lim_{l \to \infty} (2l+1)^{-(1-2c)} \sum_{i=1}^{l-1} i^{-c}\, (2l-i+1)^{-c} = \lim_{l \to \infty} \sum_{i=1}^{l-1} \left( \frac{i}{2l+1} \right)^{-c} \left( 1 - \frac{i}{2l+1} \right)^{-c} \frac{1}{2l+1}$$
$$= \int_0^{1/2} u^{-c} (1-u)^{-c}\, du = B\!\left( \tfrac12;\, \tfrac{\alpha-2}{\alpha-1},\, \tfrac{\alpha-2}{\alpha-1} \right),$$  (31)

where $B$ is the incomplete beta function, defined as

$$B(x; a, b) = \int_0^x t^{a-1} (1-t)^{b-1}\, dt.$$  (32)

The other term has the same limit, namely

$$\lim_{l \to \infty} \frac{1}{2l-1} \sum_{i=1}^{l-1} E[t_{(i)} t_{(2l-i)}] = B\!\left( \tfrac12;\, \tfrac{\alpha-2}{\alpha-1},\, \tfrac{\alpha-2}{\alpha-1} \right),$$  (33)

and again the remaining term vanishes,

$$\lim_{l \to \infty} \frac{1}{2l-1} E[t_{(l)} t_{(l+1)}] = 0.$$  (34)

Therefore, the minimum memory is

$$M_{\min} = \lim_{n \to \infty} E[M_{\theta_{\min}}] = \frac{1}{\sigma(\alpha)^2} \left[ 2\, B\!\left( \tfrac12;\, \tfrac{1}{m(\alpha)},\, \tfrac{1}{m(\alpha)} \right) - m(\alpha)^2 \right] \quad (\alpha > 3),$$  (35)

noting that $\frac{1}{m(\alpha)} = \frac{\alpha-2}{\alpha-1}$.

4.1.3 Experimental results

Figure 3 compares theoretical and experimental results for the upper and lower bounds of memory for rearranged independently sampled power-law series in the case $\alpha > 3$. The solid lines are drawn according to (26) and (35), while the circles are produced by rearranging the independently sampled series with $\theta_{\max}$ and $\theta_{\min}$. The effect of finite system size $n$ is observed when $\alpha$ is near 3, resulting in minor differences between theoretical and experimental results. For $\alpha > 3$, while $M_{\max}$ is a constant, $M_{\min}$ is a decreasing function of $\alpha$ with the limit $M_{\min} \to -0.64$ ($\alpha \to \infty$).
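Equation (35) can be evaluated with elementary numerics. The sketch below (function names are ours) computes the incomplete beta by a midpoint rule to stay free of external dependencies:

```python
def m_pop(alpha):
    """Population mean of the power law, Eq. (17)."""
    return (alpha - 1) / (alpha - 2)

def var_pop(alpha):
    """Population variance, Eq. (18)."""
    return (alpha - 1) / (alpha - 3) - m_pop(alpha) ** 2

def inc_beta(x, a, b, steps=200_000):
    """B(x; a, b) = integral_0^x t^(a-1) (1-t)^(b-1) dt, midpoint rule."""
    h = x / steps
    return h * sum(((k + 0.5) * h) ** (a - 1) * (1.0 - (k + 0.5) * h) ** (b - 1)
                   for k in range(steps))

def M_min_theory(alpha):
    """Minimum memory of rearranged power-law series, Eq. (35), alpha > 3."""
    a = (alpha - 2) / (alpha - 1)   # = 1 / m(alpha)
    return (2 * inc_beta(0.5, a, a) - m_pop(alpha) ** 2) / var_pop(alpha)

for alpha in (3.5, 5.0, 10.0):
    print(alpha, M_min_theory(alpha))  # decreasing towards -0.64 as alpha grows
```

For $\alpha = 3.5$ this gives roughly $-0.16$, consistent with the bound of about $-0.2$ observed in the iterative-rearrangement experiment of Section 3.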
Figure 3: Comparison between theoretical and experimental bounds for memory, in the case of α > 3. The series are of length n = 50,000 and experimental results are obtained by averaging over independent runs.

4.2 Approximation method for the α ≤ 3 case

When $\alpha \le 3$, the population variance diverges, and the population mean also diverges once $\alpha$ goes below 2, making the method of equating sample moments with population moments invalid. Moreover, a direct calculation of the expected value of the maximum or minimum memory would involve the distribution of $M$ itself, which is intractable. Therefore, we adopt an approximation method for the $\alpha \le 3$ case, which substitutes deterministic values for the random samples in the limit of large $n$.

4.2.1 Approximating random samples with equiprobable slices

As illustrated by Figure 4, the points $\hat t_1, \hat t_2, \ldots, \hat t_n$ cut the area under the probability density function $f(t)$ into slices of equal area $\frac{1}{n}$, with $\hat t_1 = t_{\min} = 1$ and the area from $\hat t_n$ extending to infinity also being $\frac{1}{n}$. We then approximate the random samples $\{t_1, t_2, \ldots, t_n\}$ by these deterministic points $\{\hat t_1, \hat t_2, \ldots, \hat t_n\}$. It should be noted that such approximation imposes a cutoff on the maximum value of $t_i$, and the probability of drawing a sample exceeding the cutoff is $\frac{1}{n}$, which diminishes to zero as $n \to \infty$. By setting
$$\int_1^{\hat t_i} f(t)\, dt = \frac{i-1}{n},$$

we have

$$\hat t_i = \left( 1 - \frac{i-1}{n} \right)^{-c} \quad (i = 1, 2, \ldots, n),$$  (36)

where $c = \frac{1}{\alpha-1} \ge \tfrac12$ in this case. And obviously the order statistics satisfy $\hat t_{(i)} = \hat t_i$.

Figure 4: Cutting the area under f(t) (in logarithmic scale) into slices with equal area 1/n, thus equal probability. The area from the last point to infinity is also 1/n.

Recall the expression for memory,

$$M = \frac{1}{\sigma^2} \left( \frac{1}{n-1} \sum_{i=1}^{n-1} t_i t_{i+1} - m^2 \right) = \frac{\frac{1}{n-1} \sum_{i=1}^{n-1} t_i t_{i+1} - m^2}{\frac{1}{n} \sum_{i=1}^{n} t_i^2 - m^2},$$  (37)

which involves three sample quantities, denoted as

$$s = \frac{1}{n-1} S = \frac{1}{n-1} \sum_{i=1}^{n-1} t_i t_{i+1},$$  (38)

$$\overline{t^2} = \frac{1}{n} \sum_{i=1}^{n} t_i^2,$$  (39)

and

$$m^2 = \left( \frac{1}{n} \sum_{i=1}^{n} t_i \right)^2.$$  (40)
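The slice points (36) are straightforward to construct and check; a minimal sketch (function names are ours), verifying that each slice carries probability mass exactly $1/n$ and that the tail beyond the last point is also $1/n$:

```python
def slices(alpha, n):
    """Deterministic points t_hat_i of Eq. (36) for the power law, t_min = 1."""
    c = 1.0 / (alpha - 1.0)
    return [(1.0 - (i - 1) / n) ** (-c) for i in range(1, n + 1)]

def F(t, alpha):
    """CDF of the power law, Eq. (22): F(t) = 1 - t^(1 - alpha)."""
    return 1.0 - t ** (1.0 - alpha)

t_hat = slices(2.5, 1000)
print(t_hat[0])                   # exactly 1.0 = t_min
print(1.0 - F(t_hat[-1], 2.5))    # tail mass beyond the last point: 1/n
```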
When $\alpha \le 3$, $s$ and $\overline{t^2}$ diverge as $n \to \infty$, and $m^2$ no longer converges if $\alpha \le 2$. We can then use the approximated samples $\{\hat t_1, \hat t_2, \ldots, \hat t_n\}$ to analyze the asymptotic behavior of these quantities as $n \to \infty$.

4.2.2 The upper bound of memory

Substituting $t_{(i)}$ with $\hat t_i$, we have

$$\hat s_{\theta_{\max}} = \frac{1}{n-1} \left( \sum_{i=1}^{2l-2} \hat t_i \hat t_{i+2} + \hat t_{2l} \hat t_{2l-1} \right) \quad (n = 2l) \;=\; \frac{1}{n-1}\, n^{2c} \left[ \sum_{k=2}^{n-1} (k^2 - 1)^{-c} + 2^{-c} \right] \quad (\alpha < 3),$$  (41)

$$\hat{\overline{t^2}}_{\theta_{\max}} = \frac{1}{n} \sum_{i=1}^{n} \hat t_i^2 = \frac{1}{n}\, n^{2c} \sum_{k=1}^{n} k^{-2c} \quad (\alpha < 3).$$  (42)

As $n \to \infty$, both are of the same order $n^{2c-1}$, while $m^2$ either converges (when $2 < \alpha \le 3$) or diverges with the lower order $n^{2c-2}$ (when $1 < \alpha \le 2$). Hence, in the limit of large $n$, $M_{\theta_{\max}}$ can be approximated by neglecting $m^2$, i.e.

$$\hat M_{\theta_{\max}} \approx \frac{\hat s_{\theta_{\max}}}{\hat{\overline{t^2}}_{\theta_{\max}}} = \frac{\sum_{k=2}^{n-1} (k^2-1)^{-c} + 2^{-c}}{\sum_{k=1}^{n} k^{-2c}},$$  (43)

where both the numerator and the denominator converge as $n \to \infty$. Thus we have

$$M_{\max} = \lim_{n \to \infty} E[M_{\theta_{\max}}] \approx \lim_{n \to \infty} \hat M_{\theta_{\max}} = \frac{\sum_{k=2}^{\infty} (k^2-1)^{-c} + 2^{-c}}{\sum_{k=1}^{\infty} k^{-2c}} \quad (1 < \alpha \le 3).$$  (44)

4.2.3 The lower bound of memory

Analyzing the memory produced by $\theta_{\min}$ with the approximated samples, one finds that $\hat{\overline{t^2}}_{\theta_{\min}}$ is the term diverging with the largest order of $n$, while the orders of $m^2$ and $\hat s_{\theta_{\min}}$ are smaller if they diverge at all. As the largest-order term appears only in the denominator, we have

$$M_{\min} = \lim_{n \to \infty} E[M_{\theta_{\min}}] \approx \lim_{n \to \infty} \hat M_{\theta_{\min}} = 0.$$  (45)
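The limiting ratio (44) is a pair of sums that converge for $1 < \alpha < 3$, so it can be evaluated directly from truncations (function name ours; the truncation length is an assumption for illustration):

```python
def approx_M_max(alpha, terms=200_000):
    """Slice approximation of the maximum memory, Eq. (44), 1 < alpha < 3."""
    c = 1.0 / (alpha - 1.0)
    num = sum((k * k - 1.0) ** (-c) for k in range(2, terms)) + 2.0 ** (-c)
    den = sum(k ** (-2.0 * c) for k in range(1, terms + 1))
    return num / den

for alpha in (1.5, 2.0, 2.5):
    print(alpha, approx_M_max(alpha))  # increases with alpha, towards +1 at alpha = 3
```

At $\alpha = 2$ the sums have closed forms ($\sum_{k\ge 2} (k^2-1)^{-1} = 3/4$ and $\sum_{k\ge 1} k^{-2} = \pi^2/6$), giving $M_{\max} \approx 0.76$; the truncated sums reproduce this.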
Figure 5: Comparison between experimental results and approximate theoretical results for 1 < α ≤ 3. Each experimental data point is obtained by averaging sampled series of length n = 50,000 over independent runs.

4.2.4 Experimental results

Figure 5 compares the upper and lower bounds of memory obtained by the equiprobable-slice approximation with experimental results, where data points are averaged over independent runs of drawing power-law samples of length $n = 50{,}000$. Although the approximation method cannot precisely recover the behavior of the maximum memory, it still captures the trend: the maximum memory goes up from 0 to $+1$ as $\alpha$ increases from 1 to 3. Meanwhile, the predicted minimum memory remains zero for $\alpha \le 3$, and larger gaps between experimental and theoretical values are noticed when $\alpha$ is near 3.

5 Conclusion and discussion

Combining the results obtained by the probabilistic method for $\alpha > 3$ and the approximation method for $\alpha \le 3$, in Figure 6 we plot theoretical values of $M_{\max}$ and $M_{\min}$ for $1 < \alpha \le 7$, where the slice approximation is used for $1 < \alpha \le 3$ and the probabilistic method for $\alpha > 3$. Clearly, in the sense of expected values and in the limit of large $n$, the memory of rearranged power-law series is confined to the positive region for $\alpha \le 3$, with the upper bound going up from 0 to $+1$ as $\alpha$ increases.
Figure 6: Theoretical values for the maximum and minimum memory of rearranged independent power-law series in the range 1 < α < 7. The slice approximation method is used for 1 < α ≤ 3, while the probabilistic method is used for 3 < α < 7.

When $\alpha > 3$, however, the upper bound remains $+1$, while the lower bound slides down into the negative region as a decreasing function of $\alpha$, with the limit $M_{\min} \to -0.64$ ($\alpha \to \infty$). Therefore, the minimum memory for rearranged power-law series is constrained from below by about $-0.64$.

Admittedly, rearranging independently sampled power-law series cannot cover all possible correlated series with a power-law marginal, and memory (first-order autocorrelation) is not the only measure of temporal correlation. Still, our results may point to a hidden mechanism of the power law: the power-law distribution may naturally constrain the possible interdependence among elements, in particular preventing the correlation from being too negative. In addition, our approach, both probabilistic and approximate, may be generalized to the analysis of a wider spectrum of distributions common in empirical data, such as log-normals and power laws with exponential tails.

References

[1] A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-law distributions in empirical data. SIAM Review, 51(4):661-703, 2009.
[2] K.-I. Goh and A.-L. Barabási. Burstiness and memory in complex systems. EPL (Europhysics Letters), February 2008.

[3] Marc Hallin, Guy Mélard, and Xavier Milhaud. Permutational extreme values of autocorrelation coefficients and a Pitman test against serial dependence. ULB Institutional Repository 23/237, ULB Université Libre de Bruxelles, 1992.

[4] H. A. David and H. N. Nagaraja. Order Statistics. Wiley, New Jersey, 3rd edition, 2003.
More informationAn Asymptotically Optimal Algorithm for the Max k-armed Bandit Problem
An Asymptotically Optimal Algorithm for the Max k-armed Bandit Problem Matthew J. Streeter February 27, 2006 CMU-CS-06-110 Stephen F. Smith School of Computer Science Carnegie Mellon University Pittsburgh,
More informationELEG 833. Nonlinear Signal Processing
Nonlinear Signal Processing ELEG 833 Gonzalo R. Arce Department of Electrical and Computer Engineering University of Delaware arce@ee.udel.edu February 15, 25 2 NON-GAUSSIAN MODELS 2 Non-Gaussian Models
More information1 Potential due to a charged wire/sheet
Lecture XXX Renormalization, Regularization and Electrostatics Let us calculate the potential due to an infinitely large object, e.g. a uniformly charged wire or a uniformly charged sheet. Our main interest
More informationSTATISTICAL METHODS FOR SIGNAL PROCESSING c Alfred Hero
STATISTICAL METHODS FOR SIGNAL PROCESSING c Alfred Hero 1999 32 Statistic used Meaning in plain english Reduction ratio T (X) [X 1,..., X n ] T, entire data sample RR 1 T (X) [X (1),..., X (n) ] T, rank
More informationParametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a
Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a Some slides are due to Christopher Bishop Limitations of K-means Hard assignments of data points to clusters small shift of a
More informationLecture 12 Reactions as Rare Events 3/02/2009 Notes by MIT Student (and MZB)
Lecture 1 Reactions as Rare Events 3/0/009 Notes by MIT Student (and MZB) 1. Introduction In this lecture, we ll tackle the general problem of reaction rates involving particles jumping over barriers in
More informationA Modified Baum Welch Algorithm for Hidden Markov Models with Multiple Observation Spaces
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 9, NO. 4, MAY 2001 411 A Modified Baum Welch Algorithm for Hidden Markov Models with Multiple Observation Spaces Paul M. Baggenstoss, Member, IEEE
More informationLearning MN Parameters with Alternative Objective Functions. Sargur Srihari
Learning MN Parameters with Alternative Objective Functions Sargur srihari@cedar.buffalo.edu 1 Topics Max Likelihood & Contrastive Objectives Contrastive Objective Learning Methods Pseudo-likelihood Gradient
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More information13: Variational inference II
10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational
More informationReal Analysis Prof. S.H. Kulkarni Department of Mathematics Indian Institute of Technology, Madras. Lecture - 13 Conditional Convergence
Real Analysis Prof. S.H. Kulkarni Department of Mathematics Indian Institute of Technology, Madras Lecture - 13 Conditional Convergence Now, there are a few things that are remaining in the discussion
More informationClustering on Unobserved Data using Mixture of Gaussians
Clustering on Unobserved Data using Mixture of Gaussians Lu Ye Minas E. Spetsakis Technical Report CS-2003-08 Oct. 6 2003 Department of Computer Science 4700 eele Street orth York, Ontario M3J 1P3 Canada
More informationLecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales.
Lecture 2 1 Martingales We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. 1.1 Doob s inequality We have the following maximal
More informationLecture 7 Introduction to Statistical Decision Theory
Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7
More informationSpring 2012 Math 541B Exam 1
Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote
More informationMultiple Random Variables
Multiple Random Variables Joint Probability Density Let X and Y be two random variables. Their joint distribution function is F ( XY x, y) P X x Y y. F XY ( ) 1, < x
More informationContrastive Divergence
Contrastive Divergence Training Products of Experts by Minimizing CD Hinton, 2002 Helmut Puhr Institute for Theoretical Computer Science TU Graz June 9, 2010 Contents 1 Theory 2 Argument 3 Contrastive
More informationSampling Distributions and Asymptotics Part II
Sampling Distributions and Asymptotics Part II Insan TUNALI Econ 511 - Econometrics I Koç University 18 October 2018 I. Tunal (Koç University) Week 5 18 October 2018 1 / 29 Lecture Outline v Sampling Distributions
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationThe Sommerfeld Polynomial Method: Harmonic Oscillator Example
Chemistry 460 Fall 2017 Dr. Jean M. Standard October 2, 2017 The Sommerfeld Polynomial Method: Harmonic Oscillator Example Scaling the Harmonic Oscillator Equation Recall the basic definitions of the harmonic
More informationMath 350: An exploration of HMMs through doodles.
Math 350: An exploration of HMMs through doodles. Joshua Little (407673) 19 December 2012 1 Background 1.1 Hidden Markov models. Markov chains (MCs) work well for modelling discrete-time processes, or
More informationINTRODUCTION TO PATTERN RECOGNITION
INTRODUCTION TO PATTERN RECOGNITION INSTRUCTOR: WEI DING 1 Pattern Recognition Automatic discovery of regularities in data through the use of computer algorithms With the use of these regularities to take
More informationAlgorithms. Quicksort. Slide credit: David Luebke (Virginia)
1 Algorithms Quicksort Slide credit: David Luebke (Virginia) Sorting revisited We have seen algorithms for sorting: INSERTION-SORT, MERGESORT More generally: given a sequence of items Each item has a characteristic
More informationProbability and Distributions
Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated
More informationSupplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements
Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Jeffrey N. Rouder Francis Tuerlinckx Paul L. Speckman Jun Lu & Pablo Gomez May 4 008 1 The Weibull regression model
More informationECE534, Spring 2018: Solutions for Problem Set #3
ECE534, Spring 08: Solutions for Problem Set #3 Jointly Gaussian Random Variables and MMSE Estimation Suppose that X, Y are jointly Gaussian random variables with µ X = µ Y = 0 and σ X = σ Y = Let their
More informationTHE N-VALUE GAME OVER Z AND R
THE N-VALUE GAME OVER Z AND R YIDA GAO, MATT REDMOND, ZACH STEWARD Abstract. The n-value game is an easily described mathematical diversion with deep underpinnings in dynamical systems analysis. We examine
More informationSummary of free theory: one particle state: vacuum state is annihilated by all a s: then, one particle state has normalization:
The LSZ reduction formula based on S-5 In order to describe scattering experiments we need to construct appropriate initial and final states and calculate scattering amplitude. Summary of free theory:
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationChapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics
Chapter 6 Order Statistics and Quantiles 61 Extreme Order Statistics Suppose we have a finite sample X 1,, X n Conditional on this sample, we define the values X 1),, X n) to be a permutation of X 1,,
More informationCS264: Beyond Worst-Case Analysis Lecture #20: From Unknown Input Distributions to Instance Optimality
CS264: Beyond Worst-Case Analysis Lecture #20: From Unknown Input Distributions to Instance Optimality Tim Roughgarden December 3, 2014 1 Preamble This lecture closes the loop on the course s circle of
More informationStatistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation
Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider
More informationBasic Sampling Methods
Basic Sampling Methods Sargur Srihari srihari@cedar.buffalo.edu 1 1. Motivation Topics Intractability in ML How sampling can help 2. Ancestral Sampling Using BNs 3. Transforming a Uniform Distribution
More informationChapter 9. Non-Parametric Density Function Estimation
9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least
More informationProbability Methods in Civil Engineering Prof. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur
Probability Methods in Civil Engineering Prof. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Lecture No. # 12 Probability Distribution of Continuous RVs (Contd.)
More informationReview. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with
More informationA graph contains a set of nodes (vertices) connected by links (edges or arcs)
BOLTZMANN MACHINES Generative Models Graphical Models A graph contains a set of nodes (vertices) connected by links (edges or arcs) In a probabilistic graphical model, each node represents a random variable,
More information(Refer Slide Time: )
Digital Signal Processing Prof. S. C. Dutta Roy Department of Electrical Engineering Indian Institute of Technology, Delhi FIR Lattice Synthesis Lecture - 32 This is the 32nd lecture and our topic for
More informationCHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2017. Tom M. Mitchell. All rights reserved. *DRAFT OF September 16, 2017* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is
More informationGuess & Check Codes for Deletions, Insertions, and Synchronization
Guess & Check Codes for Deletions, Insertions, and Synchronization Serge Kas Hanna, Salim El Rouayheb ECE Department, Rutgers University sergekhanna@rutgersedu, salimelrouayheb@rutgersedu arxiv:759569v3
More informationUnsupervised Learning with Permuted Data
Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University
More informationReview Notes for IB Standard Level Math
Review Notes for IB Standard Level Math 1 Contents 1 Algebra 8 1.1 Rules of Basic Operations............................... 8 1.2 Rules of Roots..................................... 8 1.3 Rules of Exponents...................................
More informationV. Properties of estimators {Parts C, D & E in this file}
A. Definitions & Desiderata. model. estimator V. Properties of estimators {Parts C, D & E in this file}. sampling errors and sampling distribution 4. unbiasedness 5. low sampling variance 6. low mean squared
More informationLecture 1 : Data Compression and Entropy
CPS290: Algorithmic Foundations of Data Science January 8, 207 Lecture : Data Compression and Entropy Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will study a simple model for
More informationSolutions to Dynamical Systems 2010 exam. Each question is worth 25 marks.
Solutions to Dynamical Systems exam Each question is worth marks [Unseen] Consider the following st order differential equation: dy dt Xy yy 4 a Find and classify all the fixed points of Hence draw the
More informationCS 195-5: Machine Learning Problem Set 1
CS 95-5: Machine Learning Problem Set Douglas Lanman dlanman@brown.edu 7 September Regression Problem Show that the prediction errors y f(x; ŵ) are necessarily uncorrelated with any linear function of
More informationAn Introduction to Expectation-Maximization
An Introduction to Expectation-Maximization Dahua Lin Abstract This notes reviews the basics about the Expectation-Maximization EM) algorithm, a popular approach to perform model estimation of the generative
More informationEstimating the parameters of hidden binomial trials by the EM algorithm
Hacettepe Journal of Mathematics and Statistics Volume 43 (5) (2014), 885 890 Estimating the parameters of hidden binomial trials by the EM algorithm Degang Zhu Received 02 : 09 : 2013 : Accepted 02 :
More informationBennett-type Generalization Bounds: Large-deviation Case and Faster Rate of Convergence
Bennett-type Generalization Bounds: Large-deviation Case and Faster Rate of Convergence Chao Zhang The Biodesign Institute Arizona State University Tempe, AZ 8587, USA Abstract In this paper, we present
More information