Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 2012
BOUNDS AND ASYMPTOTICS FOR FISHER INFORMATION IN THE CENTRAL LIMIT THEOREM

Sergey G. Bobkov, University of Minnesota, Minneapolis, USA

Joint work with Gennadiy P. Chistyakov and Friedrich Götze, Bielefeld University, Bielefeld, Germany
1 Fisher's quantity of information

X a random variable with values in R.

Definition. If X has an absolutely continuous density p, its Fisher information is defined by

I(X) = I(p) = ∫ p′(x)²/p(x) dx,

where p′ is a Radon–Nikodym derivative of p. In all other cases, I(X) = +∞.

Equivalently, I(X) = E [p′(X)/p(X)]².

Remarks. 1) P{p(X) > 0} = 1, so the definition makes sense. Integration is over {x : p(x) > 0}.

2) Assume I(X) < +∞. Then p(x) = 0 ⟹ p′(x) = 0.

3) Translation invariance and homogeneity: I(a + bX) = b⁻² I(X) (a ∈ R, b ≠ 0).
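The defining integral is easy to sanity-check numerically. The sketch below (not from the slides; it assumes NumPy, and the grid width and the choice σ = 2 are arbitrary) integrates p′²/p on a grid for X ∼ N(0, σ²), whose Fisher information is known in closed form to be 1/σ²:

```python
import numpy as np

# Numerical check of I(X) = ∫ p'(x)^2 / p(x) dx for X ~ N(0, sigma^2),
# where the closed form is I(X) = 1/sigma^2.
sigma = 2.0
x = np.linspace(-40.0, 40.0, 400001)   # wide grid: the tails are negligible
dx = x[1] - x[0]
p = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
dp = -x / sigma**2 * p                  # p'(x) for the normal density

I = np.sum(dp**2 / p) * dx              # rectangle-rule quadrature
print(I)                                # close to 1/sigma^2 = 0.25
```

The homogeneity in Remark 3 can be checked the same way: rescaling to σ = 1 multiplies the result by σ² = 4.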
2 When the Fisher information appears naturally

Statistics: estimation of the shift parameter in p(x − θ).

Probability: shifts of product measures; distinguishing a sequence of i.i.d. random variables from a translate of itself.

µ a probability measure on R, µ_θ(A) = µ(A + θ), θ ∈ R, A ⊂ R.

Theorem (Feldman, Shepp 1965). (µ_θ^∞ ≪ µ^∞ for all θ ∈ ℓ²) ⟺ I(µ) < +∞ and dµ(x)/dx > 0 a.e.

Information theory: de Bruijn's identity. Differential entropy: h(X) = −∫ p(x) log p(x) dx.

Theorem. If a random variable X has finite variance, then for all τ > 0,

d/dτ h(X + √τ Z) = ½ I(X + √τ Z),

where Z ∼ N(0, 1) is independent of X.
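De Bruijn's identity can also be tested numerically. The rough grid computation below (not from the slides; it assumes NumPy, and X is taken uniform on [−√3, √3] so that Var(X) = 1) approximates the entropy derivative at τ = 1 by a central difference and compares it with half the Fisher information of the smoothed density:

```python
import numpy as np

# Numerical check of de Bruijn: d/dτ h(X + √τ Z) = ½ I(X + √τ Z).
L, N = 10.0, 4001
x = np.linspace(-L, L, N); dx = x[1] - x[0]
p = np.where(np.abs(x) < np.sqrt(3), 1 / (2 * np.sqrt(3)), 0.0)  # Var(X) = 1

def smooth(tau):
    """Density of X + sqrt(tau) * Z, via grid convolution with N(0, tau)."""
    k = np.exp(-x**2 / (2 * tau)) / np.sqrt(2 * np.pi * tau)
    return np.convolve(p, k, mode="same") * dx   # symmetric odd grid aligns centers

def entropy(q):
    qm = np.where(q > 1e-300, q, 1.0)            # log(1) = 0 kills empty bins
    return -np.sum(q * np.log(qm)) * dx

tau, eps = 1.0, 1e-2
dh = (entropy(smooth(tau + eps)) - entropy(smooth(tau - eps))) / (2 * eps)

q = smooth(tau)
dq = np.gradient(q, dx)
mask = q > 1e-12
I = np.sum(dq[mask]**2 / q[mask]) * dx
print(dh, I / 2)   # the two numbers should nearly coincide
```

Note that I(X) = +∞ here (uniform density), yet the smoothed variable X + √τ Z has finite Fisher information for every τ > 0, which is exactly why the identity makes sense.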
3 Distances to normality

X a r.v. with density p(x), and a = EX, σ² = Var(X) < +∞.
Z ∼ N(a, σ²) with density q(x) = (2πσ²)^{−1/2} e^{−(x−a)²/2σ²}.

Relative entropy of X with respect to Z (informational divergence, Kullback–Leibler distance):

D(X) = D(X‖Z) = h(Z) − h(X) = ∫ p log(p/q) dx.

Relative Fisher information of X with respect to Z:

I(X‖Z) = I(X) − I(Z) = ∫ (p′/p − q′/q)² p dx.

Properties: D(X) ≥ 0, I(X‖Z) ≥ 0, and D(a + bX) = D(X). The same holds for the standardized Fisher information σ² I(X‖Z) = σ² I(X) − 1.

D(X) = 0 ⟺ I(X‖Z) = 0 ⟺ X is normal.
4 Relations between distances

Csiszár–Kullback–Pinsker inequality for total variation (1967): for all random variables X and Z,

½ ‖P_X − P_Z‖²_TV ≤ D(X‖Z).

Stam's inequality (1959) = logarithmic Sobolev inequality: if Z ∼ N(0, 1),

D(X‖Z) ≤ ½ I(X‖Z).

Sharpening (still equivalent): if Z ∼ N(a, σ²), EX = EZ = a, Var(X) = Var(Z) = σ², then

D(X) ≤ ½ log [1 + σ² I(X‖Z)] = ½ log [σ² I(X)].

Let EX = 0, Var(X) = 1, X ∼ p, Z ∼ N(0, 1):

‖P_X − P_Z‖_TV = ∫ |p(x) − ϕ(x)| dx ≤ √(2 I(X‖Z)).

Shimizu (1975): sup_x |p(x) − ϕ(x)| ≤ C √I(X‖Z).

Sharpening: one can show that

‖p − ϕ‖_TV = ∫ |p′(x) − ϕ′(x)| dx ≤ C √I(X‖Z).
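These inequalities are easy to test on a concrete non-Gaussian example. The sketch below (not from the slides; it assumes NumPy) takes X Laplace with Var(X) = 1, so that I(X) = 2 and hence I(X‖Z) = 1 in closed form, and checks Pinsker, the logarithmic Sobolev sharpening, and the total variation bound by quadrature:

```python
import numpy as np

# X ~ Laplace with density p(x) = e^{-|x|/b}/(2b), b = 1/sqrt(2), so Var(X) = 2b^2 = 1
# and I(X) = 1/b^2 = 2 (the score is constant, ∓1/b).
b = 1 / np.sqrt(2)
x = np.linspace(-25.0, 25.0, 500001); dx = x[1] - x[0]
p = np.exp(-np.abs(x) / b) / (2 * b)
phi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

D = np.sum(p * np.log(p / phi)) * dx    # relative entropy D(X || Z)
Ist = 1 / b**2 - 1.0                    # relative Fisher information I(X || Z) = 1
TV = np.sum(np.abs(p - phi)) * dx       # total variation distance, L1 form

print(D, TV)
print(0.5 * TV**2 <= D)                 # Pinsker
print(D <= 0.5 * np.log(1 + Ist))       # log-Sobolev sharpening
print(TV <= np.sqrt(2 * Ist))           # total variation bound
```

Here D = h(Z) − h(X) = ½ log(2πe) − (1 + log √2) ≈ 0.072, comfortably below ½ log 2 ≈ 0.347 from the sharpened bound.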
5 Central limit theorem

(X_n)_{n≥1} independent identically distributed random variables, EX₁ = 0, Var(X₁) = 1.

CLT: weakly in distribution,

Z_n = (X₁ + ⋯ + X_n)/√n ⟹ Z ∼ N(0, 1) (n → ∞).

Theorem (Barron-Johnson 2004). I(Z_n‖Z) → 0 as n → ∞, if and only if I(Z_{n₀}‖Z) < +∞ for some n₀. Equivalently: I(Z_{n₀}) < +∞ for some n₀.

Sufficient: I(X₁) < +∞. Necessary: for all n ≥ n₁, Z_n have bounded densities p_n and sup_x |p_n(x) − ϕ(x)| → 0 (n → ∞).

Problems. 1. How to determine this property in terms of X₁? (range of applicability) 2. What is the rate for I(Z_n‖Z), and under what conditions?
6 Uniform local limit theorem

Theorem (Gnedenko 1950's). The following properties are equivalent:

a) For all sufficiently large n, Z_n have (continuous) bounded densities p_n satisfying sup_x |p_n(x) − ϕ(x)| → 0 (n → ∞);

b) For some n, Z_n has a (continuous) bounded density p_n;

c) The characteristic function f₁(t) = E e^{itX₁} of X₁ satisfies the smoothness condition ∫ |f₁(t)|^ν dt < +∞, for some ν > 0.
7 CLT for Fisher information distance

(X_n)_{n≥1} independent identically distributed random variables, EX₁ = 0, Var(X₁) = 1.

Theorem. The following assertions are equivalent:

a) For some n, Z_n has finite Fisher information;

b) For some n, Z_n has density of bounded total variation;

c) For some n, Z_n has a continuously differentiable density p_n such that ∫ |p_n′(x)| dx < +∞;

d) For some ε > 0, the characteristic function f₁(t) = E e^{itX₁} satisfies |f₁(t)| = O(t^{−ε}) as t → +∞;

e) For some ν > 0, ∫ |f₁(t)|^ν |t| dt < +∞.

In this and only in this case, I(Z_n‖Z) → 0 (n → ∞).
8 1/n bounds

Barron, Johnson (2004); Artstein, Ball, Barthe, Naor (2004).

Theorem. Assume that EX₁ = 0, Var(X₁) = 1, and that X₁ satisfies a Poincaré-type inequality

λ₁ Var(u(X₁)) ≤ E u′(X₁)² (0 < λ₁ ≤ 1).

Then

I(Z_n‖Z) ≤ I(X₁‖Z) / (1 + (λ₁/2)(n − 1)) (n ≥ 1).

Thus, I(Z_n‖Z) = O(1/n).

Extension to weighted sums Z_n = a₁X₁ + ⋯ + a_nX_n (a₁² + ⋯ + a_n² = 1).

A-B-B-N (2004):

I(Z_n‖Z) ≤ [L₄ / (λ₁/2 + (1 − λ₁/2) L₄)] I(X₁‖Z),

where L₄ = a₁⁴ + ⋯ + a_n⁴.
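The O(1/n) rate is fully explicit for gamma summands, which are log-concave and hence satisfy a Poincaré-type inequality: a sum of n independent Gamma(α) variables is Gamma(nα), and I(Gamma(a)) = 1/(a − 2) for a > 2, so after standardization I(Z_n‖Z) = nα/(nα − 2) − 1 = 2/(nα − 2). The sketch below (not from the slides; it assumes NumPy, and α = 3 and the quadrature grid are arbitrary choices) cross-checks this closed form by numerically integrating the squared score:

```python
import numpy as np
from math import lgamma

def fisher_rel_gamma(n, alpha):
    """I(Z_n || Z) for Z_n = (S_n - n*alpha)/sqrt(n*alpha), S_n ~ Gamma(n*alpha, 1).
    Uses I(bX) = I(X)/b^2 and quadrature of score^2 * density for Gamma(a, 1)."""
    a = n * alpha
    x = np.linspace(1e-3, a + 30 * np.sqrt(a), 400001)
    dx = x[1] - x[0]
    p = np.exp((a - 1) * np.log(x) - x - lgamma(a))  # Gamma(a, 1) density
    score = (a - 1) / x - 1.0                        # p'(x)/p(x)
    I_gamma = np.sum(score**2 * p) * dx              # Fisher information of Gamma(a, 1)
    return a * I_gamma - 1.0                         # standardized, minus I(Z) = 1

alpha = 3.0
vals = {n: fisher_rel_gamma(n, alpha) for n in (2, 4, 8)}
for n in (2, 4, 8):
    print(n, vals[n], 2 / (n * alpha - 2))           # numeric vs closed form
```

Doubling n roughly halves the distance, matching the O(1/n) conclusion of the theorem.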
9 Rate of convergence under moment conditions

(X_n)_{n≥1} independent identically distributed random variables. Let EX₁ = 0, Var(X₁) = 1, and I(Z_{n₀}) < +∞ for some n₀.

Theorem 2. If E |X₁|^s < +∞ for some s > 2, then

I(Z_n‖Z) = Σ_{j=1}^{[(s−2)/2]} c_j n⁻ʲ + o( n^{−(s−2)/2} (log n)^{−(s−3)/2} ),

where each c_j is a certain polynomial in the cumulants γ₃, …, γ_{2j+1} of X₁, or in the moments EX₁³, …, EX₁^{2j+1}.

s = 4: EX₁⁴ < +∞ ⟹ I(Z_n‖Z) = c₁/n + o( n⁻¹ (log n)^{−1/2} ), with c₁ = γ₃²/2! = (EX₁³)²/2.

s = 6: EX₁⁶ < +∞, EX₁³ = 0 ⟹ I(Z_n‖Z) = c₂/n² + o( n⁻² (log n)^{−3/2} ), with c₂ = γ₄²/3! = (EX₁⁴ − 3)²/6.
10 Case 2 < s < 4. Lower bounds

In case E |X₁|^s < +∞ with 2 < s < 4, Theorem 2 only yields

I(Z_n‖Z) = o( n^{−(s−2)/2} (log n)^{−(s−3)/2} ).

This is worse than the 1/n rate.

Let η > s − 2, 2 < s < 4.

Theorem 3. There exists a sequence (X_n)_{n≥1} of i.i.d. random variables with symmetric distributions, with EX₁² = 1, E |X₁|^s < +∞, I(X₁) < +∞, and such that, with some constant c = c(η, s),

I(Z_n‖Z) ≥ c n^{−(s−2)/2} (log n)^{−η}, n ≥ n₁(X₁).

Remark. The distribution of X₁ may be a mixture of mean-zero normal laws.
11 When is the Fisher information finite?

Question: What should one assume about X with density p to ensure that

I(X) = ∫ p′(x)²/p(x) dx < +∞ ?

And if so, how can I(X) be bounded from above?

Stam's inequality: if X₁ and X₂ are independent, then

1/I(X₁ + X₂) ≥ 1/I(X₁) + 1/I(X₂).

Monotonicity: I(X₁ + X₂) ≤ I(X₁).

Example: X_j uniform on intervals of length a_j. Then

I(X₁) = +∞ (uniform distribution),
I(X₁ + X₂) = +∞ (triangular distribution),
I(X₁ + X₂ + X₃) < +∞ (a density like the beta with α = β = 2).
12 Necessary conditions

From the definition,

I(X) = E [p′(X)/p(X)]² ≥ ( E |p′(X)/p(X)| )² = [ ∫ |p′(x)| dx ]².

Hence, p is a function of bounded variation with

‖p‖_TV ≤ √I(X).

In general, the characteristic function f(t) = E e^{itX} satisfies

|f(t)| ≤ ‖p‖_TV / |t| (t ∈ R).

Conclusion: |f(t)| ≤ √I(X) / |t| (t ∈ R).
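The conclusion can be checked against a law where both sides are explicit. For the Laplace distribution with scale b, the characteristic function is f(t) = 1/(1 + b²t²) and the Fisher information is I = 1/b², so the bound reads 1/(1 + b²t²) ≤ 1/(b|t|), which holds since bt ≤ 1 + b²t². A numerical sketch (not from the slides; it assumes NumPy, and b is an arbitrary choice):

```python
import numpy as np

# Check |f(t)| <= sqrt(I(X)) / |t| for X ~ Laplace with scale b:
# f(t) = 1/(1 + b^2 t^2) and I(X) = 1/b^2 (constant score ∓1/b).
b = 0.7
t = np.linspace(0.01, 100.0, 100000)
f = 1.0 / (1.0 + (b * t)**2)
bound = np.sqrt(1.0 / b**2) / t

print(np.max(f - bound))   # never positive
```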
13 Convolution of densities of bounded variation

Let S = X₁ + X₂ + X₃ be the sum of three independent random variables with densities p₁, p₂, p₃ of bounded total variation.

Proposition. One has

2 I(S) ≤ ‖p₁‖_TV ‖p₂‖_TV + ‖p₁‖_TV ‖p₃‖_TV + ‖p₂‖_TV ‖p₃‖_TV.

In particular, if p₁ = p₂ = p₃ = p,

I(X₁ + X₂ + X₃) ≤ (3/2) ‖p‖²_TV.

Definition. ‖p‖_TV = sup Σ_{k=1}^n |p(x_k) − p(x_{k−1})|, where the sup is over all x₀ < x₁ < … < x_n, and where we may assume that p(x) lies between p(x−) and p(x+) for all x.

Particular case: if X_j is uniform on an interval of length a_j, so that ‖p_j‖_TV = 2/a_j, then

2 I(X₁ + X₂ + X₃) ≤ 4/(a₁a₂) + 4/(a₁a₃) + 4/(a₂a₃).

However, I(X₁ + X₂) = +∞.
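The uniform case is easy to verify numerically: for three Uniform[0,1] summands, S has the piecewise quadratic Irwin–Hall density, so I(S) can be computed by quadrature and compared with the bound of the Proposition, 2 I(S) ≤ 3 · (2 · 2) = 12, i.e. I(S) ≤ 6. A sketch (not from the slides; it assumes NumPy):

```python
import numpy as np

# S = X1 + X2 + X3, Xi ~ Uniform[0,1]: the Irwin–Hall (n = 3) density on [0, 3]
# is piecewise quadratic, so I(S) = ∫ p'(x)^2 / p(x) dx is finite, unlike for a
# single uniform or for the triangular density of two summands.
x = np.linspace(1e-6, 3 - 1e-6, 600001); dx = x[1] - x[0]
p = np.where(x < 1, x**2 / 2,
    np.where(x < 2, (-2 * x**2 + 6 * x - 3) / 2, (3 - x)**2 / 2))
dp = np.where(x < 1, x, np.where(x < 2, 3 - 2 * x, x - 3))

I = np.sum(dp**2 / p) * dx
bound = 0.5 * (2 * 2 + 2 * 2 + 2 * 2)   # Proposition with ||p_j||_TV = 2: I(S) <= 6
print(I, bound)                          # I(S) ≈ 4.56 < 6
```

Near the endpoints the integrand tends to the finite limit 2 (e.g. p = x²/2, p′ = x, so p′²/p = 2 on [0, 1]), which is why the integral converges, whereas for two uniforms the analogous integrand blows up at the corners of the triangle.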
14 Proof of the Proposition

Let P denote the collection of all densities of bounded variation, and let U denote the collection of all uniform densities q(x) = 1/(b − a) for a < x < b. Note that ‖q‖_TV = 2/(b − a).

The Proposition follows from the case of uniform densities and the following:

Lemma. Any density p ∈ P can be represented as a convex mixture of uniform densities,

p(x) = ∫_U q(x) dπ(q) a.e.,

with the property that

‖p‖_TV = ∫_U ‖q‖_TV dπ(q).

Remark. The mixing probability measure π on U seems to be unique, but no explicit construction is available. When p is piecewise constant, the lemma can be proved by induction on the number of supporting intervals.
15 Proof of the Theorem

Let S_n = X₁ + ⋯ + X_n with i.i.d. summands and characteristic function f_n(t) = E e^{itS_n} = f₁(t)ⁿ.

If I_n = I(S_n) < +∞, then, as noted,

|f₁(t)|ⁿ = |f_n(t)| ≤ √I_n / |t| ⟹ |f₁(t)| = O(t^{−1/n}).

Now, assume that, for some (fixed) n,

∫ |f₁(t)|ⁿ |t| dt < +∞.

Then S_n has density

p_n(x) = (1/2π) ∫ e^{−itx} f₁(t)ⁿ dt,

which has a continuous derivative satisfying (after two integrations by parts)

(1 + x²) p_n′(x) = (i/2π) ∫ e^{−itx} ( 2 f_n′(t) + t f_n″(t) − t f_n(t) ) dt.

Hence |p_n′(x)| ≤ C/(1 + x²), and ‖p_n‖_TV = ∫ |p_n′(x)| dx < +∞. By the Proposition, I_{3n} < +∞.
16 Towards Theorem 2

Let (X_n)_{n≥1} be i.i.d., EX₁ = 0, Var(X₁) = 1, Z_n = (X₁ + ⋯ + X_n)/√n, I(Z_n) < +∞ (n ≥ n₀), with densities p_n, so that

I(Z_n‖Z) = ∫ (p_n′(x) + x p_n(x))²/p_n(x) dx = I₀ + I₁,

I₀ = ∫_{−T_n}^{T_n} (p_n′(x) + x p_n(x))²/p_n(x) dx, I₁ = ∫_{|x| ≥ T_n} … .

Good choice: T_n² = (s − 2) log n + s log log n + ρ_n (s > 2), where ρ_n → +∞ sufficiently slowly to guarantee that

sup_{|x| ≤ T_n} |p_n(x)/ϕ(x) − 1| → 0.

Case s = 4: T_n² = 2 log n + 4 log log n + ρ_n.
17 Edgeworth-type expansion for densities

Let E |X₁|^s < +∞ (s ≥ 3 integer). For |x| ≤ T_n, one may use a suitable approximation of p_n.

Not enough: (1 + |x|^s)(p_n(x) − ϕ(x)) = O(1/√n).

Edgeworth approximation of p_n:

ϕ_s(x) = ϕ(x) + Σ_{k=1}^{s−2} q_k(x) n^{−k/2}, with

q_k(x) = ϕ(x) Σ H_{k+2j}(x) (1/(r₁! ⋯ r_k!)) (γ₃/3!)^{r₁} ⋯ (γ_{k+2}/(k+2)!)^{r_k}.

Here the sum runs over all nonnegative integer solutions of r₁ + 2r₂ + ⋯ + k r_k = k, with j = r₁ + ⋯ + r_k, and

γ_r = i⁻ʳ (dʳ/dtʳ) log E e^{itX₁} |_{t=0} (3 ≤ r ≤ s).

Lemma 1. Let I(Z_{n₀}) < +∞ for some n₀. Fix l = 0, 1, … Then, for all sufficiently large n and all x,

|p_n^{(l)}(x) − ϕ_s^{(l)}(x)| ≤ ψ_{l,n}(x) ε_n / ( (1 + |x|^s) n^{(s−2)/2} ),

where ε_n → 0 as n → ∞, while sup_x ψ_{l,n}(x) and ∫ ψ_{l,n}(x)² dx remain bounded.
18 Moderate deviations

Second step:

I₁ = ∫_{|x| ≥ T_n} (p_n′(x) + x p_n(x))²/p_n(x) dx = o( n^{−(s−2)/2} (log n)^{−(s−3)/2} ).

We have I₁ ≤ 2 I_{1,1} + 2 I_{1,2}, where

I_{1,1} = ∫_{|x| ≥ T_n} p_n′(x)²/p_n(x) dx, I_{1,2} = ∫_{|x| ≥ T_n} x² p_n(x) dx (easy).

Integration by parts:

I⁺_{1,1} = ∫_{T_n}^{+∞} p_n′(x)²/p_n(x) dx = −p_n′(T_n) log p_n(T_n) − ∫_{T_n}^{+∞} p_n″(x) log p_n(x) dx.

Lemma 2. Assume p is representable as a convolution of three densities, each with Fisher information at most I. Then, for all x,

|p′(x)| ≤ I^{3/4} √p(x), |p″(x)| ≤ I^{5/4} √p(x).
More informationStrong approximation for additive functionals of geometrically ergodic Markov chains
Strong approximation for additive functionals of geometrically ergodic Markov chains Florence Merlevède Joint work with E. Rio Université Paris-Est-Marne-La-Vallée (UPEM) Cincinnati Symposium on Probability
More informationGeometry of log-concave Ensembles of random matrices
Geometry of log-concave Ensembles of random matrices Nicole Tomczak-Jaegermann Joint work with Radosław Adamczak, Rafał Latała, Alexander Litvak, Alain Pajor Cortona, June 2011 Nicole Tomczak-Jaegermann
More informationReducing subspaces. Rowan Killip 1 and Christian Remling 2 January 16, (to appear in J. Funct. Anal.)
Reducing subspaces Rowan Killip 1 and Christian Remling 2 January 16, 2001 (to appear in J. Funct. Anal.) 1. University of Pennsylvania, 209 South 33rd Street, Philadelphia PA 19104-6395, USA. On leave
More informationLecture 35: December The fundamental statistical distances
36-705: Intermediate Statistics Fall 207 Lecturer: Siva Balakrishnan Lecture 35: December 4 Today we will discuss distances and metrics between distributions that are useful in statistics. I will be lose
More informationBIHARMONIC WAVE MAPS INTO SPHERES
BIHARMONIC WAVE MAPS INTO SPHERES SEBASTIAN HERR, TOBIAS LAMM, AND ROLAND SCHNAUBELT Abstract. A global weak solution of the biharmonic wave map equation in the energy space for spherical targets is constructed.
More informationA Hierarchy of Information Quantities for Finite Block Length Analysis of Quantum Tasks
A Hierarchy of Information Quantities for Finite Block Length Analysis of Quantum Tasks Marco Tomamichel, Masahito Hayashi arxiv: 1208.1478 Also discussing results of: Second Order Asymptotics for Quantum
More informationStatistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation
Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider
More informationLecture 6: Gaussian Channels. Copyright G. Caire (Sample Lectures) 157
Lecture 6: Gaussian Channels Copyright G. Caire (Sample Lectures) 157 Differential entropy (1) Definition 18. The (joint) differential entropy of a continuous random vector X n p X n(x) over R is: Z h(x
More informationConvergence rates in weighted L 1 spaces of kernel density estimators for linear processes
Alea 4, 117 129 (2008) Convergence rates in weighted L 1 spaces of kernel density estimators for linear processes Anton Schick and Wolfgang Wefelmeyer Anton Schick, Department of Mathematical Sciences,
More informationOXPORD UNIVERSITY PRESS
Concentration Inequalities A Nonasymptotic Theory of Independence STEPHANE BOUCHERON GABOR LUGOSI PASCAL MASS ART OXPORD UNIVERSITY PRESS CONTENTS 1 Introduction 1 1.1 Sums of Independent Random Variables
More informationMachine learning - HT Maximum Likelihood
Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce
More informationSection 8.2. Asymptotic normality
30 Section 8.2. Asymptotic normality We assume that X n =(X 1,...,X n ), where the X i s are i.i.d. with common density p(x; θ 0 ) P= {p(x; θ) :θ Θ}. We assume that θ 0 is identified in the sense that
More information