Theoretical Statistics. Lecture 19.
|
|
- Luke Casey
- 6 years ago
- Views:
Transcription
1 Theoretical Statistics. Lecture 19. Peter Bartlett 1. Functional delta method. [vdv20] 2. Differentiability in normed spaces: Hadamard derivatives. [vdv20] 3. Quantile estimates. [vdv21] 1
2 Recall: Delta method Theorem: If φ : R k R m is differentiable at θ, and n(t n θ) T, then n(φ(tn ) φ(θ)) φ θ(t) n(φ(tn ) φ(θ)) φ θ( n(t n θ)) P 0. Here, φ θ is the derivative (linear map) satisfying φ(θ+h) φ(θ) = φ θ(h)+o( h ) for h 0. 2
3 Functional delta method What about more complex cases? For instance, what if we are interested in a property of the probability distributionφ(p), and we use an estimator φ(p n )? As a first example, we ll get some intuition by considering a Taylor series expansion about the value φ(p). But what does it mean for the Taylor series expansion ofφto exist? To make this rigorous, we need to consider φ as a map between normed linear spaces, and we need it to be appropriately differentiable. The right notion is Hadamard differentiability. 3
4 Functional delta method Given X 1,...,X n from a distributionp, and a parameter of interest φ(p), suppose that we estimate φ(p) byφ(p n ), where P n is the empirical distribution. What is the asymptotic behavior of φ(p n )? Define the derivative φ (k) P (H) of the map t φ(p +th) at t = 0, where H is a perturbation direction. Then if the derivatives exist we have a Taylor series expansion: φ(p+th) φ(p) = tφ P(H)+ 1 2 t2 φ (2) P (H)+ + 1 m! tm φ (m) P (H)+o(tm ). 4
5 Functional delta method Substitutingt = 1/ n and H = G n (= n(p n P)X), gives the von Mises expansion, th = 1 n G n = P n P, φ(p n ) φ(p) = 1 n φ P(G n )+ 1 2n φ(2) P (G n)+ + 1 m!n m/2φ(m) P (G n)+o(n m/2 ). (This is not rigorous: it requires a stronger notion of differentiability, because the perturbation direction is now random: H = G n.) 5
6 Functional delta method If φ P is a linear map, we have φ(p n ) φ(p) = 1 n φ P(G n )+o(n 1/2 ) = 1 n n φ P(δ Xi P)+o(n 1/2 ), i=1 where δ Xi is the discrete distribution concentrated onx i. So ifφ P (δ X P) is mean zero and finite variance, we have that n(φ(pn ) φ(p)) is asymptotically normal. 6
7 Aside: influence functions The function x φ P (δ x P) is the influence function ofφ: φ P(δ x P) = d dt φ(p +t(δ x P)) t=0 = d dt φ((1 t)p +tδ x). t=0 This measures the impact of changing P by mixing in a tiny amount ofδ Xi. It is important in robust statistics. 7
8 Example: Quantiles Suppose we wish to estimate the pth quantile of P, where p (0,1). We ll write it as φ(f) for the cumulative distribution functionf of P. IfF is continuous at the appropriate point, we can define φ(f) = F 1 (p). We need to calculate φ F(s x F) = d dt φ((1 t)f +ts x), t=0 where s x is the cdf of δ x, i.e., the step function a 1[a x]. 8
9 Example: Quantiles WriteF t = (1 t)f +ts x, then differentiate both sides of the equation p = F t (φ(f t )) at t = 0 (ignoring the non-differentiability at φ(f t ) = x): 0 = d F t (φ(f t )) d t t=0 = d (1 t)f(φ(f t ))+ts x (φ(f t )) d t t=0 = F(φ(F t ))+(1 t)f(φ(f t ))φ F t (s x F)+s x (φ(f t )) t=0 = F(φ(F))+f(φ(F))φ F(s x F)+s x (φ(f)), where f is the density of P. 9
10 Example: Quantiles Rearranging, we have φ F(s x F) = F(φ(F)) s x(φ(f)) f(φ(f)) = p s x(f 1 (p)) f(f 1 (p)) p 1 f(f = 1 (p)) ifx F 1 (p), ifx > F 1 (p). p f(f 1 (p)) 10
11 Example: Quantiles So Eφ F(s X F) = p(p 1)+(1 p)p f(f 1 (p)) varφ F(s X F) = p(1 p)2 +(1 p)p 2 f(f 1 (p)) 2 = p(1 p) f(f 1 (p)) 2. = 0 And we have n(φ(fn ) φ(f)) N ( 0, ) p(1 p) (f(f 1 (p))) 2. 11
12 Differentiability of functions in normed spaces How do we make this rigorous? Functional delta method. Theorem: Supposeφ : D E, whered ande are normed linear spaces. Suppose the statistict n : Ω n D satisfies n(t n θ) T for a random element T ind 0 D. If φ is Hadamard differentiable at θ tangentially tod 0 then n(φ(tn ) φ(θ)) φ θ(t). If we can extend φ : D 0 E to a continuous mapφ : D E, then n(φ(tn ) φ(θ)) = φ θ( n(t n θ))+o P (1). 12
13 Differentiability of functions in normed spaces For the von Mises expansion, we considered φ P(H) = d dt φ(p +th) t=0 for some perturbation direction H. This is the Gateaux derivative. Definition: φ : D E is Gateaux differentiable at θ D if h D, φ θ(h) E, s.t. as t 0, φ(θ+th) φ(θ) φ t θ(h) 0. φ θ might not be (i) linear, or (ii) continuous (even if it s linear, it might not be continuous ifd,e are infinite dimensional). 13
14 Differentiability of functions in normed spaces Definition: φ : D E is Hadamard differentiable at θ D if φ θ : D E (linear, continuous), h D, ift 0, h t h 0, then φ(θ+th t ) φ(θ) φ t θ(h) 0. Gateaux requires the difference quotients to converge to some φ θ (h) for each direction h; Hadamard requires a singleφ θ that works for every direction h. It is equivalent to the convergence in the definition of Gateaux differentiability being uniform over h in a compact subset of D. 14
15 Differentiability of functions in normed spaces Definition: φ : D E is Fréchet differentiable at θ D if φ θ : D E (linear, continuous), h D, if h 0, then φ(θ +h) φ(θ) φ θ (h) h 0. Hadamard requires the difference quotients to converge to zero for each direction, possibly with different rates for different directions; Fréchet requires the same rate for each direction. They are equivalent for D = R d. 15
16 Differentiability of functions in normed spaces We ll consider Hadamard differentiability, but allow the weaker notion of tangential differentiability: the limiting directions h can be constrained. Definition: φ : D E is Hadamard differentiable at θ D tangentially tod 0 D if φ θ : D 0 E (linear, continuous), h D 0, ift 0, h t h 0, then φ(θ+th t ) φ(θ) t φ θ(h) 0. 16
17 Differentiability of functions in normed spaces Theorem: Supposeφ : D E, whered ande are normed linear spaces. Suppose the statistict n : Ω n D satisfies n(t n θ) T for a random element T ind 0 D. If φ is Hadamard differentiable at θ tangentially tod 0 then n(φ(tn ) φ(θ)) φ θ(t). If we can extend φ : D 0 E to a continuous mapφ : D E, then n(φ(tn ) φ(θ)) = φ θ( n(t n θ))+o P (1). 17
18 Differentiability of functions in normed spaces Proof: Based on continuous mapping theorem. Consider the maps f n (h) = ( ( n φ θ + 1 ) ) h φ(θ). n Hadamard differentiability implies that for any sequence h n h D 0, we have f n (h n ) φ θ (h). Sof n( n(t n θ)) φ θ (T). (Second statement follows from continuous mapping theorem for the function h (φ(h),φ θ (h)).) 18
19 Differentiability of functions in normed spaces The chain rule lets us determine Hadamard derivatives of a composition of maps. Theorem: Suppose φ : D E, ψ : E F, where D, E and F are normed linear spaces. If 1. φ is Hadamard differentiable at θ tangentially tod 0, and 2. ψ is Hadamard differentiable at φ(θ) tangentially toφ θ (D 0), then ψ φ : D F is Hadamard differentiable at θ tangentially to D 0, with derivative ψ φ(θ) φ θ. 19
20 Example: Quantiles Definition: The quantile function of F isf 1 : (0,1) R, F 1 (p) = inf{x : F(x) p}. [PICTURE] Forp (0,1) and x R, F 1 (p) x p F(x), which implies the quantile transformation: for U uniform on(0,1), F 1 (U) F. 20
21 Example: Quantiles Also,F(F 1 (p)) p, with equality unless F has a discontinuity at F 1 (p). This implies the probability integral transformation: forx F, F(X) is uniform on [0,1] ifff is continuous onr, because F(X) = F(F 1 (U)) = U in that case. Finally, F 1 (F(x)) x, with equality unlessf is flat to the left of x. Thus, F 1 is an inverse (i.e., F 1 (F(x)) = x and F(F 1 (p)) = p for all x and p) ifff is continuous and strictly increasing. 21
22 Empirical quantile function For a sample with distribution function F, define the empirical quantile function as the quantile function Fn 1 of the empirical distribution function F n. where i is chosen such that F 1 n (p) = inf{x : F n (x) p} = X n(i), i 1 n < p i n, and X n(1),...,x n(n) are the order statistics of the sample, that is, X n(1) X n(n) and ( Xn(1),...,X n(n) ) is a permutation of the sample (X 1,...,X n ). 22
Asymptotic statistics using the Functional Delta Method
Quantiles, Order Statistics and L-Statsitics TU Kaiserslautern 15. Februar 2015 Motivation Functional The delta method introduced in chapter 3 is an useful technique to turn the weak convergence of random
More informationTheoretical Statistics. Lecture 17.
Theoretical Statistics. Lecture 17. Peter Bartlett 1. Asymptotic normality of Z-estimators: classical conditions. 2. Asymptotic equicontinuity. 1 Recall: Delta method Theorem: Supposeφ : R k R m is differentiable
More informationTheoretical Statistics. Lecture 22.
Theoretical Statistics. Lecture 22. Peter Bartlett 1. Recall: Asymptotic testing. 2. Quadratic mean differentiability. 3. Local asymptotic normality. [vdv7] 1 Recall: Asymptotic testing Consider the asymptotics
More informationLecture 16: Sample quantiles and their asymptotic properties
Lecture 16: Sample quantiles and their asymptotic properties Estimation of quantiles (percentiles Suppose that X 1,...,X n are i.i.d. random variables from an unknown nonparametric F For p (0,1, G 1 (p
More informationTheoretical Statistics. Lecture 23.
Theoretical Statistics. Lecture 23. Peter Bartlett 1. Recall: QMD and local asymptotic normality. [vdv7] 2. Convergence of experiments, maximum likelihood. 3. Relative efficiency of tests. [vdv14] 1 Local
More informationEmpirical Processes & Survival Analysis. The Functional Delta Method
STAT/BMI 741 University of Wisconsin-Madison Empirical Processes & Survival Analysis Lecture 3 The Functional Delta Method Lu Mao lmao@biostat.wisc.edu 3-1 Objectives By the end of this lecture, you will
More informationRobust Inference. A central concern in robust statistics is how a functional of a CDF behaves as the distribution is perturbed.
Robust Inference Although the statistical functions we have considered have intuitive interpretations, the question remains as to what are the most useful distributional measures by which to describe a
More informationEfficiency of Profile/Partial Likelihood in the Cox Model
Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows
More informationAsymptotic Distributions for the Nelson-Aalen and Kaplan-Meier estimators and for test statistics.
Asymptotic Distributions for the Nelson-Aalen and Kaplan-Meier estimators and for test statistics. Dragi Anevski Mathematical Sciences und University November 25, 21 1 Asymptotic distributions for statistical
More information7 Influence Functions
7 Influence Functions The influence function is used to approximate the standard error of a plug-in estimator. The formal definition is as follows. 7.1 Definition. The Gâteaux derivative of T at F in the
More informationX n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)
14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence
More informationRecall the Basics of Hypothesis Testing
Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE
More informationStochastic Convergence, Delta Method & Moment Estimators
Stochastic Convergence, Delta Method & Moment Estimators Seminar on Asymptotic Statistics Daniel Hoffmann University of Kaiserslautern Department of Mathematics February 13, 2015 Daniel Hoffmann (TU KL)
More informationP-adic Functions - Part 1
P-adic Functions - Part 1 Nicolae Ciocan 22.11.2011 1 Locally constant functions Motivation: Another big difference between p-adic analysis and real analysis is the existence of nontrivial locally constant
More informationIEOR 3106: Introduction to Operations Research: Stochastic Models. Fall 2011, Professor Whitt. Class Lecture Notes: Thursday, September 15.
IEOR 3106: Introduction to Operations Research: Stochastic Models Fall 2011, Professor Whitt Class Lecture Notes: Thursday, September 15. Random Variables, Conditional Expectation and Transforms 1. Random
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued
Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research
More informationAsymptotic distribution of the sample average value-at-risk
Asymptotic distribution of the sample average value-at-risk Stoyan V. Stoyanov Svetlozar T. Rachev September 3, 7 Abstract In this paper, we prove a result for the asymptotic distribution of the sample
More informationAn effective form of Rademacher s theorem
An effective form of Rademacher s theorem Alex Galicki 1 joint work with Daniel Turetsky 2 1 University of Auckland, 2 Kurt Gödel Research Center 12th June, 2014 Alex Galicki (Auckland) An effective form
More informationChapter 9: Hypothesis Testing Sections
Chapter 9: Hypothesis Testing Sections 9.1 Problems of Testing Hypotheses 9.2 Testing Simple Hypotheses 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.6 Comparing the Means of Two
More informationST745: Survival Analysis: Nonparametric methods
ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate
More informationORDER STATISTICS, QUANTILES, AND SAMPLE QUANTILES
ORDER STATISTICS, QUANTILES, AND SAMPLE QUANTILES 1. Order statistics Let X 1,...,X n be n real-valued observations. One can always arrangetheminordertogettheorder statisticsx (1) X (2) X (n). SinceX (k)
More informationLecture 4: September Reminder: convergence of sequences
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 4: September 6 In this lecture we discuss the convergence of random variables. At a high-level, our first few lectures focused
More informationReproducing Kernel Hilbert Spaces
Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 11, 2009 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert
More informationProbability Distribution And Density For Functional Random Variables
Probability Distribution And Density For Functional Random Variables E. Cuvelier 1 M. Noirhomme-Fraiture 1 1 Institut d Informatique Facultés Universitaires Notre-Dame de la paix Namur CIL Research Contact
More informationAsymptotic Statistics-VI. Changliang Zou
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationUses of Asymptotic Distributions: In order to get distribution theory, we need to norm the random variable; we usually look at n 1=2 ( X n ).
1 Economics 620, Lecture 8a: Asymptotics II Uses of Asymptotic Distributions: Suppose X n! 0 in probability. (What can be said about the distribution of X n?) In order to get distribution theory, we need
More informationChapter 7 Statistical Functionals and the Delta Method
Chapter 7 Statistical Functionals and the Delta Method. Estimators as Functionals of F n or P n 2. Continuity of Functionals of F or P 3. Metrics for Distribution Functions F and Probability Distributions
More informationsimple if it completely specifies the density of x
3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely
More information3 Compact Operators, Generalized Inverse, Best- Approximate Solution
3 Compact Operators, Generalized Inverse, Best- Approximate Solution As we have already heard in the lecture a mathematical problem is well - posed in the sense of Hadamard if the following properties
More informationAnalysis Qualifying Exam
Analysis Qualifying Exam Spring 2017 Problem 1: Let f be differentiable on R. Suppose that there exists M > 0 such that f(k) M for each integer k, and f (x) M for all x R. Show that f is bounded, i.e.,
More informationChapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics
Chapter 6 Order Statistics and Quantiles 61 Extreme Order Statistics Suppose we have a finite sample X 1,, X n Conditional on this sample, we define the values X 1),, X n) to be a permutation of X 1,,
More informationMeasure and Integration: Solutions of CW2
Measure and Integration: s of CW2 Fall 206 [G. Holzegel] December 9, 206 Problem of Sheet 5 a) Left (f n ) and (g n ) be sequences of integrable functions with f n (x) f (x) and g n (x) g (x) for almost
More informationReproducing Kernel Hilbert Spaces
Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 12, 2007 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert
More informationMonte Carlo Studies. The response in a Monte Carlo study is a random variable.
Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating
More informationFourth Week: Lectures 10-12
Fourth Week: Lectures 10-12 Lecture 10 The fact that a power series p of positive radius of convergence defines a function inside its disc of convergence via substitution is something that we cannot ignore
More informationTheoretical Statistics. Lecture 14.
Theoretical Statistics. Lecture 14. Peter Bartlett Metric entropy. 1. Chaining: Dudley s entropy integral 1 Recall: Sub-Gaussian processes Definition: A stochastic process θ X θ with indexing set T is
More informationMAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9
MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended
More informationTUTORIAL 8 SOLUTIONS #
TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level
More informationLecture 12. F o s, (1.1) F t := s>t
Lecture 12 1 Brownian motion: the Markov property Let C := C(0, ), R) be the space of continuous functions mapping from 0, ) to R, in which a Brownian motion (B t ) t 0 almost surely takes its value. Let
More informationStatistics 3657 : Moment Approximations
Statistics 3657 : Moment Approximations Preliminaries Suppose that we have a r.v. and that we wish to calculate the expectation of g) for some function g. Of course we could calculate it as Eg)) by the
More informationAsymptotics of minimax stochastic programs
Asymptotics of minimax stochastic programs Alexander Shapiro Abstract. We discuss in this paper asymptotics of the sample average approximation (SAA) of the optimal value of a minimax stochastic programming
More informationRecall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n
Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible
More informationInterpolation. Chapter Interpolation. 7.2 Existence, Uniqueness and conditioning
76 Chapter 7 Interpolation 7.1 Interpolation Definition 7.1.1. Interpolation of a given function f defined on an interval [a,b] by a polynomial p: Given a set of specified points {(t i,y i } n with {t
More informationExtreme Value Analysis and Spatial Extremes
Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models
More informationSTAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring
STAT 6385 Survey of Nonparametric Statistics Order Statistics, EDF and Censoring Quantile Function A quantile (or a percentile) of a distribution is that value of X such that a specific percentage of the
More informationLecture 3 January 16
Stats 3b: Theory of Statistics Winter 28 Lecture 3 January 6 Lecturer: Yu Bai/John Duchi Scribe: Shuangning Li, Theodor Misiakiewicz Warning: these notes may contain factual errors Reading: VDV Chater
More information6.1 Moment Generating and Characteristic Functions
Chapter 6 Limit Theorems The power statistics can mostly be seen when there is a large collection of data points and we are interested in understanding the macro state of the system, e.g., the average,
More informationSTAT Section 2.1: Basic Inference. Basic Definitions
STAT 518 --- Section 2.1: Basic Inference Basic Definitions Population: The collection of all the individuals of interest. This collection may be or even. Sample: A collection of elements of the population.
More informationChapter 6. Convergence. Probability Theory. Four different convergence concepts. Four different convergence concepts. Convergence in probability
Probability Theory Chapter 6 Convergence Four different convergence concepts Let X 1, X 2, be a sequence of (usually dependent) random variables Definition 1.1. X n converges almost surely (a.s.), or with
More informationSOLUTIONS TO MATH68181 EXTREME VALUES AND FINANCIAL RISK EXAM
SOLUTIONS TO MATH68181 EXTREME VALUES AND FINANCIAL RISK EXAM Solutions to Question A1 a) The marginal cdfs of F X,Y (x, y) = [1 + exp( x) + exp( y) + (1 α) exp( x y)] 1 are F X (x) = F X,Y (x, ) = [1
More informationVARIANCE REDUCTION BY SMOOTHING REVISITED BRIAN J E R S K Y (POMONA)
PROBABILITY AND MATHEMATICAL STATISTICS Vol. 33, Fasc. 1 2013), pp. 79 92 VARIANCE REDUCTION BY SMOOTHING REVISITED BY ANDRZEJ S. KO Z E K SYDNEY) AND BRIAN J E R S K Y POMONA) Abstract. Smoothing is a
More informationDifferentiable Functions
Differentiable Functions Let S R n be open and let f : R n R. We recall that, for x o = (x o 1, x o,, x o n S the partial derivative of f at the point x o with respect to the component x j is defined as
More information1 + lim. n n+1. f(x) = x + 1, x 1. and we check that f is increasing, instead. Using the quotient rule, we easily find that. 1 (x + 1) 1 x (x + 1) 2 =
Chapter 5 Sequences and series 5. Sequences Definition 5. (Sequence). A sequence is a function which is defined on the set N of natural numbers. Since such a function is uniquely determined by its values
More informationLecture 18: L-estimators and trimmed sample mean
Lecture 18: L-estimators and trimmed sample mean L-functional and L-estimator For a function J(t) on [0,1], define the L-functional as T (G) = xj(g(x))dg(x), G F. If X 1,...,X n are i.i.d. from F and T
More informationMAT 473 Intermediate Real Analysis II
MAT 473 Intermediate Real Analysis II John Quigg Spring 2009 revised February 5, 2009 Derivatives Here our purpose is to give a rigorous foundation of the principles of differentiation in R n. Much of
More informationStat 260/CS Learning in Sequential Decision Problems. Peter Bartlett
Stat 260/CS 294-102. Learning in Sequential Decision Problems. Peter Bartlett 1. Multi-armed bandit algorithms. Concentration inequalities. P(X ǫ) exp( ψ (ǫ))). Cumulant generating function bounds. Hoeffding
More informationReproducing Kernel Hilbert Spaces
Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 9, 2011 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert
More informationTheoretical Statistics. Lecture 25.
Theoretical Statistics. Lecture 25. Peter Bartlett 1. Relative efficiency of tests [vdv14]: Rescaling rates. 2. Likelihood ratio tests [vdv15]. 1 Recall: Relative efficiency of tests Theorem: Suppose that
More informationLecture 7. Sums of random variables
18.175: Lecture 7 Sums of random variables Scott Sheffield MIT 18.175 Lecture 7 1 Outline Definitions Sums of random variables 18.175 Lecture 7 2 Outline Definitions Sums of random variables 18.175 Lecture
More informationNorthwestern University Department of Electrical Engineering and Computer Science
Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability
More informationA Very Brief Summary of Statistical Inference, and Examples
A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random
More information7 Convergence in R d and in Metric Spaces
STA 711: Probability & Measure Theory Robert L. Wolpert 7 Convergence in R d and in Metric Spaces A sequence of elements a n of R d converges to a limit a if and only if, for each ǫ > 0, the sequence a
More informationLecture 3: Random variables, distributions, and transformations
Lecture 3: Random variables, distributions, and transformations Definition 1.4.1. A random variable X is a function from S into a subset of R such that for any Borel set B R {X B} = {ω S : X(ω) B} is an
More informationCS145: Probability & Computing
CS45: Probability & Computing Lecture 5: Concentration Inequalities, Law of Large Numbers, Central Limit Theorem Instructor: Eli Upfal Brown University Computer Science Figure credits: Bertsekas & Tsitsiklis,
More informationSection 8: Asymptotic Properties of the MLE
2 Section 8: Asymptotic Properties of the MLE In this part of the course, we will consider the asymptotic properties of the maximum likelihood estimator. In particular, we will study issues of consistency,
More information10.1 Sequences. Example: A sequence is a function f(n) whose domain is a subset of the integers. Notation: *Note: n = 0 vs. n = 1.
10.1 Sequences Example: A sequence is a function f(n) whose domain is a subset of the integers. Notation: *Note: n = 0 vs. n = 1 Examples: EX1: Find a formula for the general term a n of the sequence,
More informationLimiting Distributions
Limiting Distributions We introduce the mode of convergence for a sequence of random variables, and discuss the convergence in probability and in distribution. The concept of convergence leads us to the
More informationLecture 1 Measure concentration
CSE 29: Learning Theory Fall 2006 Lecture Measure concentration Lecturer: Sanjoy Dasgupta Scribe: Nakul Verma, Aaron Arvey, and Paul Ruvolo. Concentration of measure: examples We start with some examples
More informationIntroduction. Chapter 1. Contents. EECS 600 Function Space Methods in System Theory Lecture Notes J. Fessler 1.1
Chapter 1 Introduction Contents Motivation........................................................ 1.2 Applications (of optimization).............................................. 1.2 Main principles.....................................................
More informationChapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued
Chapter 3 sections 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions 3.6 Conditional
More informationA Very Brief Summary of Statistical Inference, and Examples
A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)
More informationSummary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016
8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying
More informationZ-estimators (generalized method of moments)
Z-estimators (generalized method of moments) Consider the estimation of an unknown parameter θ in a set, based on data x = (x,...,x n ) R n. Each function h(x, ) on defines a Z-estimator θ n = θ n (x,...,x
More informationProbability and Distributions
Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated
More informationf (r) (a) r! (x a) r, r=0
Part 3.3 Differentiation v1 2018 Taylor Polynomials Definition 3.3.1 Taylor 1715 and Maclaurin 1742) If a is a fixed number, and f is a function whose first n derivatives exist at a then the Taylor polynomial
More informationReproducing Kernel Hilbert Spaces
9.520: Statistical Learning Theory and Applications February 10th, 2010 Reproducing Kernel Hilbert Spaces Lecturer: Lorenzo Rosasco Scribe: Greg Durrett 1 Introduction In the previous two lectures, we
More informationEcon 508B: Lecture 5
Econ 508B: Lecture 5 Expectation, MGF and CGF Hongyi Liu Washington University in St. Louis July 31, 2017 Hongyi Liu (Washington University in St. Louis) Math Camp 2017 Stats July 31, 2017 1 / 23 Outline
More information1 Random Variable: Topics
Note: Handouts DO NOT replace the book. In most cases, they only provide a guideline on topics and an intuitive feel. 1 Random Variable: Topics Chap 2, 2.1-2.4 and Chap 3, 3.1-3.3 What is a random variable?
More information5 Introduction to the Theory of Order Statistics and Rank Statistics
5 Introduction to the Theory of Order Statistics and Rank Statistics This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More informationInfinite Series. 1 Introduction. 2 General discussion on convergence
Infinite Series 1 Introduction I will only cover a few topics in this lecture, choosing to discuss those which I have used over the years. The text covers substantially more material and is available for
More informationMachine Learning. Theory of Classification and Nonparametric Classifier. Lecture 2, January 16, What is theoretically the best classifier
Machine Learning 10-701/15 701/15-781, 781, Spring 2008 Theory of Classification and Nonparametric Classifier Eric Xing Lecture 2, January 16, 2006 Reading: Chap. 2,5 CB and handouts Outline What is theoretically
More informationLarge Sample Theory. Consider a sequence of random variables Z 1, Z 2,..., Z n. Convergence in probability: Z n
Large Sample Theory In statistics, we are interested in the properties of particular random variables (or estimators ), which are functions of our data. In ymptotic analysis, we focus on describing the
More information2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?
MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due 9/5). Prove that every countable set A is measurable and µ(a) = 0. 2 (Bonus). Let A consist of points (x, y) such that either x or y is
More informationLogical Connectives and Quantifiers
Chapter 1 Logical Connectives and Quantifiers 1.1 Logical Connectives 1.2 Quantifiers 1.3 Techniques of Proof: I 1.4 Techniques of Proof: II Theorem 1. Let f be a continuous function. If 1 f(x)dx 0, then
More informationSDS 321: Introduction to Probability and Statistics
SDS 321: Introduction to Probability and Statistics Lecture 14: Continuous random variables Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin www.cs.cmu.edu/
More informationThe integers. Chapter 3
Chapter 3 The integers Recall that an abelian group is a set A with a special element 0, and operation + such that x +0=x x + y = y + x x +y + z) =x + y)+z every element x has an inverse x + y =0 We also
More information2 2 + x =
Lecture 30: Power series A Power Series is a series of the form c n = c 0 + c 1 x + c x + c 3 x 3 +... where x is a variable, the c n s are constants called the coefficients of the series. n = 1 + x +
More informationSelf-Concordant Barrier Functions for Convex Optimization
Appendix F Self-Concordant Barrier Functions for Convex Optimization F.1 Introduction In this Appendix we present a framework for developing polynomial-time algorithms for the solution of convex optimization
More informationLecture 5: Hodge theorem
Lecture 5: Hodge theorem Jonathan Evans 4th October 2010 Jonathan Evans () Lecture 5: Hodge theorem 4th October 2010 1 / 15 Jonathan Evans () Lecture 5: Hodge theorem 4th October 2010 2 / 15 The aim of
More informationStatistics (1): Estimation
Statistics (1): Estimation Marco Banterlé, Christian Robert and Judith Rousseau Practicals 2014-2015 L3, MIDO, Université Paris Dauphine 1 Table des matières 1 Random variables, probability, expectation
More informationH 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that
Lecture 28 28.1 Kolmogorov-Smirnov test. Suppose that we have an i.i.d. sample X 1,..., X n with some unknown distribution and we would like to test the hypothesis that is equal to a particular distribution
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationChapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued
Chapter 3 sections Chapter 3 - continued 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions
More informationSTAT 3610: Review of Probability Distributions
STAT 3610: Review of Probability Distributions Mark Carpenter Professor of Statistics Department of Mathematics and Statistics August 25, 2015 Support of a Random Variable Definition The support of a random
More informationOn the mean connected induced subgraph order of cographs
AUSTRALASIAN JOURNAL OF COMBINATORICS Volume 71(1) (018), Pages 161 183 On the mean connected induced subgraph order of cographs Matthew E Kroeker Lucas Mol Ortrud R Oellermann University of Winnipeg Winnipeg,
More informationPhysics 129a Measure Theory Frank Porter
Physics 129a Measure Theory 071126 Frank Porter 1 Introduction The rigorous mathematical underpinning of much of what we have discussed, such as the construction of a proper Hilbert space of functions,
More informationHermite Interpolation
Jim Lambers MAT 77 Fall Semester 010-11 Lecture Notes These notes correspond to Sections 4 and 5 in the text Hermite Interpolation Suppose that the interpolation points are perturbed so that two neighboring
More informationAlmost Sure Convergence of the General Jamison Weighted Sum of B-Valued Random Variables
Acta Mathematica Sinica, English Series Feb., 24, Vol.2, No., pp. 8 92 Almost Sure Convergence of the General Jamison Weighted Sum of B-Valued Random Variables Chun SU Tie Jun TONG Department of Statistics
More information