Robust Inference
1 Robust Inference Although the statistical functions we have considered have intuitive interpretations, the question remains as to which distributional measures are most useful for describing a given distribution. In a simple case such as a normal distribution, the choices are obvious. For skewed distributions, or distributions that arise from mixtures of simpler distributions, the choices of useful distributional measures are not so obvious. A central concern in robust statistics is how a functional of a CDF behaves as the distribution is perturbed. If a functional is rather sensitive to small changes in the distribution, then one has more to worry about if the observations from the process of interest are contaminated with observations from some other process.
2 Sensitivity of Statistical Functions to Perturbations in the Distribution One of the most interesting things about a function (or a functional) is how its value varies as the argument is perturbed. Two key properties are continuity and differentiability. For the case in which the arguments are functions, the cardinality of the possible perturbations is greater than that of the continuum. We can be precise in discussions of continuity and differentiability of a functional $\Upsilon$ at a point (function) $F$ in a domain $\mathcal{F}$ by defining another set $\mathcal{D}$ consisting of difference functions over $\mathcal{F}$; that is, the set of functions $D = F_1 - F_2$ for $F_1, F_2 \in \mathcal{F}$.
3 Derivatives of Functionals The concept of differentiability for functionals is necessarily more complicated than for functions over real domains. For a functional $\Upsilon$ over the domain $\mathcal{F}$, we define three levels of differentiability at the function $F \in \mathcal{F}$. All definitions are in terms of a domain $\mathcal{D}$ of difference functions over $\mathcal{F}$, and a linear functional $\Lambda_F$ defined over $\mathcal{D}$ in a neighborhood of $F$. The first type of derivative is very general. The other two types depend on a metric $\rho$ on $\mathcal{F} \times \mathcal{F}$ induced by a norm on $\mathcal{F}$.
4 Derivatives of Functionals Gâteaux differentiable. $\Upsilon$ is Gâteaux differentiable at $F$ iff there exists a linear functional $\Lambda_F(D)$ over $\mathcal{D}$ such that for $t \in \mathbb{R}$ for which $F + tD \in \mathcal{F}$,
$$\lim_{t \to 0}\left(\frac{\Upsilon(F + tD) - \Upsilon(F)}{t} - \Lambda_F(D)\right) = 0.$$
$\rho$-Hadamard differentiable. $\Upsilon$ is $\rho$-Hadamard differentiable at $F$ iff there exists a linear functional $\Lambda_F(D)$ over $\mathcal{D}$ such that for any sequence $t_j \to 0$ in $\mathbb{R}$ and any sequence $D_j \in \mathcal{D}$ such that $\rho(D_j, D) \to 0$ and $F + t_j D_j \in \mathcal{F}$,
$$\lim_{j \to \infty}\left(\frac{\Upsilon(F + t_j D_j) - \Upsilon(F)}{t_j} - \Lambda_F(D_j)\right) = 0.$$
5 $\rho$-Fréchet differentiable. $\Upsilon$ is $\rho$-Fréchet differentiable at $F$ iff there exists a linear functional $\Lambda_F(D)$ over $\mathcal{D}$ such that for any sequence $F_j \in \mathcal{F}$ for which $\rho(F_j, F) \to 0$,
$$\lim_{j \to \infty} \frac{\Upsilon(F_j) - \Upsilon(F) - \Lambda_F(F_j - F)}{\rho(F_j, F)} = 0.$$
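For a concrete feel for the Gâteaux derivative, here is a small numeric sketch (my own illustration, not part of the notes, assuming SciPy is available): the mean functional $M$ at $F = \mathrm{N}(0,1)$ in the direction $D = G - F$ with $G = \mathrm{N}(1,1)$, for which the linear functional is $\Lambda_F(D) = \int y\,\mathrm{d}D(y) = 1$; the difference quotient approaches that value as $t \to 0$.

```python
from scipy import stats, integrate

def mean_from_cdf(F, lo=-50.0, hi=50.0):
    """Mean functional written on the CDF: integral of (1 - F) over [0, hi)
    minus integral of F over (lo, 0]."""
    upper, _ = integrate.quad(lambda y: 1.0 - F(y), 0.0, hi)
    lower, _ = integrate.quad(F, lo, 0.0)
    return upper - lower

F = stats.norm(0, 1).cdf
G = stats.norm(1, 1).cdf
D = lambda y: G(y) - F(y)          # a difference function in the domain of differences

for t in (0.1, 0.01, 0.001):
    Ft = lambda y, t=t: F(y) + t * D(y)   # F + tD is again a CDF here
    print(t, (mean_from_cdf(Ft) - mean_from_cdf(F)) / t)   # should approach Lambda_F(D) = 1
```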
6 Differentials of Functionals The linear functional $\Lambda_F$ is called the [Gâteaux | $\rho$-Hadamard | $\rho$-Fréchet] differential of $\Upsilon$ at $F$.
7 Perturbations In statistical applications using functionals defined on the CDF, we often consider a simple type of function in the neighborhood of the CDF. These are CDFs formed by adding a single mass point to the given distribution. For a given CDF $P(y)$, we can define a simple perturbation as
$$P_{x,\epsilon}(y) = (1-\epsilon)P(y) + \epsilon I_{[x,\infty)}(y), \qquad (1)$$
where $0 \le \epsilon \le 1$. We will refer to this distribution as an $\epsilon$-mixture distribution. The distribution with CDF $P$ is the reference distribution. (This, of course, is the distribution of interest, so I often refer to it without any qualification.)
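As a concrete illustration (my own sketch, assuming NumPy and SciPy; the standard normal reference and the specific $x$, $\epsilon$ values are arbitrary choices), equation (1) translates directly into code:

```python
import numpy as np
from scipy import stats

def mixture_cdf(y, x, eps, P=stats.norm.cdf):
    """CDF of the eps-mixture: (1 - eps) * P(y) + eps * I_[x, inf)(y)."""
    return (1.0 - eps) * P(y) + eps * (y >= x)

# Example: perturb a standard normal reference at x = 1.25 with eps = 0.1.
ys = np.linspace(-3, 3, 7)
print(mixture_cdf(ys, x=1.25, eps=0.1))
```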
8 Perturbations A simple interpretation of the perturbation in equation (1) is that it is the CDF of a mixture of a distribution with CDF $P$ and a degenerate distribution with a single mass point at $x$, which may or may not be in the support of the distribution. The extent of the perturbation depends on $\epsilon$; if $\epsilon = 0$, the distribution is the reference distribution. If the distribution with CDF $P$ is continuous with PDF $p$, the PDF of the mixture is
$$\mathrm{d}P_{x,\epsilon}(y)/\mathrm{d}y = (1-\epsilon)p(y) + \epsilon\,\delta(x - y),$$
where $\delta(\cdot)$ is the Dirac delta function. If the distribution is discrete, the probability mass function has nonzero probabilities (scaled by $(1-\epsilon)$) at each of the mass points associated with $P$, together with a mass point at $x$ with probability $\epsilon$.
9 PDFs and the CDF of the $\epsilon$-mixture Distribution [Figure: the left-hand graph shows the PDF $p(y)$ of a continuous reference distribution (solid line) and the PDF of the $\epsilon$-mixture distribution, $(1-\epsilon)p(y)$ together with the mass point of size $\epsilon$ at $x$ (dotted line); the right-hand graph shows the CDF $P_{x,\epsilon}(y)$, which jumps by $\epsilon$ at $x$.]
10 Perturbations A statistical function evaluated at $P_{x,\epsilon}$ compared to the function evaluated at $P$ allows us to determine the effect of the perturbation on the statistical function. For example, we can determine the mean of the distribution with CDF $P_{x,\epsilon}$ in terms of the mean $\mu$ of the reference distribution to be $(1-\epsilon)\mu + \epsilon x$. This is easily seen by thinking of the distribution as a mixture. For example, for the $M$ functional we have
$$M(P_{x,\epsilon}) = \int y \,\mathrm{d}\bigl((1-\epsilon)P(y) + \epsilon I_{[x,\infty)}(y)\bigr) = (1-\epsilon)\int y \,\mathrm{d}P(y) + \epsilon\int y\,\delta(y - x)\,\mathrm{d}y = (1-\epsilon)\mu + \epsilon x. \qquad (2)$$
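A quick Monte Carlo check of equation (2) (an illustrative sketch of my own, not from the notes), drawing from the mixture by selecting the degenerate component with probability $\epsilon$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, x, eps, n = 0.0, 1.25, 0.1, 1_000_000   # standard normal reference, mass point, weight

from_ref = rng.standard_normal(n)           # draws from the reference N(0, 1)
use_mass = rng.random(n) < eps              # with probability eps, take the mass point
draws = np.where(use_mass, x, from_ref)     # draws from the eps-mixture distribution

print(draws.mean())                         # close to (1 - eps)*mu + eps*x = 0.125
print((1 - eps) * mu + eps * x)
```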
11 Perturbations For a discrete distribution we would follow the same steps using summations (instead of an integral of y times a Dirac delta function, we just have a point mass of 1 at x), and would get the same result.
12 Quantiles under Perturbations The $\pi$ quantile of the mixture distribution, $\Xi_\pi(P_{x,\epsilon}) = P_{x,\epsilon}^{-1}(\pi)$, is somewhat more difficult to work out. This quantile, which we will call $q$, is shown relative to the $\pi$ quantile of the continuous reference distribution, $y_\pi$, for two cases.
13 Quantiles under Perturbations For example, if the reference distribution is a standard normal, $\pi = 0.7$, so $y_\pi = 0.52$, and $\epsilon = 0.1$, we have the graphs. [Figure: two panels showing the reference PDF $p(y)$ (solid) and the scaled PDF $(1-\epsilon)p(y)$ with the mass point of size $\epsilon$ (dotted); the mass point is at $x_1 < y_\pi$ in the left panel and at $x_2 > y_\pi$ in the right panel, with the mixture quantile $q$ marked in each.] In the left-hand graph, $x_1 = -1.25$, and in the right-hand graph, $x_2 = 1.25$.
14 Quantiles under Perturbations We see that in the case of a continuous reference distribution (implying $P$ is strictly increasing),
$$P_{x,\epsilon}^{-1}(\pi) = \begin{cases} P^{-1}\!\left(\dfrac{\pi-\epsilon}{1-\epsilon}\right), & \text{for } (1-\epsilon)P(x) + \epsilon < \pi,\\[1.5ex] x, & \text{for } (1-\epsilon)P(x) \le \pi \le (1-\epsilon)P(x) + \epsilon,\\[1.5ex] P^{-1}\!\left(\dfrac{\pi}{1-\epsilon}\right), & \text{for } \pi < (1-\epsilon)P(x). \end{cases} \qquad (3)$$
The conditions in equation (3) can also be expressed in terms of $x$ and quantiles of the reference distribution. For example, the first condition is equivalent to $x < y_{(\pi-\epsilon)/(1-\epsilon)}$.
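The three cases of equation (3) can be coded directly. The sketch below (my own, with the standard normal reference and the values $\pi = 0.7$, $\epsilon = 0.1$, $x_1 = -1.25$, $x_2 = 1.25$ from the example above) shows how a mass point below $y_\pi$ pulls the mixture quantile down while a mass point above pushes it up:

```python
import numpy as np
from scipy import stats

def mixture_quantile(pi, x, eps, P=stats.norm.cdf, Pinv=stats.norm.ppf):
    """pi quantile of the eps-mixture distribution, following the three cases of eq. (3)."""
    Px = P(x)
    if (1 - eps) * Px + eps < pi:          # mass point lies below the quantile
        return Pinv((pi - eps) / (1 - eps))
    elif pi < (1 - eps) * Px:              # mass point lies above the quantile
        return Pinv(pi / (1 - eps))
    else:                                  # the quantile sits at the mass point
        return x

pi, eps = 0.7, 0.1
print(stats.norm.ppf(pi))                       # y_pi, about 0.524
print(mixture_quantile(pi, x=-1.25, eps=eps))   # mass below y_pi pulls the quantile down
print(mixture_quantile(pi, x=1.25, eps=eps))    # mass above y_pi pushes the quantile up
```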
15 The Influence Function The extent of the perturbation depends on $\epsilon$, and so we are interested in the relative effect; in particular, the relative effect as $\epsilon$ approaches zero. The influence function for the functional $\Upsilon$ and the CDF $P$, defined at $x$ as
$$\phi_{\Upsilon,P}(x) = \lim_{\epsilon \downarrow 0} \frac{\Upsilon(P_{x,\epsilon}) - \Upsilon(P)}{\epsilon} \qquad (4)$$
if the limit exists, is a measure of the sensitivity of the distributional measure defined by $\Upsilon$ to a perturbation of the distribution at the point $x$. The influence function is also called the influence curve, and denoted by IC. The limit is the right-hand Gâteaux derivative of the functional $\Upsilon$ at $P$ and $x$.
16 The Influence Function The influence function can also be expressed as the limit of the derivative of $\Upsilon(P_{x,\epsilon})$ with respect to $\epsilon$:
$$\phi_{\Upsilon,P}(x) = \lim_{\epsilon \downarrow 0} \frac{\partial}{\partial \epsilon}\,\Upsilon(P_{x,\epsilon}). \qquad (5)$$
This form is often more convenient for evaluating the influence function.
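Equation (5) also suggests a simple numeric check: difference $\Upsilon(P_{x,\epsilon})$ in $\epsilon$ for small $\epsilon$. The sketch below is my own illustration; it uses the variance functional, whose influence function $(x-\mu)^2 - \sigma^2$ is a standard result not derived in these notes, together with the closed-form moments of the $\epsilon$-mixture.

```python
mu, sigma2 = 0.0, 1.0          # reference distribution N(0, 1)
x = 2.0                        # location of the perturbing mass point

def mixture_variance(eps):
    # first and second moments of the eps-mixture, then its variance
    m1 = (1 - eps) * mu + eps * x
    m2 = (1 - eps) * (sigma2 + mu**2) + eps * x**2
    return m2 - m1**2

for eps in (1e-2, 1e-4, 1e-6):
    approx = (mixture_variance(eps) - mixture_variance(0.0)) / eps
    print(eps, approx)         # approaches (x - mu)^2 - sigma^2 = 3.0
```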
17 The Influence Function for the M Functional Some influence functions are easy to work out, for example, the influence function for the $M$ functional that defines the mean of a distribution, which we denote by $\mu$. The influence function for this functional operating on the CDF $P$ at $x$ is
$$\phi_{\mu,P}(x) = \lim_{\epsilon \downarrow 0} \frac{M(P_{x,\epsilon}) - M(P)}{\epsilon} = \lim_{\epsilon \downarrow 0} \frac{(1-\epsilon)\mu + \epsilon x - \mu}{\epsilon} = x - \mu. \qquad (6)$$
18 The Influence Function We note that the influence function of a functional is a type of derivative of the functional, $\partial M(P_{x,\epsilon})/\partial\epsilon$. The influence function for other moments can be computed in the same way. Note that the influence function for the mean is unbounded in $x$; that is, it increases or decreases without bound as $x$ increases or decreases without bound. Note also that this result is the same for multivariate or univariate distributions.
19 The Influence Function for Quantiles The influence function for a quantile is more difficult to work out. The problem arises from the difficulty in evaluating the quantile. As I informally described the distribution with CDF $P_{x,\epsilon}$, it is a mixture of some given distribution and a degenerate discrete distribution. Even if the reference distribution is continuous, the CDF of the mixture, $P_{x,\epsilon}$, does not have an inverse over the full support (although for quantiles we will write $P_{x,\epsilon}^{-1}$). Let us consider a simple instance: a univariate continuous reference distribution, and assume $p(y_\pi) > 0$. We approach the problem by considering the PDF, or the probability mass function.
20 The Influence Function for Quantiles In the left-hand graph of the second figure, the total probability mass up to the point $y_\pi$ is $(1-\epsilon)$ times the area under the curve, that is, $(1-\epsilon)\pi$, plus the mass at $x_1$, that is, $\epsilon$. Assuming $\epsilon$ is small enough, the $\pi$ quantile of the $\epsilon$-mixture distribution is the $(\pi-\epsilon)/(1-\epsilon)$ quantile of the reference distribution, or $P^{-1}\bigl((\pi-\epsilon)/(1-\epsilon)\bigr)$. It is also the $\pi$ quantile of the scaled reference distribution; that is, it is the value of the function $(1-\epsilon)p(x)$ that corresponds to the proportion $\pi$ of the total probability $(1-\epsilon)$ of that component. Use of the definitions is somewhat messy. It is more straightforward to differentiate $P_{x_1,\epsilon}^{-1}$ and take the limit.
21 The Influence Function for Quantiles For fixed $x < y_\pi$, we have
$$\frac{\partial}{\partial\epsilon}\, P^{-1}\!\left(\frac{\pi-\epsilon}{1-\epsilon}\right) = \frac{1}{p\!\left(P^{-1}\!\left(\frac{\pi-\epsilon}{1-\epsilon}\right)\right)}\;\frac{\pi - 1}{(1-\epsilon)^2}.$$
Likewise, we take the derivatives for the other cases, and then take limits. We get
$$\phi_{\Xi_\pi,P}(x) = \begin{cases} \dfrac{\pi - 1}{p(y_\pi)}, & \text{for } x < y_\pi,\\[1.5ex] 0, & \text{for } x = y_\pi,\\[1.5ex] \dfrac{\pi}{p(y_\pi)}, & \text{for } x > y_\pi. \end{cases}$$
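This closed form can be checked numerically by differencing the mixture quantile from equation (3) at a small $\epsilon$ (an illustrative sketch of my own, under a standard normal reference; the particular values of $\pi$ and $x$ are assumptions for the example):

```python
import numpy as np
from scipy import stats

def mixture_quantile(pi, x, eps):
    """pi quantile of the eps-mixture with a standard normal reference, per eq. (3)."""
    Px = stats.norm.cdf(x)
    if (1 - eps) * Px + eps < pi:
        return stats.norm.ppf((pi - eps) / (1 - eps))
    if pi < (1 - eps) * Px:
        return stats.norm.ppf(pi / (1 - eps))
    return x

pi, eps = 0.7, 1e-6
y_pi = stats.norm.ppf(pi)
p_at = stats.norm.pdf(y_pi)

for x in (-1.25, 1.25):                          # one point below, one above y_pi
    num = (mixture_quantile(pi, x, eps) - y_pi) / eps
    exact = (pi - 1) / p_at if x < y_pi else pi / p_at
    print(x, round(num, 3), round(exact, 3))     # numeric and closed-form values agree
```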
22 The Influence Function for Quantiles Notice that the actual value of $x$ does not appear in the influence function; all that matters is whether $x$ is less than, equal to, or greater than the quantile. Notice also that, unlike the influence function for the mean, the influence function for a quantile is bounded; hence, a quantile is less sensitive than the mean to perturbations of the distribution. Likewise, quantile-based measures of scale and skewness are less sensitive than the moment-based measures to perturbations of the distribution. The $L_J$ and $M_\rho$ functionals, depending on $J$ or $\rho$, can also be very insensitive to perturbations of the distribution.
23 The mean and variance of the influence function at a random point are of interest; in particular, we may wish to restrict the functional so that
$$\mathrm{E}(\phi_{\Upsilon,P}(X)) = 0$$
and
$$\mathrm{E}\bigl((\phi_{\Upsilon,P}(X))^2\bigr) < \infty.$$
24 Sensitivity of Estimators Based on Statistical Functions If a distributional measure of interest is defined on the CDF as $\Upsilon(P)$, we are interested in the performance of the plug-in estimator $\Upsilon(P_n)$; specifically, we are interested in $\Upsilon(P_n) - \Upsilon(P)$. This turns out to depend crucially on the differentiability of $\Upsilon$. If we assume Gâteaux differentiability, we can write
$$\sqrt{n}\,\bigl(\Upsilon(P_n) - \Upsilon(P)\bigr) = \Lambda_P\bigl(\sqrt{n}(P_n - P)\bigr) + R_n = \frac{1}{\sqrt{n}}\sum_i \phi_{\Upsilon,P}(Y_i) + R_n,$$
where the remainder $R_n \to 0$.
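The linearization can be checked by simulation. The sketch below (my own, assuming NumPy and SciPy; the sample median of standard normal data is used purely for illustration) compares $\sqrt{n}\,(\Upsilon(P_n) - \Upsilon(P))$ with the linear term $\frac{1}{\sqrt{n}}\sum_i \phi_{\Upsilon,P}(Y_i)$ built from the quantile influence function derived above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, pi = 2000, 0.5
y_pi = stats.norm.ppf(pi)            # true median of the reference distribution (= 0)
p_at = stats.norm.pdf(y_pi)          # reference density at the median

y = rng.standard_normal(n)
phi = np.where(y < y_pi, (pi - 1) / p_at, pi / p_at)   # quantile influence function values

lhs = np.sqrt(n) * (np.median(y) - y_pi)   # sqrt(n) * (Upsilon(P_n) - Upsilon(P))
rhs = phi.sum() / np.sqrt(n)               # linear term of the expansion
print(lhs, rhs)                            # close to each other; the remainder R_n is small
```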
25 Convergence of Estimators We are interested in the stochastic convergence. First, we assume $\mathrm{E}(\phi_{\Upsilon,P}(X)) = 0$ and $\mathrm{E}\bigl((\phi_{\Upsilon,P}(X))^2\bigr) < \infty$. Then the question is the stochastic convergence of $R_n$. Gâteaux differentiability does not guarantee that $R_n$ converges fast enough. However, $\rho$-Hadamard differentiability does imply that $R_n$ is $o_P(1)$, because it implies that norms of functionals (with or without random arguments) go to 0. We can also get that $R_n$ is $o_P(1)$ by assuming $\Upsilon$ is $\rho$-Fréchet differentiable and that $\sqrt{n}\,\rho(P_n, P)$ is $O_P(1)$.
26 Convergence of Estimators Assuming either $\rho$-Hadamard or $\rho$-Fréchet differentiability, given the moment properties of $\phi_{\Upsilon,P}(X)$ and that $R_n$ is $o_P(1)$, we have by Slutsky's theorem
$$\sqrt{n}\,\bigl(\Upsilon(P_n) - \Upsilon(P)\bigr) \;\xrightarrow{d}\; \mathrm{N}\bigl(0, \sigma^2_{\Upsilon,P}\bigr),$$
where $\sigma^2_{\Upsilon,P} = \mathrm{E}\bigl((\phi_{\Upsilon,P}(X))^2\bigr)$.
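A simulation sketch of the limiting distribution (my own illustration, not from the notes): for the sample median under a standard normal reference, $\sigma^2_{\Upsilon,P} = \mathrm{E}\bigl((\phi_{\Xi_{0.5},P}(X))^2\bigr) = \tfrac{1}{4}/p(0)^2 \approx 1.571$, and the Monte Carlo variance of $\sqrt{n}$ times the sample median should be close to this value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, reps = 400, 5000
medians = np.median(rng.standard_normal((reps, n)), axis=1)   # sample medians across replicates
print(np.var(np.sqrt(n) * medians))       # Monte Carlo variance of sqrt(n) * median
print(0.25 / stats.norm.pdf(0.0) ** 2)    # theoretical sigma^2, about 1.571
```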
27 Asymptotic Variance of Estimators For a given plug-in estimator based on the statistical function $\Upsilon$, knowing $\mathrm{E}\bigl((\phi_{\Upsilon,P}(X))^2\bigr)$ (and assuming $\mathrm{E}(\phi_{\Upsilon,P}(X)) = 0$) provides us with an estimator of the asymptotic variance of the estimator.
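In practice this amounts to plugging estimates into $\mathrm{E}\bigl((\phi_{\Upsilon,P}(X))^2\bigr)$. The sketch below is my own illustration for the sample median, where $\mathrm{E}(\phi^2) = \pi(1-\pi)/p(y_\pi)^2$; the kernel density estimate of $p$ at the sample median is just one of several possible choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.standard_normal(500)

pi = 0.5
q_hat = np.quantile(y, pi)                    # sample median
p_hat = stats.gaussian_kde(y)(q_hat)[0]       # density estimate at the sample median

sigma2_hat = pi * (1 - pi) / p_hat ** 2       # plug-in estimate of E(phi^2)
print(sigma2_hat)                             # true value here is pi(1-pi)/p(y_pi)^2, about 1.571
print(np.sqrt(sigma2_hat / len(y)))           # approximate standard error of the sample median
```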
28 Robust Estimators The influence function is very important in leading us to estimators that are robust; that is, to estimators that are relatively insensitive to departures from the underlying assumptions about the distribution. As mentioned above, the functionals $L_J$ and $M_\rho$, depending on $J$ or $\rho$, can be very insensitive to perturbations of the distribution; therefore estimators based on them, called L-estimators and M-estimators, can be robust. A particularly useful class of L-estimators consists of linear combinations of the order statistics. Because of the sufficiency and completeness of the order statistics in many cases of interest, such estimators can be expected to exhibit good statistical properties.
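A simple example of such an L-estimator is a trimmed mean, written explicitly as a linear combination of the order statistics (an illustrative sketch of my own; the 10% trimming proportion and the contaminated sample are assumptions for the example):

```python
import numpy as np

def trimmed_mean(y, alpha=0.10):
    """alpha-trimmed mean as a linear combination of the order statistics."""
    y_sorted = np.sort(y)                      # the order statistics
    n = len(y_sorted)
    k = int(np.floor(alpha * n))               # number trimmed from each tail
    weights = np.zeros(n)
    weights[k:n - k] = 1.0 / (n - 2 * k)       # equal weights on the central order statistics
    return np.dot(weights, y_sorted)

rng = np.random.default_rng(2)
y = np.concatenate([rng.standard_normal(95), 20.0 * rng.standard_normal(5)])  # contaminated sample
print(np.mean(y), trimmed_mean(y))             # the L-estimator is less affected by the outliers
```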
29 Robust Estimators Another class of estimators similar to the L-estimators are those based on ranks, which are simpler than order statistics. These are not sufficient, since the data values have been converted to their ranks; nevertheless, they preserve a lot of the information. The fact that they lose some information can actually work in their favor; they can be robust to extreme values of the data. A functional to define even a simple linear combination of ranks is rather complicated. As with the $L_J$ functional, we begin with a function $J$, which in this case we require to be strictly increasing, and also, in order to ensure uniqueness, we require that the CDF $P$ be strictly increasing.
30 $R_J$ Estimators The $R_J$ functional is defined as the solution to the equation
$$\int J\!\left(\frac{P(y) + 1 - P(2R_J(P) - y)}{2}\right)\mathrm{d}P(y) = 0. \qquad (7)$$
A functional defined as the solution to this equation is called an $R_J$ functional, and an estimator based on applying it to an ECDF is called an $R_J$ estimator or just an R-estimator.
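Below is a rough sketch (my own illustration, not from the notes) of computing an R-estimator from equation (7) with the ECDF in place of $P$; the score $J(u) = u - \tfrac{1}{2}$ is an assumed choice, for which the solution should land close to the Hodges-Lehmann estimate (the median of pairwise averages).

```python
import numpy as np
from scipy import optimize

def r_estimate(y, J=lambda u: u - 0.5):
    """R-estimator solving eq. (7) with the ECDF; J(u) = u - 1/2 is an assumed score."""
    y = np.asarray(y)
    ecdf = lambda t: np.mean(y[:, None] <= np.atleast_1d(t), axis=0)

    def g(theta):
        # left-hand side of equation (7) with P replaced by the ECDF
        return np.sum(J((ecdf(y) + 1.0 - ecdf(2.0 * theta - y)) / 2.0))

    # g is nonincreasing in theta; bracket a sign change and solve for the root
    return optimize.brentq(g, y.min() - 1.0, y.max() + 1.0, xtol=1e-8)

rng = np.random.default_rng(3)
y = rng.standard_normal(51) + 1.0
hl = np.median((y[:, None] + y[None, :]) / 2.0)   # Hodges-Lehmann estimate, for comparison
print(r_estimate(y), hl)                          # the two should be close
```

Because the ECDF makes $g$ a step function, the root finder returns a point near where the sign changes rather than an exact zero; for this sketch that is good enough to illustrate the functional.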
More information