1 Glivenko-Cantelli type theorems
- Laurence Allen
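STA79 Lecture, Spring Semester

Given i.i.d. observations $X_1, \dots, X_n$ with unknown distribution function $F(t)$, consider the empirical (sample) CDF

$$\hat F_n(t) = \frac{1}{n}\sum_{i=1}^{n} I[X_i \le t].$$

Then, as $n \to \infty$,

$$\sup_{-\infty<t<\infty} \left|\hat F_n(t) - F(t)\right| \to 0 \quad \text{a.s.}$$

Without the sup (i.e. for a fixed $t$) this is just an ordinary LLN for Bernoulli r.v.s. The difficulty (and the usefulness) is in the sup.

Notice that $F(t) = P(X \le t) = P(X \in (-\infty, t])$, where $(-\infty, t]$ can be considered as a set $A$ (indexed by $t$). With $\mathcal{A} = \{(-\infty, t] : t \in \mathbb{R}\}$, the Glivenko-Cantelli theorem can be rewritten as

$$\sup_{A \in \mathcal{A}} \left|\int_A d[\hat F_n(s) - F(s)]\right| \to 0 \quad \text{a.s.}$$

Does the following convergence still hold when $A$ ranges over a more general collection $\mathcal{F}$ of Lebesgue measurable sets?

$$\sup_{A \in \mathcal{F}} \left|\int_A d[\hat F_n(s) - F(s)]\right| \to 0 \quad \text{a.s.}$$

We know the following:

(1) if $\mathcal{F} = \{(-\infty, t],\ t \in \mathbb{R}\}$, then the uniform convergence holds;
(1.5) if $\mathcal{F} = \{(a, b],\ \text{for any real } a < b\}$, then the uniform convergence holds;
(2) if $\mathcal{F} = \{\text{all measurable sets}\}$, then the uniform convergence does not hold in general (for a continuous $F$, the finite set $A = \{X_1, \dots, X_n\}$ has empirical mass 1 but $F$-mass 0);
(3) if $\mathcal{F}$ is a class of Vapnik-Chervonenkis (V-C) sets, then the uniform convergence holds.

We shall see that the key is that $\mathcal{F} \cap \{x_1, x_2, \dots, x_n\}$ should contain only polynomially many ($n^k$) different sets, not exponentially many ($2^n$).

1.1 The proof of the Glivenko-Cantelli theorem

Suppose $X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} F(t)$ and $Y_1, \dots, Y_n \overset{\text{i.i.d.}}{\sim} F(t)$ (same CDF). Also assume the $X$'s are independent of the $Y$'s. Let

$$\hat F_n(t) = \frac{1}{n}\sum_{i=1}^{n} I[X_i \le t] \quad \text{and} \quad F_n^*(t) = \frac{1}{n}\sum_{i=1}^{n} I[Y_i \le t].$$

Step 1: Symmetrization (see Page 4 of Pollard for details). For every $\epsilon > 0$, once $n$ is large enough that $P(|F_n^*(t) - F(t)| \le \epsilon/2) \ge 1/2$ for every $t$ (by Chebyshev's inequality, $n \ge 2/\epsilon^2$ suffices),

$$P\left(\sup_{-\infty<t<\infty} |\hat F_n(t) - F(t)| > \epsilon\right) \le 2P\left(\sup_{-\infty<t<\infty} \left|(\hat F_n(t) - F(t)) - (F_n^*(t) - F(t))\right| > \frac{\epsilon}{2}\right) = 2P\left(\sup_{-\infty<t<\infty} |\hat F_n(t) - F_n^*(t)| > \frac{\epsilon}{2}\right).$$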
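Since $\hat F_n(t)$ and $F_n^*(t)$ are piecewise constant functions, $\hat F_n(t) - F_n^*(t)$ takes at most $(2n+1)$ different values for $-\infty < t < \infty$.

Step 2: Turn the sup over infinitely many $t$ into a max over finitely many points $t_1, \dots, t_{2n+1}$, corresponding to the $(2n+1)$ different values.

$$2P\left(\sup_{-\infty<t<\infty} |\hat F_n(t) - F_n^*(t)| > \frac{\epsilon}{2}\right) = 2P\left(\max_{t = t_1, \dots, t_{2n+1}} |\hat F_n(t) - F_n^*(t)| > \frac{\epsilon}{2}\right) = 2P\left(\bigcup_{i=1}^{2n+1} \left\{|\hat F_n(t_i) - F_n^*(t_i)| > \frac{\epsilon}{2}\right\}\right) \le 2\sum_{i=1}^{2n+1} P\left(|\hat F_n(t_i) - F_n^*(t_i)| > \frac{\epsilon}{2}\right)$$

(by Boole's inequality). Strictly speaking the points $t_i$ depend on the sample; Pollard makes this step rigorous by conditioning on the combined sample and randomizing signs.

Step 3: Hoeffding's inequality (Pollard, 1984). Suppose $W_1, \dots, W_n$ are independent with $EW_i = 0$ (mean) and $a_i \le W_i \le b_i$ (bounded). Then for every $\eta > 0$,

$$P\left(|W_1 + W_2 + \dots + W_n| > \eta\right) \le 2\exp\left(-\frac{2\eta^2}{\sum_{i=1}^{n}(b_i - a_i)^2}\right).$$

Let $W_i = \frac{1}{n}\left(I[X_i \le t] - I[Y_i \le t]\right)$; then $-\frac{1}{n} \le W_i \le \frac{1}{n}$ and $E(W_i) = 0$. Thus Hoeffding's inequality can be applied to $\hat F_n(t_i) - F_n^*(t_i) = \sum_j W_j$, with $\eta = \epsilon/2$:

$$2\sum_{i=1}^{2n+1} P\left(|\hat F_n(t_i) - F_n^*(t_i)| > \frac{\epsilon}{2}\right) \le 2(2n+1) \cdot 2\exp\left(-\frac{2(\epsilon/2)^2}{\sum_{i=1}^{n}(2/n)^2}\right) = (8n+4)\,e^{-n\epsilon^2/8} \to 0 \quad \text{as } n \to \infty.$$

Remarks:

(1) The above inequality holds for any $\epsilon > 0$ and any $n$ (for small $n$ the right-hand side exceeds 1, so the bound is trivial). So we actually proved

$$P\left(\sup_{-\infty<t<\infty} |\hat F_n(t) - F(t)| > \epsilon\right) \le (8n+4)\,e^{-n\epsilon^2/8}. \qquad (1)$$

(2) It is worth noting how fast the bound $(8n+4)e^{-n\epsilon^2/8}$ goes to 0. For example, $\sum_{n=1}^{\infty}(8n+4)e^{-n\epsilon^2/8} < \infty$, so an application of the Borel-Cantelli lemma turns this into a.s. convergence; that is, the Glivenko-Cantelli convergence holds almost surely. This also works if we replace $(8n+4)$ with any polynomial in $n$, such as $n^k$.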
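The bound $(8n+4)e^{-n\epsilon^2/8}$ proved above can be compared with a simulated sup-deviation. The following is only a minimal Python sketch under stated assumptions: the helper names `sup_deviation` and `gc_bound` are hypothetical (not from the notes), $F$ is taken to be the Uniform(0,1) CDF, and the sample is assumed to have no ties.

```python
import math
import random


def sup_deviation(sample, cdf):
    """Exact sup_t |F_n(t) - F(t)| for a continuous CDF and a sample without ties."""
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # F_n jumps from i/n to (i+1)/n at x, so the sup is attained
        # either just before or exactly at one of the data points.
        worst = max(worst, abs((i + 1) / n - f), abs(i / n - f))
    return worst


def gc_bound(n, eps):
    """The bound (8n + 4) * exp(-n * eps^2 / 8) from the proof above."""
    return (8 * n + 4) * math.exp(-n * eps ** 2 / 8)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, purely for reproducibility
    for n in (100, 1000, 10000):
        sample = [rng.random() for _ in range(n)]
        dev = sup_deviation(sample, lambda x: x)  # Uniform(0,1): F(x) = x
        print(n, round(dev, 4), gc_bound(n, 0.2))
```

The bound is informative only once $n\epsilon^2/8$ is large compared with $\log(8n+4)$; for small $n$ it exceeds 1, which is consistent with Remark (1).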
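1.2 Generalizations

Many generalizations are possible.

1. The random variables $X_1, X_2, \dots, X_n$ need only be independent; they do not have to be identically distributed. The limiting distribution is then $\bar F_n(t) = \frac{1}{n}\sum_{i=1}^{n} F_i(t)$. (The limit is always obtained by replacing the random variables by their expectations.)

2. The constant $1/n$ may be replaced by other constants, or by a sequence of $n$ constants $a_1, a_2, \dots, a_n$. The result will be

$$P\left(\sup_{-\infty<t<\infty} \left|\sum_{i=1}^{n} a_i I[X_i \le t] - \sum_{i=1}^{n} a_i F_i(t)\right| > \epsilon\right) \le (8n+4)\exp\left(-\frac{\epsilon^2}{8\sum_{i=1}^{n} a_i^2}\right),$$

which reduces to the earlier bound $(8n+4)e^{-n\epsilon^2/8}$ when $a_i = 1/n$.

3. The limits do not have to be distribution functions. Any bounded non-random function will do, in particular a sub-distribution function. For example, the same argument handles

$$\sup_{t} \left|\sum_{i=1}^{n} a_i \left[I[X_i \le t, \delta_i = 1] - U_i(t)\right]\right|, \quad \text{where } U_i(t) = E\,I[X_i \le t, \delta_i = 1].$$

Exercise: Suppose, as $n \to \infty$, we have

$$\sup_{-\infty<t<\infty} \frac{1}{n}\left|N(t) - EN(t)\right| \to 0 \ \text{a.s.} \quad \text{and} \quad \sup_{-\infty<t<\infty} \frac{1}{n}\left|R(t) - ER(t)\right| \to 0 \ \text{a.s.}$$

Show that

$$\int_{-\infty}^{t} \frac{dN(s)}{R(s)} - \int_{-\infty}^{t} \frac{dEN(s)}{ER(s)} \to 0,$$

again uniformly for those $t$ such that $ER(t) > \eta > 0$.

Furthermore, suppose $g(t)$ is a function such that the integral in the limit below is well defined. Let $g_n(t)$ be a random sequence of functions such that

$$\sup_{-\infty<t<\infty} |g_n(t) - g(t)| \to 0 \quad \text{a.s.}$$

Show that

$$\int_{-\infty}^{t} \frac{g_n(s)\,dN(s)}{R(s)} - \int_{-\infty}^{t} \frac{g(s)\,dEN(s)}{ER(s)} \to 0,$$

again uniformly for those $t$ such that $ER(t) > \eta > 0$.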
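To see how the choice of constants $a_i$ in generalization 2 affects the bound, here is a small Python sketch. It assumes the exponent has the form $\epsilon^2 / (8\sum a_i^2)$, which is what Hoeffding's inequality gives when $1/n$ is replaced by $a_i$; the function name `weighted_gc_bound` is hypothetical.

```python
import math


def weighted_gc_bound(weights, eps):
    """(8n + 4) * exp(-eps^2 / (8 * sum a_i^2)) for constants a_1, ..., a_n."""
    n = len(weights)
    sum_sq = sum(a * a for a in weights)
    return (8 * n + 4) * math.exp(-eps ** 2 / (8 * sum_sq))


if __name__ == "__main__":
    n, eps = 2000, 0.3
    equal = [1.0 / n] * n              # the ordinary empirical CDF
    lopsided = [1.0] + [0.0] * (n - 1)  # all weight on one observation
    print(weighted_gc_bound(equal, eps))     # small: equals (8n+4)exp(-n eps^2/8)
    print(weighted_gc_bound(lopsided, eps))  # larger than 1: no information
```

The bound is driven by $\sum a_i^2$: it is tight when the weights are spread evenly and useless when a few observations carry almost all of the weight.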
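Let $\mathcal{F} = \{A_t = (-\infty, t],\ -\infty < t < \infty\}$. Then $A_t \cap \{x_1, \dots, x_n\}$ is one of

$$\emptyset,\ \{x_1\},\ \{x_1, x_2\},\ \dots,\ \{x_1, \dots, x_n\}$$

(WLOG assume the $x_i$'s are ordered). The number of all subsets of $\{x_1, \dots, x_n\}$ is $2^n$, but the number of sets of the form $A_t \cap \{x_1, \dots, x_n\}$ is only $(n+1)$.

In general, if the number of sets of the type $A \cap \{x_1, \dots, x_n\}$, $A \in \mathcal{A}$, is bounded by a polynomial function of $n$ (i.e. $O(n^k) \ll 2^n$), then the collection $\mathcal{A}$ is a V-C class of sets. For example, if $\mathcal{A} = \{A_{ab} = (a, b],\ -\infty < a < b < \infty\}$, then the number of sets of the type $A \cap \{x_1, \dots, x_n\}$ is $\frac{n(n+1)}{2} + 1$ (including the empty set). Therefore the sets $A_{ab} = (a, b]$ form a V-C class of sets.

Claim: If $\mathcal{F}$ is a V-C class of sets, then for every $\epsilon > 0$

$$P\left(\sup_{A \in \mathcal{F}} \left|\int I[A]\,d\hat F_n(t) - \int I[A]\,dF(t)\right| > \epsilon\right) \to 0 \quad \text{as } n \to \infty,$$

and the V-C property is essentially what is needed if this convergence is to hold whatever the underlying distribution $F$.

1.3 Applications

In the Cox model, consider the Breslow estimate of the baseline hazard and the Fisher information matrix.

$$\hat\Lambda_0(t) = \int_0^t \frac{\frac{1}{n}\,dN(s)}{\frac{1}{n}\sum_{i=1}^{n} I[Y_i \ge s]\,e^{\beta z_i}}$$

We focus on the denominator.

$$P\left(\sup_s \left|\frac{1}{n}\sum_{i=1}^{n} I[Y_i \ge s]\,e^{\beta z_i} - \frac{1}{n}\sum_{i=1}^{n} P(Y_i \ge s)\,e^{\beta z_i}\right| > \epsilon\right) \le (8n+4)\,e^{-n\epsilon^2/(8M^2)}$$

(Condition: the covariates $z_i$ are bounded, so that $e^{\beta z_i} \le M < \infty$.) This is generalization 2 of Section 1.2 with $a_i = e^{\beta z_i}/n$, so that $\sum a_i^2 \le M^2/n$. Here

$$P(Y_i \ge s) = P(T_i \ge s)\,P(C_i \ge s) = e^{-\Lambda_i(s)}[1 - G(s)] = e^{-\Lambda_0(s)e^{\beta z_i}}[1 - G(s)].$$

The Fisher information matrix is handled similarly, through

$$\frac{1}{n}\sum_{i=1}^{n} z_i^2\, I[Y_i \ge s]\,e^{\beta z_i}.$$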
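The counting argument behind the V-C definition on this page (half-lines pick out $n+1$ subsets, intervals $(a, b]$ pick out $n(n+1)/2 + 1$) can be checked by brute force. This is a minimal Python sketch; the function names are hypothetical, and the data points are assumed to be distinct.

```python
from itertools import combinations


def cut_points(points):
    """Candidate endpoints: one value below all data, each data point, one value above."""
    xs = sorted(set(points))
    return [xs[0] - 1.0] + xs + [xs[-1] + 1.0]


def count_half_lines(points):
    """Number of distinct sets (-inf, t] ∩ {x_1, ..., x_n}."""
    picked = {frozenset(x for x in points if x <= t) for t in cut_points(points)}
    return len(picked)


def count_intervals(points):
    """Number of distinct sets (a, b] ∩ {x_1, ..., x_n} with a < b."""
    cuts = cut_points(points)  # sorted, so combinations() yields pairs with a < b
    picked = {frozenset(x for x in points if a < x <= b)
              for a, b in combinations(cuts, 2)}
    return len(picked)


if __name__ == "__main__":
    pts = [0.3, 1.2, 2.5, 4.0, 5.1, 7.7]
    n = len(pts)
    print(count_half_lines(pts), n + 1)                # both 7
    print(count_intervals(pts), n * (n + 1) // 2 + 1)  # both 22
    print(2 ** n)                                      # 64 subsets in total
```

Only endpoints at, just below, or just above the data need to be tried, because moving an endpoint between two adjacent data points does not change the set that is picked out.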
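The Glivenko-Cantelli theorem can also be formulated for functions.

$$P\left(\sup_{f \in \mathcal{F}} \left|\int f(x)\,d\hat F_n(x) - \int f(x)\,dF(x)\right| > \epsilon\right) = P\left(\sup_{f \in \mathcal{F}} \left|\frac{1}{n}\sum_{i=1}^{n} f(x_i) - \int f(x)\,dF(x)\right| > \epsilon\right)$$

What is the condition on $\mathcal{F}$ that makes the above probability go to 0?

V-C class of functions: a class of functions whose graphs form a V-C class of sets, where the graph of $f$ is

$$\text{graph}(f) = \{(x, y) : f(x) > y\}.$$

More dimensions (example: 2 dimensions). Take $\mathcal{A}$ = rectangles. The number of sets of the type $A \cap \{x_1, \dots, x_n\}$ is a polynomial function of $n$, so the rectangles form a V-C class of sets. Hence the Glivenko-Cantelli convergence works in 2 dimensions, etc.

Homework: Is the following true? Prove it if it is true.

$$\sup_{t < x_{(n)}} \left|\frac{1 - \hat F_n(t)}{1 - F(t)} - 1\right| \xrightarrow{\text{a.s.}} 0$$

If not, what bound instead of $x_{(n)}$ will make the convergence hold?

Homework: Suppose $\hat\Lambda_n(t)$ is the Nelson-Aalen estimator based on $n$ right censored observations, and $\Lambda(t)$ is the true cumulative hazard. Assume $\Lambda(t)$ is continuous; also assume $\Lambda(t) \to \infty$ as $t \to \infty$. Show that, as $n \to \infty$,

$$\sup_{t \le M} \left|\hat\Lambda_n(t) - \Lambda(t)\right| \to 0$$

either in probability or almost surely. Any speed? Can we make it $\sup_{t < \infty}$?

Reference: Pollard, D. (1984). Convergence of Stochastic Processes. Springer.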
2 Empirical Likelihood and Bootstrap

The idea of the bootstrap: In the correspondence (or link) $\hat F_n(\cdot) \leftrightarrow (\hat\theta_n - \theta)$, the bootstrap applies a random perturbation to $\hat F_n$ and sees how $(\hat\theta_n - \theta)$ changes accordingly. Repeat this many times and you have a sampling distribution of $(\hat\theta_n - \theta)$. The random perturbation is obtained by random sampling (re-sampling) from $\hat F_n$.

The idea of empirical likelihood: In the correspondence $\hat F_n(\cdot) \leftrightarrow \hat\theta_n$, EL forces the statistic $\hat\theta_n$ to the value $\theta$ and finds the tilted $\hat F_n$ that corresponds to this perturbed $\hat\theta_n$. We denote the tilted distribution by $\hat F_n^{\lambda}$ for some nonzero $\lambda$. It turns out that

$$-2\log\frac{EL(\hat F_n^{\lambda})}{EL(\hat F_n)}$$

has a chi-square distribution, a pivotal distribution, when $\theta$ is the true value of the parameter.

Under the null hypothesis, the perturbation of $\hat\theta_n$ to $\theta$ is of order $1/\sqrt{n}$ (usually). In the bootstrap, the perturbation of $\hat F_n$ by re-sampling is also of order $1/\sqrt{n}$.

The differences: the bootstrap is a random perturbation but EL is a fixed perturbation, so the bootstrap usually needs simulation, repeated many times, and the results may differ slightly because of random re-sampling error.

On the other hand, the bootstrap can be applied to any statistic, but EL works most successfully when $\hat\theta$ is the NPMLE. (Has anyone tried it on a non-MLE?)

In some setups it may not be clear how a random perturbation should be applied to $\hat F$, because there are several plausible ways to do it. For EL, on the other hand, there is usually one clear way to set $\hat\theta$ to $\theta$.

The bootstrap needs to estimate a whole distribution (or its percentiles), while EL can rely on the fact that the distribution of the likelihood ratio is a pivotal chi-square.

The introduction of $\lambda$ turns the non-parametric problem into a parametric problem.
In the new parametric problem we are estimating a parameter $\lambda$ whose true value is zero; the information for $\lambda$ is just the negative second derivative of the log likelihood, and the MLE $\hat\lambda_n$ has an asymptotic normal distribution. This sub-family of parametric distributions is the so-called least favorable sub-family of distributions, an idea first proposed by Stone in
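To make the contrast between the two perturbations concrete, here is a minimal Python sketch for the simplest case, where $\theta$ is the mean and $\hat F_n$ is the ordinary empirical distribution. It is an illustration under assumptions, not the notes' own method: the tilted weights $w_i = 1/\{n[1 + \lambda(x_i - \mu)]\}$ are the standard empirical likelihood solution for a mean constraint, and the function names `bootstrap_means` and `el_for_mean` are hypothetical.

```python
import math
import random


def bootstrap_means(xs, reps, rng):
    """Random perturbation: re-sample from F_n and recompute the mean each time."""
    n = len(xs)
    return [sum(rng.choice(xs) for _ in range(n)) / n for _ in range(reps)]


def el_for_mean(xs, mu):
    """Fixed perturbation: tilt F_n so that its mean equals mu.

    Returns (lambda, tilted weights, -2 log EL ratio).
    """
    d = [x - mu for x in xs]
    if not (min(d) < 0 < max(d)):
        raise ValueError("mu must lie strictly inside the range of the data")
    # lambda must keep every 1 + lambda * d_i positive.
    lo, hi = -1.0 / max(d), -1.0 / min(d)
    margin = 1e-10 * (hi - lo)
    lo, hi = lo + margin, hi - margin

    def score(lam):
        # Decreasing in lam; its root enforces sum_i w_i (x_i - mu) = 0.
        return sum(di / (1.0 + lam * di) for di in d)

    for _ in range(200):  # plain bisection
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2.0
    n = len(xs)
    weights = [1.0 / (n * (1.0 + lam * di)) for di in d]
    stat = 2.0 * sum(math.log(1.0 + lam * di) for di in d)
    return lam, weights, stat


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 10.0]
    boot = sorted(bootstrap_means(data, 2000, random.Random(1)))
    print("bootstrap 2.5% and 97.5% points:", boot[50], boot[1949])
    lam, w, stat = el_for_mean(data, 3.0)
    # Compare stat with 3.84, the 0.95 quantile of chi-square with 1 df.
    print("lambda:", lam, "-2 log EL ratio:", stat)
```

The bootstrap output changes with the random seed, while the EL statistic is a deterministic function of the data and the hypothesized mean, which is the "random versus fixed perturbation" distinction drawn above.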
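For any (square integrable) function $\phi(t)$ and a distribution $F(\cdot)$, define

$$\bar\phi(t) = \bar\phi_F(t) = \frac{\int_{(t, \tau_F]} \phi(u)\,dF(u)}{1 - F(t)}, \quad \text{where } \tau_F = \sup\{x : F(x) < 1\}.$$

Theorem. Denote the Kaplan-Meier estimator based on $n$ i.i.d. observations by $\hat F_n$. We have

$$\frac{\int_{(t, \tau_{\hat F_n}]} \phi(u)\,d\hat F_n(u)}{1 - \hat F_n(t)} \to \frac{\int_{(t, \tau_F]} \phi(u)\,dF(u)}{1 - F(t)},$$

that is, $\bar\phi_{\hat F_n}(t) \to \bar\phi_F(t)$. The convergence is uniform and almost sure, i.e.

$$\sup_t \left|\bar\phi_{\hat F_n}(t) - \bar\phi_F(t)\right| \to 0 \quad \text{a.s.}$$

Theorem. Assume $\phi(t)$ is square integrable with respect to $F(t)$. Then we have

$$\int [\phi(t) - \bar\phi_{\hat F_n}(t)]^2\,d\hat F_n(t) \to \int [\phi(t) - \bar\phi_F(t)]^2\,dF(t).$$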
Akritas (2000) studied the central limit theorem for Kaplan-Meier integrals. There are earlier papers on the same topic, but the asymptotic variance expression of Akritas (2000) is new and interesting.

Theorem (Akritas 2000). The asymptotic variance of a Kaplan-Meier integral is

$$\text{AsyVar}\left(\sqrt{n}\int \phi(t)\,d\hat F_{KM}(t)\right) = \int_{-\infty}^{\tau} [\phi(t) - \bar\phi(t)]^2\,\frac{[1 - F(t)]\,dF(t)}{1 - H(t-)}.$$

A multivariate version of this theorem can be easily obtained. Denote $\Phi(t) = (\phi_1(t), \dots, \phi_k(t))$; then the asymptotic variance-covariance matrix of the $k$-vector of Kaplan-Meier integrals is

$$\text{AsyVarCov}\left(\sqrt{n}\int \Phi(t)\,d\hat F_{KM}(t)\right) = [\sigma_{ij}], \quad \text{with} \quad \sigma_{ij} = \int_{-\infty}^{\tau} [\phi_i(t) - \bar\phi_i(t)][\phi_j(t) - \bar\phi_j(t)]\,\frac{[1 - F(t)]\,dF(t)}{1 - H(t-)}.$$

This multivariate version can be obtained by using the representation of Akritas (2000), his Theorem 6.

An easier-to-check sufficient condition to ensure that the variances are well defined is

$$\int_{-\infty}^{\tau} \frac{\phi^2(s)\,dF(s)}{1 - G(s-)} < \infty.$$

When there is no censoring, the Kaplan-Meier estimator becomes the empirical distribution, and the integral with respect to the empirical distribution is just the i.i.d. summation (or average).

Finally, when there is no censoring, $H(s) = F(s)$, and the covariance formula of Akritas above simplifies:

$$\text{AsyCov}\left(\frac{1}{\sqrt{n}}\sum_{u=1}^{n} \phi_i(X_u),\ \frac{1}{\sqrt{n}}\sum_{u=1}^{n} \phi_j(X_u)\right)$$

can be written as

$$\int_{-\infty}^{\tau} [\phi_i(t) - \bar\phi_i(t)][\phi_j(t) - \bar\phi_j(t)]\,\frac{[1 - F(t)]\,dF(t)}{1 - F(t-)}.$$

On the other hand, the said covariance can obviously be written as

$$\int_{-\infty}^{\tau} [\phi_i(t) - E\phi_i][\phi_j(t) - E\phi_j]\,dF(t).$$

We therefore arrive at the following identity.

Lemma. For functions $\phi_i$ and $\phi_j$ that are square integrable with respect to $F(t)$ we have

$$\int_{-\infty}^{\tau} [\phi_i(t) - \bar\phi_i(t)][\phi_j(t) - \bar\phi_j(t)]\,\frac{[1 - F(t)]\,dF(t)}{1 - F(t-)} = \int_{-\infty}^{\tau} [\phi_i(t) - E\phi_i][\phi_j(t) - E\phi_j]\,dF(t).$$

When $E\phi_i = 0$ or $E\phi_j = 0$ or both, the above identity can be further simplified to

$$\int_{-\infty}^{\tau} [\phi_i(t) - \bar\phi_i(t)][\phi_j(t) - \bar\phi_j(t)]\,\frac{[1 - F(t)]\,dF(t)}{1 - F(t-)} = \int_{-\infty}^{\tau} \phi_i(t)\,\phi_j(t)\,dF(t).$$

We comment that this identity holds for any distribution $F(\cdot)$ (for a continuous $F$ the factor $[1 - F(t)]/[1 - F(t-)]$ equals 1); we will later use it when $F(t)$ is the Kaplan-Meier distribution.
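The lemma can be checked numerically on a discrete distribution, which is the relevant case when $F$ is later taken to be the Kaplan-Meier distribution. The sketch below is a hedged illustration: it assumes the denominator on the left side is read as the left limit $1 - F(t-)$, represents $F$ by probabilities on ordered atoms, and uses hypothetical function names.

```python
import math


def tilted_covariance(probs, phi_i, phi_j):
    """Left side of the lemma for a discrete F with atoms listed in increasing order.

    probs[k] is the mass at the k-th atom; phi_i[k], phi_j[k] are function values there.
    """
    total = 0.0
    for k in range(len(probs)):
        surv_after = sum(probs[k + 1:])  # 1 - F(x_k)
        surv_before = sum(probs[k:])     # 1 - F(x_k-)
        if surv_after == 0.0:
            continue  # the factor 1 - F(x_k) vanishes at the last atom
        # phi_bar(x_k): average of phi over the atoms strictly above x_k
        bar_i = sum(p * a for p, a in zip(probs[k + 1:], phi_i[k + 1:])) / surv_after
        bar_j = sum(p * a for p, a in zip(probs[k + 1:], phi_j[k + 1:])) / surv_after
        weight = surv_after / surv_before * probs[k]
        total += (phi_i[k] - bar_i) * (phi_j[k] - bar_j) * weight
    return total


def ordinary_covariance(probs, phi_i, phi_j):
    """Right side of the lemma: Cov(phi_i(X), phi_j(X)) under F."""
    mean_i = sum(p * a for p, a in zip(probs, phi_i))
    mean_j = sum(p * a for p, a in zip(probs, phi_j))
    return sum(p * (a - mean_i) * (b - mean_j)
               for p, a, b in zip(probs, phi_i, phi_j))


if __name__ == "__main__":
    probs = [0.2, 0.5, 0.3]
    phi_i = [1.0, -2.0, 4.0]
    phi_j = [0.5, 3.0, -1.0]
    left = tilted_covariance(probs, phi_i, phi_j)
    right = ordinary_covariance(probs, phi_i, phi_j)
    print(left, right, math.isclose(left, right, abs_tol=1e-12))
```

For a two-atom distribution with masses $p$ and $1-p$ both sides reduce to $p(1-p)[\phi(x_1) - \phi(x_2)]^2$, which is a quick hand check of the same identity.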
A general empirical likelihood theorem. Consider a sample of $n$ independent observations whose distribution belongs to a family $\mathcal{F}_n(\beta)$, where $\beta$ is a finite-dimensional parameter; $\mathcal{F}_n$ can be nonparametric. If there exists a distribution $\tilde F_n$ such that $\mathcal{F}_n \ll \tilde F_n$, that is, all distributions in the family are dominated by a single distribution $\tilde F_n$ (which may depend on $n$), then empirical likelihood works for testing the finite-dimensional parameter $\beta$ of the distributions in $\mathcal{F}_n$.
More informationVerifying Regularity Conditions for Logit-Normal GLMM
Verifying Regularity Conditions for Logit-Normal GLMM Yun Ju Sung Charles J. Geyer January 10, 2006 In this note we verify the conditions of the theorems in Sung and Geyer (submitted) for the Logit-Normal
More informationEmpirical Processes & Survival Analysis. The Functional Delta Method
STAT/BMI 741 University of Wisconsin-Madison Empirical Processes & Survival Analysis Lecture 3 The Functional Delta Method Lu Mao lmao@biostat.wisc.edu 3-1 Objectives By the end of this lecture, you will
More informationExercises. (a) Prove that m(t) =
Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for
More informationResiduals and model diagnostics
Residuals and model diagnostics Patrick Breheny November 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/42 Introduction Residuals Many assumptions go into regression models, and the Cox proportional
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationNorthwestern University Department of Electrical Engineering and Computer Science
Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability
More informationSpring 2014 Advanced Probability Overview. Lecture Notes Set 1: Course Overview, σ-fields, and Measures
36-752 Spring 2014 Advanced Probability Overview Lecture Notes Set 1: Course Overview, σ-fields, and Measures Instructor: Jing Lei Associated reading: Sec 1.1-1.4 of Ash and Doléans-Dade; Sec 1.1 and A.1
More informationProblem Selected Scores
Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected
More informationResearch Article Exponential Inequalities for Positively Associated Random Variables and Applications
Hindawi Publishing Corporation Journal of Inequalities and Applications Volume 008, Article ID 38536, 11 pages doi:10.1155/008/38536 Research Article Exponential Inequalities for Positively Associated
More informationReview. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with
More informationStat 710: Mathematical Statistics Lecture 31
Stat 710: Mathematical Statistics Lecture 31 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 31 April 13, 2009 1 / 13 Lecture 31:
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationChapter 2. Discrete Distributions
Chapter. Discrete Distributions Objectives ˆ Basic Concepts & Epectations ˆ Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric Distributions ˆ Introduction to the Maimum Likelihood Estimation
More informationTwo-stage Adaptive Randomization for Delayed Response in Clinical Trials
Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal
More information14.30 Introduction to Statistical Methods in Economics Spring 2009
MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationEmpirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm
Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Mai Zhou 1 University of Kentucky, Lexington, KY 40506 USA Summary. Empirical likelihood ratio method (Thomas and Grunkmier
More informationChapter 3. Point Estimation. 3.1 Introduction
Chapter 3 Point Estimation Let (Ω, A, P θ ), P θ P = {P θ θ Θ}be probability space, X 1, X 2,..., X n : (Ω, A) (IR k, B k ) random variables (X, B X ) sample space γ : Θ IR k measurable function, i.e.
More informationModeling and Analysis of Recurrent Event Data
Edsel A. Peña (pena@stat.sc.edu) Department of Statistics University of South Carolina Columbia, SC 29208 New Jersey Institute of Technology Conference May 20, 2012 Historical Perspective: Random Censorship
More information1 The Glivenko-Cantelli Theorem
1 The Glivenko-Cantelli Theorem Let X i, i = 1,..., n be an i.i.d. sequence of random variables with distribution function F on R. The empirical distribution function is the function of x defined by ˆF
More information