Analysis methods of heavy-tailed data


1 PhD course. Institute of Control Sciences, Russian Academy of Sciences, Moscow, Russia. February 13-18, 2006, Bamberg, Germany; June 19-23, 2006, Brest, France; May 14-19, 2007, Trondheim, Norway.

2 Chapter 3. Heavy-tailed density estimation. Combined parametric-nonparametric methods, Barron's estimate and χ²-optimality. Kernel estimators with variable bandwidth and their smoothing methods: cross-validation based on the weighted integrated squared error (WISE) and the discrepancy method. Re-transformed nonparametric estimators.

3 In Section 3 the problems of heavy-tailed density estimation are discussed. Three approaches are considered. 1 Combined parametric-nonparametric methods, where the tail domain of the density is fitted by some parametric model and the main part of the density (the body) is fitted by some nonparametric method such as a histogram. A similar approach, realized by Barron's estimator, is also considered. 2 Kernel estimates with variable bandwidth. The optimal accuracy of these estimates as well as their disadvantages for heavy-tailed density estimation are discussed. 3 Re-transformed estimates, which use a preliminary transformation of the underlying random variable into a new one whose density is more convenient to estimate.

4 Specific features of the analysis of heavy-tailed distributions are the following: a heavy tail goes to zero slower than at an exponential rate; Cramér's condition is violated; observations in the tail domain of the distribution are sparse. Aim: non-parametric PDF estimation with accurate tail behavior. Comparison of PDFs is needed in classification problems: classification of measurements belonging to different sources (mobile, fax, normal calls, Internet, ...); classification of services using customers' behavior.

5 Example of heavy-tailed density estimation: the Fréchet PDF, estimation of the body and of the tail.

6 Statement of the problem. Combined parametric-non-parametric estimators with separate estimation of the tail and the body of the PDF. The standard kernel estimator
$$\hat f_h(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)$$
produces peaks in the tail domain or over-smooths the main part of the PDF for finite samples.
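A minimal sketch of the standard kernel estimator above, using NumPy with a Gaussian kernel and a normal-reference bandwidth (both illustrative choices, not prescribed by the slides):

```python
import numpy as np

def kde(x_grid, sample, h, kernel=lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)):
    """Standard fixed-bandwidth estimator f_h(x) = (1/(n h)) sum_i K((x - X_i)/h)."""
    u = (x_grid[:, None] - sample[None, :]) / h          # (len(grid), n) scaled distances
    return kernel(u).sum(axis=1) / (len(sample) * h)

# Example on a heavy-tailed (Pareto) sample: a single global h over-smooths the body
# or produces spurious peaks in the tail, as noted above.
rng = np.random.default_rng(0)
sample = rng.pareto(1.0, size=100)                       # Pareto sample with tail index gamma = 1
h = 1.06 * sample.std() * len(sample) ** (-1 / 5)        # normal-reference rule of thumb
grid = np.linspace(0.0, 20.0, 400)
density = kde(grid, sample, h)
```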

7 Statement of the problem. The variable bandwidth kernel estimator
$$\hat f_A(x) = \frac{1}{nh}\sum_{i=1}^{n} f(X_i)^{1/2}\, K\!\left(\frac{(x - X_i)\,f(X_i)^{1/2}}{h}\right),$$
and its practical version
$$\tilde f_A(x) = \frac{1}{nh}\sum_{i=1}^{n} \hat f(X_i)^{1/2}\, K\!\left(\frac{(x - X_i)\,\hat f(X_i)^{1/2}}{h}\right),$$
where $\hat f(X_i)$ is a pilot estimate of $f(X_i)$, e.g. a standard kernel estimate. Advantage over the standard kernel estimator: local adaptation to the sample by means of the local bandwidth $h\,\hat f(X_i)^{-1/2}$ with a fixed $h$.
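A self-contained sketch of the practical variable bandwidth estimator, with a standard kernel estimate as the pilot; the Gaussian kernel, the pilot bandwidth and the clipping of the pilot away from zero are implementation assumptions:

```python
import numpy as np

def variable_bandwidth_kde(x_grid, sample, h, pilot_h,
                           kernel=lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)):
    """Abramson-type estimator: bandwidth h / f_pilot(X_i)^(1/2) at each data point X_i."""
    n = len(sample)
    # Pilot estimate evaluated at the data points themselves (standard kernel estimate).
    pilot = kernel((sample[:, None] - sample[None, :]) / pilot_h).sum(axis=1) / (n * pilot_h)
    pilot = np.maximum(pilot, 1e-12)                     # avoid division by zero in sparse regions
    w = np.sqrt(pilot)                                   # f_hat(X_i)^(1/2)
    u = (x_grid[:, None] - sample[None, :]) * w[None, :] / h
    return (w[None, :] * kernel(u)).sum(axis=1) / (n * h)

# Usage: density = variable_bandwidth_kde(grid, sample, h=0.5, pilot_h=1.0)
```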

8 Problems of kernel estimates for PDFs with finite support. Boundary effects of kernel estimates. Epanechnikov's kernel, $Y_{(n)} = 0.8$: for $h_1 < 1 - Y_{(n)}$ the kernel is truncated; for $h_2 = 1 - Y_{(n)}$ the kernel corresponds to a triangular PDF in the neighborhood of 1; for $h_3 > 1 - Y_{(n)}$ the PDF is over-smoothed.

9 Outline. Heavy-tailed density estimation: 1 the combined approach; 2 variable bandwidth kernel estimators; 3 the usage of the transform-re-transform scheme. Boundary kernels. The discrepancy method and cross-validation as smoothing tools for a variable bandwidth kernel estimator.

10 Main assumption: the asymptotic behavior of $F(x)$ at infinity is based on the asymptotic limit distribution of the maximum of the sample. Gnedenko (1943): if $F(x)$ is such that the limit distribution of the maximum $M_n = \max(X_1, X_2, \ldots, X_n)$ exists, then this limit distribution can only be of the following form for some normalizing constants $a_n > 0$, $b_n \in \mathbb{R}$:
$$P\{(M_n - b_n)/a_n \le x\} = F^n(b_n + a_n x) \xrightarrow[n \to \infty]{} H_\gamma(x), \quad x \in \mathbb{R},$$
where
$$H_\gamma(x) = \begin{cases} \exp(-x^{-1/\gamma}), & x > 0,\ \gamma > 0 \quad \text{(Fréchet)},\\ \exp(-(-x)^{-1/\gamma}), & x < 0,\ \gamma < 0 \quad \text{(Weibull)},\\ \exp(-e^{-x}), & x \in \mathbb{R},\ \gamma = 0 \quad \text{(Gumbel)}.\end{cases}$$
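For reference, the three limit laws $H_\gamma$ can be evaluated directly; the sketch below only encodes the formulas above:

```python
import numpy as np

def gev_limit(x, gamma):
    """H_gamma(x): Frechet (gamma > 0), Weibull (gamma < 0), Gumbel (gamma == 0)."""
    x = np.asarray(x, dtype=float)
    if gamma == 0:
        return np.exp(-np.exp(-x))                       # Gumbel, x in R
    if gamma > 0:                                        # Frechet, support x > 0
        xp = np.where(x > 0, x, 1.0)                     # placeholder avoids invalid powers
        return np.where(x > 0, np.exp(-xp ** (-1.0 / gamma)), 0.0)
    xm = np.where(x < 0, -x, 1.0)                        # Weibull, support x < 0; H = 1 for x >= 0
    return np.where(x < 0, np.exp(-xm ** (-1.0 / gamma)), 1.0)
```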

11 Combined estimators for heavy-tailed densities. Combined parametric-nonparametric method:
$$f_{\gamma,N}(t) = \begin{cases} f_N(t), & t \in [0, X_{(n-k)}],\\ f_\gamma(t), & t \in (X_{(n-k)}, \infty),\end{cases}$$
where $X_{(n-k)}$ is some r.v. (an order statistic used as the threshold),
$$f_\gamma(x) = (1/\gamma)\,x^{-1/\gamma - 1} + (2/\gamma)\,x^{-2/\gamma - 1}$$
is the parametric tail model of Pareto type, and
$$f_N(t) = \frac{1}{X_{(n-k)}}\sum_{j=1}^{N} \lambda_j\, \varphi_j\!\left(\frac{t}{X_{(n-k)}}\right)$$
is the non-parametric estimator of the main part of the PDF, given by an expansion in basis functions $\varphi_j(t)$, $j = 1, 2, \ldots$.

12 Estimation of mixtures of two PDFs by the combined estimator. Estimation of the PDF of a mixture of Gamma and Pareto distributions (left) and of two Gamma distributions (right) by the combined estimator.

13 Barron's estimator and χ²-optimality. Let $P_n = \{A_{n1}, \ldots, A_{nm_n}\}$ be a partition of the real line $(0, \infty)$ into finite intervals (bins) by the quantiles $G^{-1}(j/m_n)$, $1 \le j \le m_n - 1$, of an arbitrary distribution $G(x)$, and let
$$\delta_j = \int_{A_{nj}} dF_n(x) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}_{A_{nj}}(X_i),$$
where $n$ is the sample size. The estimator (Barron, Györfi, van der Meulen, 1992) is
$$\hat f_B(x) = g(x)\,\frac{1/n + \delta_j}{1/n + 1/m_n}, \quad x \in A_{nj},\ 1 \le j \le m_n,$$
where $g(x)$ is a tail model. A histogram-type estimate on $[0, X_{(k)}]$ is superposed with the tail model $g(x)$. The estimate is consistent in the sense of the χ²-divergence if $m_n \to \infty$ and $m_n/n \to 0$ as $n \to \infty$.
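A sketch of Barron's estimator for a user-supplied auxiliary distribution $G$ (its density and quantile function); the exponential choice of $G$ in the example, and the value of $m$, are purely illustrative:

```python
import numpy as np

def barron_estimate(x, sample, m, g_pdf, g_ppf):
    """Barron-type estimate f_B(x) = g(x) * (1/n + delta_j) / (1/n + 1/m) for x in bin A_nj,
    with bins built from the quantiles G^{-1}(j/m), j = 1, ..., m-1, of the auxiliary DF G."""
    n = len(sample)
    edges = np.concatenate(([0.0], g_ppf(np.arange(1, m) / m), [np.inf]))  # m bins on (0, inf)
    counts, _ = np.histogram(sample, bins=edges)
    delta = counts / n                                   # empirical mass delta_j of bin A_nj
    j = np.clip(np.searchsorted(edges, np.asarray(x), side="right") - 1, 0, m - 1)
    return g_pdf(np.asarray(x)) * (1.0 / n + delta[j]) / (1.0 / n + 1.0 / m)

# Example with an exponential auxiliary distribution G (illustrative choice of tail model).
lam = 1.0
g_pdf = lambda x: lam * np.exp(-lam * x)
g_ppf = lambda q: -np.log(1.0 - np.asarray(q)) / lam
rng = np.random.default_rng(3)
sample = rng.pareto(1.0, size=200)
f_B = barron_estimate(np.linspace(0.1, 10.0, 100), sample, m=20, g_pdf=g_pdf, g_ppf=g_ppf)
```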

14 Problems of $\hat f_B(x)$: optimal selection of the partition; optimal selection of $g(x)$. The behavior of the true DF for $x > X_{(n)}$ is unknown; one has to apply asymptotic results of extreme value theory regarding the behavior of the DF at infinity. Examples of the auxiliary distribution $G(x)$ are the lognormal, normal and Weibull distributions. The choice of the auxiliary density $g(x) = G'(x)$ strongly influences the estimate in the tail domain $x \in A_{nm_n} = [X_{(k)}, \infty)$:
$$1 - \hat F(x) = \frac{1/n + \delta_{m_n}}{1/n + 1/m_n}\int_x^{\infty} g(t)\,dt = \frac{1/n + \delta_{m_n}}{1/n + 1/m_n}\,(1 - G(x)).$$
For samples of moderate size the tail model $g(x)$ distorts the estimate of the body of the PDF; for large samples this influence becomes weaker.

15 Kernel estimates with the variable bandwidth. Let $X^n = \{X_1, \ldots, X_n\}$ be a sample of i.i.d. r.v.s distributed with a heavy-tailed DF $F(x)$ and PDF $f(x)$. The variable bandwidth kernel estimate (Abramson, 1982) is
$$\hat f_A(x \mid h) = (nh)^{-1}\sum_{i=1}^{n} f(X_i)^{1/2}\, K\!\big((x - X_i)\,f(X_i)^{1/2}/h\big).$$
Its practical version is
$$\tilde f_A(x \mid h_1, h) = (nh)^{-1}\sum_{i=1}^{n} \hat f_{h_1}(X_i)^{1/2}\, K\!\big((x - X_i)\,\hat f_{h_1}(X_i)^{1/2}/h\big).$$
Main advantages: non-negativity; the best mean squared error rate.

16 Mean squared errors (MSE) for kernel estimates:
$$\mathrm{MSE} = E\int \big(\hat f_h(x) - f(x)\big)^2\,dx.$$
MSE of a standard kernel estimate: $\mathrm{MSE}(\hat f_h) \sim n^{-4/5}$ (bias $\sim h^2$, variance $\sim (nh)^{-1}$) if a second-order kernel is used, $h \sim n^{-1/5}$ and $f$ has two continuous derivatives.
MSE of a variable bandwidth kernel estimate: $\mathrm{MSE}(\hat f_A(x \mid h)) \sim n^{-8/9}$ (bias $\sim h^4$, variance $\sim (nh)^{-1}$) if a symmetric kernel with $\int x^4 K(x)\,dx < \infty$ is used, $h \sim n^{-1/9}$ and $f$ has four continuous derivatives.

17 Cross-validation for a variable bandwidth kernel estimator (P. Hall, 1992). The weighted integrated squared error is
$$\mathrm{WISE} = \int f_{-i}(x; h)^2\,\omega(x)\,dx - 2\int f_{-i}(x; h)\,f(x)\,\omega(x)\,dx,$$
where
$$f_{-i}(x; h) = \frac{1}{nh^p}\sum_{j=1,\ j \ne i}^{n} \hat f_{-i}(X_j, h_1)^{p/2}\, K\!\left(\frac{(x - X_j)\,\hat f_{-i}(X_j, h_1)^{1/2}}{h}\right),$$
$\hat f_{-i}(\cdot, h_1)$ is a leave-one-out pilot estimate built with the kernel $K_1\!\left(\frac{x - X_j}{Ah}\right)$, $A > 0$, and $\omega(x)$ is a bounded, nonnegative function (a weight).

18 Cross-validation for a variable bandwidth kernel estimator (P. Hall, 1992). Example of the weight function:
$$\omega(x) = \begin{cases} 1, & \|\Sigma^{-1/2}(x - \mu)\|^2 \le z_\eta,\\ 0, & \text{otherwise},\end{cases}$$
where $\mu$ and $\Sigma$ denote the sample mean and variance, $\|\cdot\|$ is the Euclidean distance, and $z_\eta$ is the upper $(1-\eta)$-level critical point of the chi-squared distribution. Practical version:
$$\widehat{\mathrm{WISE}} = \int f_{-i}(x; h)^2\,\omega(x)\,dx - \frac{2}{n}\sum_{i=1}^{n} f_{-i}(X_i; h)\,\omega(X_i).$$
How good is $h$?
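A sketch of the cross-validation criterion in one dimension ($p = 1$), with two simplifications that are assumptions of this sketch rather than part of Hall's procedure: the squared term uses the full-sample estimate, and the pilot values are not recomputed for each left-out point. The indicator weight, the integration grid and the candidate grid for $h$ are also illustrative:

```python
import numpy as np

def gauss(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def vb_kde(x, sample, h, pilot_at_data):
    """Variable bandwidth estimate using precomputed pilot values at the data points."""
    w = np.sqrt(pilot_at_data)
    u = (np.atleast_1d(x)[:, None] - sample[None, :]) * w[None, :] / h
    return (w[None, :] * gauss(u)).sum(axis=1) / (len(sample) * h)

def wise_cv(sample, h, h1, grid, weight):
    """Approximate criterion: int f_h(x)^2 w(x) dx - (2/n) sum_i f_{-i}(X_i; h) w(X_i)."""
    n = len(sample)
    pilot = gauss((sample[:, None] - sample[None, :]) / h1).sum(axis=1) / (n * h1)
    pilot = np.maximum(pilot, 1e-12)                     # keep the pilot away from zero
    fh = vb_kde(grid, sample, h, pilot)
    first = np.sum(fh**2 * weight(grid)) * (grid[1] - grid[0])   # Riemann sum, uniform grid
    loo = np.empty(n)
    for i in range(n):                                   # leave-one-out evaluations at X_i
        keep = np.arange(n) != i
        loo[i] = vb_kde(sample[i], sample[keep], h, pilot[keep])[0]
    return first - 2.0 * np.mean(loo * weight(sample))

# Minimize the criterion over a candidate grid of h values (illustrative setup).
rng = np.random.default_rng(2)
sample = rng.pareto(1.0, size=100)
cut = np.quantile(sample, 0.95)                          # weight restricts to the body
weight = lambda x: (np.asarray(x) <= cut).astype(float)
grid = np.linspace(0.0, cut, 400)
h1 = 1.06 * sample.std() * len(sample) ** (-1 / 5)
candidates = np.linspace(0.05, 2.0, 40)
h_cv = candidates[int(np.argmin([wise_cv(sample, h, h1, grid, weight) for h in candidates]))]
```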

19 Discrepancy method for a variable bandwidth kernel estimator. Let $h$ be a solution of the discrepancy equation
$$\sup_{-\infty < x < \infty}\big|F_n(x) - \hat F^A_{h,h_1}(x)\big| = \delta\, n^{-1/2}, \qquad (1)$$
where $F_n(x)$ is the empirical DF, $\hat F^A_{h,h_1}(x) = \int_{-\infty}^{x} \hat f_A(t \mid h_1, h)\,dt$, $\hat f_A(t \mid h_1, h)$ is a variable bandwidth kernel estimator, and $\delta$ is a quantile of the Kolmogorov-Smirnov statistic $\sqrt{n}\,D_n = \sqrt{n}\,\sup_{-\infty < x < \infty}|F_n(x) - F(x)|$.

20 Discrepancy method for a variable bandwidth kernel estimator. The bias rate. Let $h$ be a solution of the discrepancy equation. It is possible to prove the following. Assuming $h_1 = cn^{-1/5}$, we have
$$P\{h > n^{-1/9}\} < \exp\!\big(-2n^{1 - 2/\alpha}\big) \quad \text{for } \alpha > 2,$$
$$P\big\{\big|E\hat f_A(x \mid h_1, h) - f(x)\big| > \psi(x)\,n^{-4/9}\big\} < 2\exp\!\big(-2n^{1/9}\big),$$
where $\psi(x)$ is a function that does not depend on $n$.

21 Discrepancy method for a variable bandwidth kernel estimator. Practical version:
$$\sqrt{n}\,\max\big(\hat D_n^+, \hat D_n^-\big) = 0.5,$$
where
$$\hat D_n^+ = \sqrt{n}\,\max_{1 \le i \le n}\left(\frac{i}{n} - \hat F^A_{h,h_1}(X_{(i)})\right), \qquad \hat D_n^- = \sqrt{n}\,\max_{1 \le i \le n}\left(\hat F^A_{h,h_1}(X_{(i)}) - \frac{i-1}{n}\right),$$
and $X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$ are the order statistics.
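A sketch of the practical discrepancy rule, reusing `gauss` and `vb_kde` from the cross-validation sketch above; the CDF $\hat F^A_{h,h_1}$ is obtained by crude numerical integration and $h$ is chosen on a candidate grid so that $\sqrt{n}\,\max(\hat D_n^+, \hat D_n^-)$ is closest to 0.5 (the grid search and the integration scheme are implementation assumptions):

```python
import numpy as np

def discrepancy_statistic(sample, h, h1, grid):
    """sqrt(n) * max(D+, D-) for the variable bandwidth estimate F^A_{h,h1}.
    The uniform grid should cover the range of the data."""
    n = len(sample)
    x_sorted = np.sort(sample)
    pilot = gauss((sample[:, None] - sample[None, :]) / h1).sum(axis=1) / (n * h1)
    pilot = np.maximum(pilot, 1e-12)
    fh = vb_kde(grid, sample, h, pilot)                  # density on the integration grid
    Fh = np.cumsum(fh) * (grid[1] - grid[0])             # crude CDF by Riemann summation
    F_at_order = np.interp(x_sorted, grid, Fh)           # F^A at the order statistics
    i = np.arange(1, n + 1)
    d_plus = np.max(i / n - F_at_order)
    d_minus = np.max(F_at_order - (i - 1) / n)
    return np.sqrt(n) * max(d_plus, d_minus)

def discrepancy_bandwidth(sample, h1, grid, candidates):
    """Pick the candidate h whose discrepancy statistic is closest to 0.5."""
    stats = np.array([discrepancy_statistic(sample, h, h1, grid) for h in candidates])
    return candidates[int(np.argmin(np.abs(stats - 0.5)))]
```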

22 Approach with transformations.
$$X_1, \ldots, X_n \xrightarrow{\;T\;} Y_1, \ldots, Y_n, \qquad Y_j = T(X_j),\ j = 1, \ldots, n.$$
Let $T(x)$ be a monotone increasing one-to-one transformation function ($T$ is continuous). The re-transformed estimate of the PDF of $X_i$ is
$$\hat f(x) = \hat g(T(x))\,T'(x),$$
where $g(x)$ is the PDF of the r.v. $Y_i$. The DF of the r.v. $Y_i$ is
$$G(x) = P\{Y_i \le x\} = P\{T(X_i) \le x\} = F(T^{-1}(x)).$$

23 Approach with transformations. Preliminary transformations to a new r.v. $Y_j = T(X_j)$, $j = 1, \ldots, n$. Fixed transformations: $\ln x$, $(2/\pi)\arctan x$. Features of fixed transformations. Advantage: they do not require any knowledge about the distribution of $X$. Disadvantage: they can lead to densities of the transformed r.v.s $Y_j$ with discontinuities, which are difficult to estimate.
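A minimal sketch of the transform-re-transform scheme of the previous slide, illustrated with the fixed transformation $(2/\pi)\arctan x$; the kernel estimate of $g$ and its rule-of-thumb bandwidth are illustrative assumptions:

```python
import numpy as np

def retransformed_density(x, sample, T, T_prime, g_hat):
    """Re-transformed estimate f_hat(x) = g_hat(T(x)) * T'(x) for a monotone increasing T."""
    y_sample = T(sample)                                 # transformed data Y_j = T(X_j)
    return g_hat(T(x), y_sample) * T_prime(x)

# Fixed transformation T(x) = (2/pi) * arctan(x), mapping (0, inf) to (0, 1).
T = lambda x: 2.0 / np.pi * np.arctan(x)
T_prime = lambda x: 2.0 / np.pi / (1.0 + x**2)

def g_hat(y, y_sample, h=None):
    """Standard Gaussian-kernel estimate of the density g of the transformed data."""
    h = h or 1.06 * y_sample.std() * len(y_sample) ** (-1 / 5)
    u = (np.atleast_1d(y)[:, None] - y_sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(y_sample) * h * np.sqrt(2 * np.pi))

# Usage: f_hat = retransformed_density(np.linspace(0.1, 20, 200), sample, T, T_prime, g_hat)
```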

24 Approach with transformations. First adapted transformations to a new r.v. $Y_j = T(X_j)$, $j = 1, \ldots, n$ (Wand et al., 1991):
$$T(x) = \begin{cases} x^{\lambda}\,\mathrm{sign}(\lambda), & \lambda \ne 0,\\ \ln x, & \lambda = 0,\end{cases} \qquad \lambda = \arg\min \int_{\mathbb{R}} \big(g''(y)\big)^2\,dy,$$
where $g$ is the PDF of the transformed r.v. $Y_1 = T_\lambda(X_1)$.

25 Approach with transformations. First adapted transformations to a new r.v. $Y_j = T(X_j)$, $j = 1, \ldots, n$: $T: \mathbb{R}^+ \to [0, 1]$, where $F(x)$ is some parametric model (Devroye and Györfi, 1985). The transformation to an isosceles triangular PDF $\varphi_{tri}(x)$ on $[0, 1]$,
$$T(x) = \begin{cases} \sqrt{F(x)/2}, & F(x) \le 0.5,\\ 1 - \sqrt{(1 - F(x))/2}, & F(x) > 0.5,\end{cases}$$
for kernel estimates with compact kernels, and the transformation $T(x) = F(x)$ to a uniform PDF $\varphi_{uni}(x)$ for a histogram, provide the minimal convergence rate in $L_1$:
$$\min_{g} E\int_0^1 |\hat g_0(x) - g(x)|\,dx.$$

26 Approach with transformations. Problems of the transform-re-transform scheme: The DF $F(x)$ is unknown, so it is impossible to transform to the exact desirable PDF. Selection of a parametric or non-parametric family of distributions as guess DFs. Selection of a target PDF that provides stability of the re-transformed estimates against minor perturbations in the tail index estimates. Selection of the PDF estimate so as to keep the tail decay rate (of the true PDF) after the inverse transformation.

27 Approach with transformations. Adaptive transformation (Maiboroda & Markovich, 2004):
$$T_{\hat\gamma}(x) = \Phi^{-1}(\Psi_{\hat\gamma}(x)) = 1 - (1 + \hat\gamma x)^{-1/(2\hat\gamma)},$$
where the guess DF $F$ of $X_i$ is assumed to be the Generalized Pareto distribution
$$\Psi_{\hat\gamma}(x) = \begin{cases} 1 - (1 + \hat\gamma x)^{-1/\hat\gamma}, & x \ge 0,\\ 0, & x < 0,\end{cases}$$
and the target DF $G$ of $Y_i$ is
$$\Phi(x) = (2x - x^2)\,\mathbf{1}\{x \in [0, 1]\} + \mathbf{1}\{x > 1\}.$$

28 Approach with transformations. Adaptive transformation. The transformation provides a PDF $g(x)$ on $[0, 1]$ that is continuous in the neighborhood of 1 for typical distributions (with regularly varying tails, lognormal-type tails and Weibull-like tails), for a consistent estimate $\hat\gamma$ of $\gamma$. Estimators $\hat g(x)$: a polygram, or a kernel estimate $\frac{1}{nh}\sum_{i=1}^{n} K\big((x - x_i)/h\big)$. The choice of the Generalized Pareto distribution is widespread and motivated by Pickands' theorem, which states that, for a certain class of distributions and for a sufficiently high threshold $u$ of the r.v. $X$, the conditional distribution of the overshoot $Y = X - u$, given that $X$ exceeds $u$, converges to a Generalized Pareto distribution.

29 Adaptive transformation approach. Estimation algorithm: 1 The tail index of $X_j$ is estimated from the sample $\{X_1, \ldots, X_n\}$ by the Hill estimate
$$\hat\gamma_k = \frac{1}{k}\sum_{i=1}^{k} \log X_{(n-i+1)} - \log X_{(n-k)},$$
where $X_{(1)} \le \ldots \le X_{(n)}$ are the order statistics of the sample. 2 The transformation $T = T_{\hat\gamma_k}$ is constructed as follows: if $\xi$ has the guess DF $\Psi_{\hat\gamma_k}$, then $T_{\hat\gamma_k}(\xi)$ has the target DF $\Phi$ (e.g. a triangular one). (Here $\hat\gamma_k$ is considered as a fixed value.) 3 The transformed sample $Y_j = T_{\hat\gamma_k}(X_j)$, $j = 1, \ldots, n$, is constructed. 4 The PDF of $Y_1, \ldots, Y_n$ is estimated by some estimate $\hat g_h(x)$. 5 The PDF of $X_j$ is estimated by $\hat f_h(x) = \hat g_h(T(x))\,T'(x)$.
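A sketch of the five-step algorithm with the Hill estimate, the adaptive transformation $T_{\hat\gamma}$ and a standard Gaussian-kernel estimate of $g$ on $[0, 1)$; the choice of $k$, the kernel and the bandwidth rule are assumptions, and no boundary correction is applied here:

```python
import numpy as np

def hill_estimate(sample, k):
    """Step 1: Hill estimator gamma_k = (1/k) sum_{i=1}^k log X_(n-i+1) - log X_(n-k)."""
    x = np.sort(sample)
    return np.mean(np.log(x[-k:])) - np.log(x[-k - 1])

def adaptive_transform(x, gamma):
    """Step 2: T_gamma(x) = 1 - (1 + gamma*x)^(-1/(2*gamma)) and its derivative."""
    t = 1.0 - (1.0 + gamma * x) ** (-1.0 / (2.0 * gamma))
    t_prime = 0.5 * (1.0 + gamma * x) ** (-1.0 / (2.0 * gamma) - 1.0)
    return t, t_prime

def retransformed_kde(x, sample, k):
    """Steps 1-5: transform the data, estimate g with a Gaussian kernel, re-transform."""
    gamma = hill_estimate(sample, k)
    y, _ = adaptive_transform(sample, gamma)             # transformed sample Y_j in [0, 1)
    t, t_prime = adaptive_transform(np.atleast_1d(x), gamma)
    h = 1.06 * y.std() * len(y) ** (-1 / 5)              # rule-of-thumb bandwidth for g
    u = (t[:, None] - y[None, :]) / h
    g_hat = np.exp(-0.5 * u**2).sum(axis=1) / (len(y) * h * np.sqrt(2 * np.pi))
    return g_hat * t_prime                               # f_hat(x) = g_hat(T(x)) * T'(x)

# Example: Pareto-type sample; k is fixed here, whereas in the slides it is selected by bootstrap.
rng = np.random.default_rng(1)
data = rng.pareto(1.0, size=100)
f_hat = retransformed_kde(np.linspace(0.01, 20.0, 200), data, k=20)
```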

30 The Pareto PDF with $\gamma = 1$. Sample size $n = 100$. The PDF of the Gaussian distribution $N(0, \sigma)$ is used as the pilot $\hat f(x)$ in $\tilde f_A(x)$. The Gaussian kernel is used in the re-transformed kernel estimator. $h_1 = \sigma(Y)\,n^{-1/5} = 0.099$, $h_2 = 1.06\,\sigma(X)\,n^{-1/5} = 9.453$, where $\sigma(X)$ and $\sigma(Y)$ are the standard deviations of the samples $X^n = \{X_1, \ldots, X_n\}$ and $Y^n = T_{\hat\gamma}(X^n)$.

31 The Pareto PDF with $\gamma = 1$. Sample size $n = 100$. The Gaussian kernel is used in the re-transformed kernel estimator and for the pilot $\hat f(x)$ in $\tilde f_A(x)$. $T_{\hat\gamma}(x) = 1 - (1 + \hat\gamma x)^{-1/(2\hat\gamma)}$ is the adapted transformation. $\hat\gamma$ is the Hill estimator; $k$ is estimated by the bootstrap.

32 The Weibull PDF with $\gamma = 0.5$. Sample size $n = 100$. The Gaussian kernel is used for the re-transformed kernel estimator and for the pilot $\hat f(x)$ in $\tilde f_A(x)$. $h_1 = \sigma(Y)\,n^{-1/5} = 0.102$, $h_2 = 1.06\,\sigma(X)\,n^{-1/5} = 3.673$, $T_{\hat\gamma}(x) = 1 - (1 + \hat\gamma x)^{-1/(2\hat\gamma)}$.

33 The accuracy of the re-transformed estimators. Mean integrated squared error (MISE):
$$\mathrm{MISE}_h(\hat\gamma, \Omega) = E\int_{\Omega}\big(\hat f(x) - f(x)\big)^2\,dx = E\int_{\Omega}\big(\hat g_h(T_{\hat\gamma}(x)) - g(T_{\hat\gamma}(x))\big)^2\,T'_{\hat\gamma}(x)\,dT_{\hat\gamma}(x) = E\int_{\tilde\Omega}\big(\hat g_h(y) - g(y)\big)^2\,T'_{\hat\gamma}\big(T_{\hat\gamma}^{-1}(y)\big)\,dy,$$
where $\tilde\Omega = T_{\hat\gamma}(\Omega)$. For fixed transformations and non-random intervals $\tilde\Omega$:
$$\mathrm{MISE}_h(\Omega) = \int_{\tilde\Omega} T'\big(T^{-1}(y)\big)\,E\big(\hat g_h(y) - g(y)\big)^2\,dy.$$

34 The MISE of re-transformed kernel estimators. Mean integrated squared error (MISE). If $0 < T'(T^{-1}(x)) \le c$ holds on $\tilde\Omega$ for the transformation $T$ (not necessarily fixed), then we have
$$\mathrm{MISE}_h(\Omega) \le c\int_{\tilde\Omega} E\big(\hat g_h(y) - g(y)\big)^2\,dy$$
for a non-random $\tilde\Omega$. MSE of kernel estimates: $\mathrm{MSE}(\hat g_h) \sim n^{-4/5}$ if a non-variable bandwidth kernel estimator is used as $\hat g_h(y)$, $h \sim n^{-1/5}$ and $g^{(2)}$ is continuous; $\mathrm{MSE}(\hat g_h) \sim n^{-8/9}$ if a variable bandwidth kernel estimator is used as $\hat g_h(y)$, $h \sim n^{-1/9}$ and $g^{(4)}$ is continuous.

35 The rate of decay of re-transformed estimators at infinity depending on $\gamma$. Boundary kernels. The bias of the estimate at the boundary.

36 Boundary kernels. Example. Let the PDF be
$$f(x) = \begin{cases} \ell(x)\,(1 + \gamma x)^{-(1/\gamma + 1)}, & x \ge 0,\\ 0, & x < 0.\end{cases}$$
The re-transformed estimate is
$$\hat f(x) = \hat g_h(T_{\hat\gamma}(x))\,T'_{\hat\gamma}(x) = 0.5\,\hat g_h(T_{\hat\gamma}(x))\,(1 + \hat\gamma x)^{-1/(2\hat\gamma) - 1},$$
where the transformation is $T_{\hat\gamma}(x) = 1 - (1 + \hat\gamma x)^{-1/(2\hat\gamma)}$. A smoothed polygram gives $\hat g_n(x) = C_n(1 - x)$, $x \le 1$, and hence $\hat f_n(x) \approx 0.5\,C_n\,(1 + \hat\gamma x)^{-(1/\hat\gamma + 1)}$. A kernel estimator gives $\hat f_h(x) \approx 0.5\,\hat g_h(1)\,(1 + \hat\gamma x)^{-(1/(2\hat\gamma) + 1)}$, i.e. the EVI is two times larger than needed.

37 Boundary kernels. Example. Principles of the selection of boundary kernels. The kernel coincides with the target PDF: $K(y) = g(y)$, $y \in [Y_{(n)}, 1]$. Direct fitting at the boundary, $h = 1 - Y_{(n)}$:
$$\frac{1}{h}\,K\!\left(\frac{T(x) - Y_{(n)}}{h}\right) T'(x) = \hat f(x),$$
because
$$\hat g_h(y) \approx \frac{1}{h}\,K\!\left(\frac{y - Y_{(n)}}{h}\right), \quad y \in (Y_{(n)}, 1].$$

38 Reduction of the boundary bias (Simonoff, 1996). Let a new kernel, independent of the PDF, be
$$B(x) = \frac{\big(a_2(p) - a_1(p)\,x\big)\,K(x)}{a_0(p)\,a_2(p) - a_1^2(p)}, \qquad a_l(p) = \int_{-1}^{p} u^l\,K(u)\,du, \quad 0 < p < 1.$$
The bias of the kernel estimator with such a kernel in the boundary region is $O(h^2)$ and the variance is $O((nh)^{-1})$ (the same as in the interior) when the second derivative of the underlying density is continuous.
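A sketch of the boundary kernel $B(x)$ with the moments $a_l(p)$ computed numerically, so that $K$ can stay generic; the Epanechnikov kernel in the example is an illustrative choice:

```python
import numpy as np

def boundary_kernel(x, p, K, num=2001):
    """Boundary kernel B(x) = (a2(p) - a1(p)*x) K(x) / (a0(p)*a2(p) - a1(p)^2),
    with a_l(p) = int_{-1}^{p} u^l K(u) du approximated by a Riemann sum."""
    u = np.linspace(-1.0, p, num)
    du = u[1] - u[0]
    a = [np.sum(u**l * K(u)) * du for l in range(3)]     # a_0(p), a_1(p), a_2(p)
    return (a[2] - a[1] * np.asarray(x)) * K(np.asarray(x)) / (a[0] * a[2] - a[1] ** 2)

# Example with the Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1].
epanechnikov = lambda u: 0.75 * np.clip(1.0 - np.asarray(u)**2, 0.0, None)
b_values = boundary_kernel(np.linspace(-1.0, 0.3, 5), p=0.3, K=epanechnikov)
```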

39 Overcoming boundary effects. Combination of the two approaches: usage of $B(x)$ and $K(y) = g(y)$. For the adapted transformation $T_{\hat\gamma}(x) = 1 - (1 + \hat\gamma x)^{-1/(2\hat\gamma)}$ and $h = 1 - Y_{(n)}$ we have
$$\hat g_h(T_{\hat\gamma}(x)) = \frac{1}{h}\,B\!\left(\frac{T_{\hat\gamma}(x) - Y_{(n)}}{h}\right) = \frac{1}{h}\cdot\frac{\Big(a_2(p) - a_1(p)\,\frac{T_{\hat\gamma}(x) - Y_{(n)}}{h}\Big)\,K\!\left(\frac{T_{\hat\gamma}(x) - Y_{(n)}}{h}\right)}{a_0(p)\,a_2(p) - a_1^2(p)},$$
since $\hat g_h(y) \approx \frac{1}{h}\,K\!\left(\frac{y - Y_{(n)}}{h}\right)$, $y \in (Y_{(n)}, 1]$.

40 Re-transformed kernel estimators applied to Web data. PDF estimation of the sizes of sub-sessions (left) and of inter-response times (right). $K(x)$ is Epanechnikov's kernel. $h = \sigma n^{-1/5}$, $h_1 = 1.01 - T_{\hat\gamma}(X_{(n)})$, $h < h_1$, where $\sigma$ is the standard deviation of the transformed data.

41 Comparison of the re-transformed kernel estimate and the variable bandwidth kernel estimate. Re-transformed standard kernel estimate and variable bandwidth kernel estimate with Epanechnikov's kernel for the Pareto distribution: body (left) and tail (right). $h$ is selected by the discrepancy method.

42 Comparison of the re-transformed kernel estimate and the variable bandwidth kernel estimate. Conclusions: a pure variable bandwidth kernel estimator does not fit the density at infinity, at least with compactly supported kernels, in contrast to a variable bandwidth kernel estimator that uses a transformation of the data.

43 Papers: Markovitch, N.M., Krieger, U.R. (2000) Nonparametric estimation of long-tailed density functions and its application to the analysis of World Wide Web traffic. Performance Evaluation, 42(2-3). Markovitch, N.M., Krieger, U.R. (2002) The estimation of heavy-tailed probability density functions, their mixtures and quantiles. Computer Networks, 40(3). Maiboroda, R.E., Markovich, N.M. (2004) Estimation of heavy-tailed probability density function with application to Web data. Computational Statistics, 4. Barron, A.R., Sheu, C.-H. (1991) Approximation of density functions by sequences of exponential families. Annals of Statistics, 19(3).

44 Papers: Barron, A.R., Györfi, L., van der Meulen, E. (1992) Distribution estimation consistent in total variation and in two types of information divergence. IEEE Transactions on Information Theory, 38. Silverman, B.W. (1986) Density Estimation for Statistics and Data Analysis. New York: Chapman & Hall. Hall, P. (1992) On global properties of variable bandwidth density estimators. Annals of Statistics, 20(2).
