Endpoint estimation for observations with normal measurement errors
Xuan Leng (Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands)
Liang Peng (Department of Risk Management and Insurance, Robinson College of Business, Georgia State University, Atlanta, GA 30303, USA)
Xing Wang (Department of Risk Management and Insurance, Robinson College of Business, Georgia State University, Atlanta, GA 30303, USA)
Chen Zhou (Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands)

January 2017

Abstract
This paper investigates the estimation of the finite endpoint of a distribution function when the observations are contaminated by normally distributed measurement errors. Under the framework of Extreme Value Theory, we propose a class of estimators for the standard deviation of the measurement errors as well as for the endpoint. Asymptotic theories for the proposed estimators are established, while their finite sample performance is demonstrated by simulations. In addition, we apply the proposed methods to the outdoor long jump data to estimate the ultimate limit for human beings in the long jump.

Keywords: Convolution, extreme value theory, ultimate world record, Weibull domain of attraction.

1 Introduction
For a continuous distribution function $F$, the right endpoint of the distribution is defined as $\theta = \sup\{x : F(x) < 1\}$. Estimating the endpoint $\theta$ has been applied in various contexts when $\theta < +\infty$. For example, Aarssen and de Haan (1994) estimated the endpoint of the distribution
of the life span of human beings, namely, the maximum life span. In productivity analysis, estimating the production frontier can be viewed as estimating the endpoint of the distribution of outputs conditional on the inputs; see e.g. Cazals et al. (2002). Another notable application, in Einmahl and Magnus (2008), considers estimating the ultimate world record, in other words, the limit of human beings, in a specific sport. For general statistical methods designed for estimating the endpoint, see, e.g., Hall (1982), Athreya and Fukuchi (1997), Hall and Wang (1999, 2005), and Girard et al. (2012), among others.

Since the endpoint is only related to the right tail region of the distribution, it is natural to consider Extreme Value Theory (EVT) in endpoint estimation. EVT models the tail region of a distribution function. The endpoint is finite if the distribution function belongs to the domain of attraction of the Weibull distribution. By regarding the endpoint as a high quantile with a probability level tending to one, the limit of the high quantile estimator can be considered as an endpoint estimator. Therefore, one may estimate the endpoint using the estimators of the extreme value index, scale and shift; see e.g. de Haan and Ferreira (2006, Chapter 4.5).

In reality, data are often contaminated by measurement errors. In other words, instead of observing an independent and identically distributed (i.i.d.) sample drawn from the underlying distribution with a finite endpoint, we observe data as the convolution of the initial random variable and an independent measurement error term. Mathematically, suppose $X_1, \ldots, X_n$ are random variables with a continuous distribution function $F_X$ with a finite endpoint $\theta$. Instead of observing $\{X_i\}_{i=1}^n$, we observe $Y_i = X_i + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $\{\varepsilon_i\}$ are i.i.d. random errors with mean zero, and $\{\varepsilon_i\}$ are independent of $\{X_i\}$. The goal in this paper is to estimate the endpoint $\theta$ based on the observations $\{Y_i\}_{i=1}^n$ when $\varepsilon$ follows a normal distribution.
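To make the contamination model concrete, the following minimal Python sketch simulates $Y_i = X_i + \varepsilon_i$ and compares the sample maxima of $X$ and $Y$. The choice of $F_X$ (uniform on $[-1, 0]$, so that $\theta = 0$), the values $n = 5000$ and $\sigma = 0.1$, and the function name `simulate_contaminated` are illustrative assumptions, not taken from the text above.

```python
import numpy as np


def simulate_contaminated(n, sigma, rng):
    """Draw (X, Y) with Y_i = X_i + eps_i.

    Assumption for illustration: X_i ~ Uniform[-1, 0], so the endpoint is
    theta = 0, and eps_i ~ N(0, sigma^2) independent of X_i.
    """
    x = rng.uniform(-1.0, 0.0, size=n)    # latent variable with finite endpoint 0
    eps = rng.normal(0.0, sigma, size=n)  # normal measurement error
    return x, x + eps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = simulate_contaminated(5000, 0.1, rng)
    # The maximum of X stays below theta = 0, while the maximum of Y
    # overshoots it because of the additive normal error.
    print("max X:", x.max())
    print("max Y:", y.max())
```

The sample maximum of the contaminated data therefore cannot be used directly as an endpoint estimator; some correction for the error scale $\sigma$ is needed.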
Notice that in this case, the distribution function of $Y$ has no finite endpoint.

In a broader context, extracting the distribution of $X$ based on the observations $\{Y_i\}_{i=1}^n$ is related to the so-called deconvolution problem. In the literature on nonparametric deconvolution, kernel estimators for the density function of $\{X_i\}$ were proposed; see, e.g., Carroll and Hall (1988), Stefanski and Carroll (1990), Meister (2006) and Meister and Neumann (2010). Based on estimating the density function, Hall and Simar (2002) proposed an estimator of $\theta$
assuming that the density $f_X$ is approximated by a flat function in the neighborhood of $\theta$ and that the variance of the measurement error, $\sigma_n^2 = \mathrm{var}(\varepsilon_i)$, depends on the sample size $n$ and shrinks to zero as $n \to \infty$. By contrast, Kneip et al. (2015) did not require that $\sigma$ tends to zero as the sample size increases and dealt with a constant $\sigma$. They proposed a joint estimation of $\theta$ and $\sigma$ when the measurement errors $\{\varepsilon_i\}$ follow a normal distribution $N(0, \sigma^2)$. Nevertheless, both of these approaches require that $f_X(\theta) > 0$.

Our approach is close to that in Goldenshluger and Tsybakov (2004). They modeled $f_X$ near $\theta$ by a power function with index $\alpha$. In other words, $f_X(\theta) = 0$. The Goldenshluger and Tsybakov (2004) approach requires that the variance $\sigma^2$ and the index $\alpha$ are known. More specifically, by deriving that
$$\frac{\sqrt{2\log n}}{\sigma}\left(\max_{1\le i\le n} Y_i - \theta - b_n\right) \xrightarrow{d} \Lambda, \quad \text{as } n \to \infty,$$
where b n = σ 2 log n [ ] σα log n 2 log log n + log 2 + log σ, and $\Lambda$ is the Gumbel distribution, Goldenshluger and Tsybakov (2004) suggested the estimator $\hat\theta = \max_{1\le i\le n} Y_i - b_n$, which has a speed of convergence of $\sqrt{2\log n}$.

Motivated by the estimator in Goldenshluger and Tsybakov (2004), we aim at estimating the endpoint when $\sigma$ is unknown, with a broader class of $F_X$. Under an EVT model for $F_X$, we show that $\sigma$ can be estimated by using the top $k$ order statistics of $\{Y_i\}$, where $k$ is an intermediate sequence such that $k \to \infty$ and $k/n \to 0$ as $n \to \infty$. With some proper conditions on the intermediate sequence $k$, the estimator for $\sigma$ possesses asymptotic normality with a speed of convergence $\sqrt{k}$. In addition, the conditions on $k$ ensure that, apart from the main term $\sigma\sqrt{2\log n}$, the other minor terms in $b_n$ play no role asymptotically. Consequently, we can directly estimate the endpoint by $\hat\theta = \max_{1\le i\le n} Y_i - \hat\sigma\sqrt{2\log n}$. The estimator inherits the asymptotic normality of $\hat\sigma$, however, with a compromised speed of convergence.

Compared to the existing literature, our approach has two main advantages.
Firstly, we do not require that $F_X$ has a density, but only assume a second order expansion of the survival function $\bar F_X(x) = 1 - F_X(x)$ in the neighborhood of $\theta$. Therefore, it is an EVT approach. Compared to Kneip et al. (2015), our model assumption allows for a broader class of $F_X$, which includes the model in Goldenshluger and Tsybakov (2004) as a special case. Secondly, we
do not require any additional information such as the variance of the measurement errors, $\sigma^2$. Therefore, our approach is closer to the real situation encountered in applications.

Our estimation procedure and its asymptotic properties are discussed in Section 2. We conduct a simulation study in Section 3 and then apply our method to sports data in Section 4. Section 5 concludes. The proofs are postponed to Section 6.

2 Methodology and asymptotic results
Recall that $X_1, \ldots, X_n$ are i.i.d. random variables with a continuous distribution function $F_X$ that has a finite right endpoint $\theta := \sup\{x : F_X(x) < 1\}$. Further, in a neighborhood of $\theta$, we assume the following second order expansion of the survival function $\bar F_X(x) = 1 - F_X(x)$: as $u \downarrow 0$,
$$\bar F_X(\theta - u) = L u^{\alpha} + d\, u^{\alpha+\beta} + o\left(u^{\alpha+\beta}\right), \qquad (1)$$
where $L$, $\alpha$ and $\beta$ are positive constants, and $d$ is a real constant. Under condition (1), $F_X$ belongs to the maximum domain of attraction of the Weibull distribution with an extreme value index $\gamma = -1/\alpha$.

Suppose we do not observe $\{X_i\}$ directly. Instead, we observe $Y_i = X_i + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $\{\varepsilon_i\}$ are i.i.d. normally distributed random errors with mean zero and unknown variance $\sigma^2$, and $\{\varepsilon_i\}$ are independent of $\{X_i\}$. The question is how to estimate $\theta$ based on the observed $\{Y_i\}$.

By showing that, as $n \to \infty$,
$$\frac{\sqrt{\log n}}{\log\log n}\left(\max_{1\le i\le n} Y_i - \sigma\sqrt{2\log n} - \theta\right) = O_p(1),$$
Goldenshluger and Tsybakov (2004) proposed to estimate $\theta$ by $\max_{1\le i\le n} Y_i - \sigma\sqrt{2\log n}$, where $\sigma$ is assumed to be known. Since $\sigma$ is often unknown in reality, we propose an estimator for $\sigma$ first, and consequently an estimator for $\theta$.

Let $Z_i = Y_i - \theta$, and denote the distribution function and survival function of $Z_i$ as $F_Z$ and $\bar F_Z = 1 - F_Z$, respectively. We first show that $\bar F_Z$ has a second order expansion, as in the following proposition.
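Proposition 1. Suppose a random variable $X$ has a distribution function $F_X$ with a finite right endpoint $\theta = \sup\{x : F_X(x) < 1\} < \infty$. Assume that $F_X$ satisfies condition (1). Let $\varepsilon$ be a normally distributed random error with mean zero and an unknown variance $\sigma^2$, independent of $X$. Denote $Z = X + \varepsilon - \theta$. Then, as $t \to \infty$,
$$-\log \bar F_Z(t) = \frac{t^2}{2\sigma^2} + (\alpha+1)\log t - \log c_1 - \frac{c_2}{c_1}\, t^{-\tilde\beta} + o\left(t^{-\tilde\beta}\right),$$
where $\tilde\beta = \min(\beta, 2)$,
$$c_1 = \frac{\Gamma(\alpha+1)\, L\, \sigma^{2\alpha+1}}{\sqrt{2\pi}},$$
and
$$c_2 = \frac{1}{\sqrt{2\pi}}\,\Gamma(\alpha+\beta+1)\, d\, \sigma^{2(\alpha+\beta)+1}\, 1_{\{\tilde\beta = \beta\}} - \frac{1}{\sqrt{2\pi}}\left(\frac{\alpha}{2}+1\right)\Gamma(\alpha+2)\, L\, \sigma^{2\alpha+3}\, 1_{\{\tilde\beta = 2\}}.$$

The tail property of $F_Z$ motivates the following estimator of $\sigma$. Denote $Y_{1,n} \le Y_{2,n} \le \cdots \le Y_{n,n}$ as the order statistics of $Y_1, \ldots, Y_n$. Then the order statistics of the unobserved random variables $\{Z_i\}_{i=1}^n$ are $Z_{n-i,n} = Y_{n-i,n} - \theta$. Consider an intermediate sequence $k = k(n)$ such that $k \to \infty$ and $k/n \to 0$ as $n \to \infty$. Since $\bar F_Z(Z_{n-i,n}) \approx i/n$, we have approximately from Proposition 1 that
$$\sqrt{-\log(i/n)} \approx \frac{Y_{n-i,n} - \theta}{\sqrt{2}\,\sigma}, \quad \text{for } i = 1, 2, \ldots, k.$$
By taking the difference between $Y_{n-i,n}$ and $Y_{n-k,n}$, we get that
$$\frac{Y_{n-i,n} - Y_{n-k,n}}{\sqrt{2}\,\sigma} \approx \sqrt{-\log(i/n)} - \sqrt{-\log(k/n)} = \frac{\log(k/i)}{\sqrt{\log(n/i)} + \sqrt{\log(n/k)}} \approx \frac{\log(k/i)}{2\sqrt{\log(n/k)}}.$$
We use the equation above and the idea of the Generalized Method of Moments to estimate $\sigma$. Take a positive continuous function $g$ on $(0, 1]$ such that $\int_0^1 g(s)\log s\, ds = -2$. We construct a weighted sum of the differences $\{Y_{n-i,n} - Y_{n-k,n}\}$ using the weights $\{g(i/k)\}$ as
$$\frac{1}{k}\sum_{i=1}^{k} g(i/k)\left(Y_{n-i,n} - Y_{n-k,n}\right) \approx -\frac{\sqrt{2}\,\sigma}{2\sqrt{\log(n/k)}} \int_0^1 g(s)\log s\, ds = \frac{\sqrt{2}\,\sigma}{\sqrt{\log(n/k)}}.$$
Hence, we get an estimator of $\sigma$ as
$$\hat\sigma_g = \sqrt{\frac{\log(n/k)}{2}}\; \frac{1}{k}\sum_{i=1}^{k} g(i/k)\left(Y_{n-i,n} - Y_{n-k,n}\right). \qquad (2)$$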
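The estimator in (2) can be sketched in a few lines of Python. This is a minimal illustration under the reading of (2) given above (weights $g(i/k)$ applied to the spacings $Y_{n-i,n} - Y_{n-k,n}$, $i = 1, \ldots, k$); the function name `sigma_hat_g` and the default weight $g(s) = -\log s$ are choices made for this sketch only.

```python
import numpy as np


def sigma_hat_g(y, k, g=lambda s: -np.log(s)):
    """Sketch of the estimator (2) of the error standard deviation sigma.

    y : 1-d array of contaminated observations Y_1, ..., Y_n
    k : number of upper order statistics used, 1 <= k < n
    g : weight function on (0, 1]; the default -log(s) integrates
        against log(s) to -2 on (0, 1].
    """
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    i = np.arange(1, k + 1)
    upper = y[n - 1 - i]      # Y_{n-i,n} for i = 1..k (0-based index n-i-1)
    threshold = y[n - 1 - k]  # Y_{n-k,n}
    weighted_mean = np.mean(g(i / k) * (upper - threshold))
    return np.sqrt(np.log(n / k) / 2.0) * weighted_mean
```

Because (2) is a weighted sum of spacings, the sketch is invariant to shifts of the data and scales linearly with them, which gives a quick sanity check on any implementation.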
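Replacing $\sigma$ with $\hat\sigma_g$ in $Y_{n,n} - \sigma\sqrt{2\log n}$, an estimator of $\theta$ is then given as
$$\hat\theta_g = Y_{n,n} - \hat\sigma_g\sqrt{2\log n}. \qquad (3)$$

To obtain the asymptotic property of the estimator, we further assume that $\beta > 1$ and that the intermediate sequence $k$ satisfies the following condition: as $n \to \infty$,
$$k = k(n) \to \infty, \quad k/n \to 0, \quad \text{and} \quad \sqrt{k}\,\left(\log(n/k)\right)^{-\tilde\beta/2}\left(\log\log(n/k)\right)^{2\cdot 1_{\{\tilde\beta = 2\}}} = O(1). \qquad (4)$$
We first prove the asymptotic property for $\hat\sigma_g$. That of $\hat\theta_g$ will then follow as a direct corollary.

The asymptotic normality of $\hat\sigma_g$ requires the following conditions on $g(s)$, $s \in (0, 1]$. There exists $\epsilon > 0$ such that
$$\lim_{s \downarrow 0} g(s)\, s^{1/2-\epsilon} = 0, \quad \text{and} \quad \int_0^1 g(s)\log s\, ds = -2. \qquad (5)$$
In addition, we assume a joint condition on the intermediate sequence $k$ and the function $g$ as follows,
lim n sup s t /,/ s,t gs log s gt log t =. (6)
Examples such as $g(s) = -\log s$ and $g(s) = 2(\nu+1)^2 s^{\nu}$, $\nu > -1/2$, satisfy all conditions.

The following theorem gives the asymptotic normality of $\hat\sigma_g$.

Theorem 2. Suppose $X_1, \ldots, X_n$ are i.i.d. random variables with a continuous distribution function $F_X$. Assume that $F_X$ has a finite right endpoint $\theta = \sup\{x : F_X(x) < 1\}$ and satisfies condition (1) with $\beta > 1$. Suppose $Y_i = X_i + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $\{\varepsilon_i\}$ are i.i.d. normally distributed random errors with mean zero and an unknown variance $\sigma^2$, and $\{\varepsilon_i\}$ are independent of $\{X_i\}$. Assume that $g$, defined on $(0, 1]$, satisfies (5) and that $k := k(n)$ is an intermediate sequence satisfying (4) and (6). Then, as $n \to \infty$, the estimator $\hat\sigma_g$ defined in (2) has the following asymptotic property:
$$\sqrt{k}\left(\frac{\hat\sigma_g}{\sigma} - 1\right) \xrightarrow{d} N\left(0,\ \frac{1}{4}\int_0^1\!\!\int_0^1 \left(\frac{\min(s,t)}{st} - 1\right) g(s)g(t)\, ds\, dt\right).$$

Consequently, the estimator $\hat\theta_g$ defined in (3) possesses asymptotic normality as follows.

Corollary 1. Under the same conditions as in Theorem 2, as $n \to \infty$,
$$\frac{\sqrt{k}}{\sqrt{\log n}}\left(\hat\theta_g - \theta\right) \xrightarrow{d} N\left(0,\ \frac{\sigma^2}{2}\int_0^1\!\!\int_0^1 \left(\frac{\min(s,t)}{st} - 1\right) g(s)g(t)\, ds\, dt\right).$$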
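As an illustration of how (3) and Corollary 1 might be used together, the sketch below computes $\hat\theta_g$ for the weight $g(s) = -\log s$ and attaches an approximate confidence interval. For this weight the double integral in the asymptotic variance equals 5, so that $\sqrt{k/\log n}\,(\hat\theta_g - \theta)$ is approximately $N(0, 5\sigma^2/2)$. Plugging $\hat\sigma_g$ in for the unknown $\sigma$ in the standard error is an assumption of this sketch, and the function name `endpoint_estimate` is hypothetical.

```python
import numpy as np


def endpoint_estimate(y, k, z=1.96):
    """Sketch: endpoint estimator (3) with g(s) = -log(s) and a plug-in
    confidence interval based on the limit in Corollary 1.

    Returns (theta_hat, sigma_hat, (lower, upper)).
    """
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    i = np.arange(1, k + 1)
    spacings = y[n - 1 - i] - y[n - 1 - k]  # Y_{n-i,n} - Y_{n-k,n}
    sigma_hat = np.sqrt(np.log(n / k) / 2.0) * np.mean(-np.log(i / k) * spacings)
    theta_hat = y[-1] - sigma_hat * np.sqrt(2.0 * np.log(n))
    # For g(s) = -log(s) the variance integral equals 5, hence the
    # standard error sigma * sqrt(5 log n / (2k)); sigma is replaced by
    # its estimate (an assumption of this sketch).
    se = sigma_hat * np.sqrt(5.0 * np.log(n) / (2.0 * k))
    return theta_hat, sigma_hat, (theta_hat - z * se, theta_hat + z * se)
```

The interval is symmetric around $\hat\theta_g$ and widens with $\log n$, which reflects the compromised speed of convergence $\sqrt{k/\log n}$ of the endpoint estimator relative to the $\sqrt{k}$ rate of $\hat\sigma_g$.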
3 Simulations
In this section, we investigate, through simulations, the finite sample behavior of the suggested endpoint estimator. We generate observations $Y_i = X_i + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $\{X_i\}_{i=1}^n$ and $\{\varepsilon_i\}_{i=1}^n$ are two sets of i.i.d. random variables independently drawn from the following data generating processes. In all three cases, we set the true value of the endpoint for $X_i$ to $\theta = 0$, while setting the distribution of $\varepsilon_i$ to a normal distribution $N(0, \sigma^2)$ with two potential levels of $\sigma$.

(a) $X_i$ follows a uniform distribution on $[-1, 0]$ and $\sigma = 0.1$ or $0.2$. Notice that condition (1) holds for the uniform distribution with $\alpha = 1$ and $\beta = \infty$.

(b) $X_i$ follows a reversed Burr distribution with the following distribution function
$$F_X(x) = 1 - \left(1 + (-x)^{-2}\right)^{-10}, \quad x < 0,$$
and $\sigma = 2$ or $3$. Notice that condition (1) holds for the reversed Burr distribution with $\alpha = 20$ and $\beta = 2$.¹

(c) $X_i$ follows a shifted Beta distribution with the following probability density function
$$f_X(x) = 42\,(-x)(1+x)^5, \quad -1 < x < 0,$$
and $\sigma = 0.1$ or $0.2$. Notice that condition (1) holds for the shifted Beta distribution with $\alpha = 2$ and $\beta = 1$, which violates our required condition $\beta > 1$ in Theorem 2.

From each data generating process, we draw r = samples with a sample size n = 5. Then we estimate the endpoint for each sample using an estimator $\hat\theta_g$ with a specific choice, $g(s) = -\log s$ on $(0, 1]$. It is straightforward to check that $g$ satisfies condition (5). In addition, condition (6) degenerates for this $g$ function. Corollary 1 implies the asymptotic property of $\hat\theta_g$ as follows. As $n \to \infty$,
$$\frac{\sqrt{k}}{\sqrt{\log n}}\left(\hat\theta_g - \theta\right) \xrightarrow{d} N\left(0,\ \frac{5}{2}\sigma^2\right).$$

¹ The quantiles of the reversed Burr distribution are $q_{0.1} = -9.72$, $q_{0.25} = -5.85$, $q_{0.75} = -2.59$ and q
During the estimation procedure, we need to determine the value of $k$, i.e. the number of high upper order statistics used. For that purpose, we perform a pre-study as follows. By varying k from 5 to 5, we plot the average estimate² $\bar\theta_g := \frac{1}{r}\sum_{i=1}^{r}\hat\theta_{g,i}$, the standard deviation $\sqrt{\frac{1}{r}\sum_{i=1}^{r}\left(\hat\theta_{g,i} - \bar\theta_g\right)^2}$ and the root mean squared error (RMSE) $\sqrt{\frac{1}{r}\sum_{i=1}^{r}\hat\theta_{g,i}^2}$ for each data generating process. In Figure 1, we demonstrate the results for the two data generating processes in (b). We observe that the optimal k that minimizes the RMSE is achieved at about k =. We do choose k = throughout the simulation study, also for the other data generating processes in (a) and (c).³

Figure 1: Endpoint estimation for various $k$: reversed Burr distribution. [Two panels: reversed Burr, $\sigma = 2$ (left) and reversed Burr, $\sigma = 3$ (right). Vertical axis: estimation error. Curves: bias, standard deviation, RMSE.]
Note: The figure shows the bias, standard deviation and RMSE for the estimates $\hat\theta_g$ across samples with sample size n = 5. The observations are generated by combining the reversed Burr distribution in (b) with measurement errors following $N(0, \sigma^2)$, $\sigma = 2$ (left) or $\sigma = 3$ (right).

² Since the true endpoint equals 0, the average estimate can be read as the estimation bias.
³ The figures for the other data generating processes are available upon request.

We compare the performance of our suggested endpoint estimator with that of the probability weighted moment (PWM) estimator for the endpoint, ignoring the existence of measurement
errors. The PWM estimator is defined as
$$\hat\theta_{PWM} = Y_{n-k,n} - \frac{\hat a_{PWM}(n/k)}{\hat\gamma_{PWM}}, \qquad (7)$$
where
$$\hat\gamma_{PWM} := \frac{I_1 - 4I_2}{I_1 - 2I_2}, \quad \text{and} \quad \hat a_{PWM}(n/k) := \frac{2 I_1 I_2}{I_1 - 2I_2},$$
with the probability weighted moments given by
$$I_j = \frac{1}{k}\sum_{i=1}^{k}\left(\frac{i-1}{k}\right)^{j-1}\left(Y_{n-i+1,n} - Y_{n-k,n}\right), \quad j = 1, 2.$$
Note that $\hat\theta_{PWM}$ is an estimator of $\theta$ only if there is no measurement error in the observations, i.e. $\varepsilon_i = 0$, $i = 1, 2, \ldots, n$. To determine the optimal choice of $k$ used in the estimator $\hat\theta_{PWM}$, we also conduct a pre-study similar to the aforementioned procedure. We decide to choose k = 2 for all data generating processes when applying the PWM estimator. Notice that for the PWM estimator, we stop estimating the endpoint if $\hat\gamma_{PWM} > 0$ because the PWM estimator is valid only for $\gamma < 0$. In other words, from samples, we may end up with fewer than endpoint estimates when using the PWM estimator.

For each data generating process, we plot the estimated endpoints using the two estimators across all samples in boxplots; see Figure 2. We observe that $\hat\theta_{PWM}$ largely overestimates the true endpoint. In contrast, our estimator $\hat\theta_g$ performs well across all data generating processes, and consistently outperforms the PWM estimator. The medians across simulated samples are close to the true endpoint and the variations are lower.
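For comparison, the PWM endpoint estimator (7) can be sketched as follows. The sketch uses the standard probability weighted moment form, with weights $((i-1)/k)^{j-1}$ on the excesses $Y_{n-i+1,n} - Y_{n-k,n}$; the function name `pwm_endpoint` and the convention of returning `None` when $\hat\gamma_{PWM} \ge 0$ are choices made here for illustration.

```python
import numpy as np


def pwm_endpoint(y, k):
    """Sketch of the PWM estimators of the extreme value index, the scale
    and the endpoint, ignoring measurement errors.

    Returns (gamma_hat, a_hat, theta_hat); theta_hat is None when
    gamma_hat >= 0, since the endpoint estimator requires gamma < 0.
    """
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    i = np.arange(1, k + 1)
    excess = y[n - i] - y[n - 1 - k]    # Y_{n-i+1,n} - Y_{n-k,n}
    i1 = np.mean(excess)                # I_1: plain mean excess
    i2 = np.mean((i - 1) / k * excess)  # I_2: weight (i-1)/k, zero for the maximum
    gamma_hat = (i1 - 4.0 * i2) / (i1 - 2.0 * i2)
    a_hat = 2.0 * i1 * i2 / (i1 - 2.0 * i2)
    theta_hat = y[n - 1 - k] - a_hat / gamma_hat if gamma_hat < 0 else None
    return gamma_hat, a_hat, theta_hat
```

On an error-free equally spaced grid on $(0, 1]$ (a discretized uniform distribution, for which $\gamma = -1$ and the endpoint is 1), this sketch returns an endpoint estimate close to 1. On contaminated data the same formula targets the tail of $Y$, not of $X$.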
Figure 2: Boxplots of estimated endpoints. [Three panels (i), (ii), (iii), each showing boxplots of $\hat\theta_g$ and $\hat\theta_{PWM}$ for two levels of $\sigma$: panel (i) $\sigma = 0.1$ and $\sigma = 0.2$; panel (ii) $\sigma = 2$ and $\sigma = 3$; panel (iii) $\sigma = 0.1$ and $\sigma = 0.2$. Vertical axis: estimated endpoints.]
Note: The three plots show the estimated endpoints using the suggested endpoint estimator and the PWM estimator for six data generating processes. Each plot is based on samples with sample size n=5. In panel (i), the observations are generated by combining the uniform distribution on $[-1, 0]$ with measurement errors following $N(0, \sigma^2)$, $\sigma = 0.1$ (left) or $\sigma = 0.2$ (right). In panel (ii), the observations are generated by combining the reversed Burr distribution in (b) with measurement errors following $N(0, \sigma^2)$, $\sigma = 2$ (left) or $\sigma = 3$ (right). In panel (iii), the observations are generated by combining the shifted Beta distribution in (c) with measurement errors following $N(0, \sigma^2)$, $\sigma = 0.1$ (left) or $\sigma = 0.2$ (right). The endpoints are estimated by $\hat\theta_g$ with k = and $\hat\theta_{PWM}$ in (7) with k = 2. Horizontal lines indicate the true endpoints.
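The pre-study described in this section (bias, standard deviation and RMSE of $\hat\theta_g$ across replications) can be sketched as a small Monte Carlo loop. The number of replications, the sample size and $k$ used below are placeholders chosen for this sketch and should not be read as the values used in the study; the data generating process is the uniform case (a) with true endpoint 0, and the function names are hypothetical.

```python
import numpy as np


def theta_hat_g(y, k):
    """Endpoint estimator (3) with the weight g(s) = -log(s)."""
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    i = np.arange(1, k + 1)
    spacings = y[n - 1 - i] - y[n - 1 - k]
    sigma_hat = np.sqrt(np.log(n / k) / 2.0) * np.mean(-np.log(i / k) * spacings)
    return y[-1] - sigma_hat * np.sqrt(2.0 * np.log(n))


def monte_carlo(r, n, k, sigma, seed=0):
    """Bias, standard deviation and RMSE of theta_hat_g over r samples,
    for X ~ Uniform[-1, 0] (true endpoint 0) plus N(0, sigma^2) errors."""
    rng = np.random.default_rng(seed)
    est = np.empty(r)
    for j in range(r):
        y = rng.uniform(-1.0, 0.0, n) + rng.normal(0.0, sigma, n)
        est[j] = theta_hat_g(y, k)
    bias = est.mean()                  # true endpoint is 0
    sd = est.std()                     # 1/r normalisation
    rmse = np.sqrt(np.mean(est ** 2))  # root mean squared error around 0
    return bias, sd, rmse


if __name__ == "__main__":
    # Placeholder settings for illustration only.
    for k in (50, 100, 200):
        print(k, monte_carlo(r=200, n=2000, k=k, sigma=0.1))
```

With the $1/r$ normalisation used here, the three quantities satisfy $\mathrm{RMSE}^2 = \mathrm{bias}^2 + \mathrm{SD}^2$ exactly, so plotting them against $k$ shows directly how the bias and variance trade off.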
4 Application
In order to investigate the limit of human beings in sports, Einmahl and Magnus (2008) applied endpoint estimation to the training data of top athletes. Initially, they gathered data for 28 types of sports. However, they stopped estimating the endpoint for five out of the 28 sports because the estimated extreme value indices, $\gamma$, for these five sports are close to 0.⁴ Continuing from their study, we shall apply our endpoint estimator to the outdoor long jump data (both men and women) for two reasons.

Firstly, from a theoretical perspective, if we assume that the observed training data are contaminated by normally distributed measurement errors, the extreme value index $\gamma$ for the observations should be equal to 0. This is in line with the empirical observations in Einmahl and Magnus (2008). We will justify this argument by repeating the estimation of the extreme value index $\gamma$ for the training data and testing whether it is significantly below zero. Different from Einmahl and Magnus (2008), we employ the PWM estimator for this analysis.

Secondly, for the outdoor long jump, the presence of wind can be a potential factor generating such a measurement error. To justify this argument, we shall apply our suggested estimator to the training data for the outdoor long jump, while comparing it with applying the PWM endpoint estimator to the training data for the indoor long jump. Assuming that the indoor long jump is much less affected by wind, we expect that the two endpoints estimated from these two different datasets agree with each other.

We collect the data from the official website of the International Association of Athletics Federations (IAAF)⁵ for the indoor and outdoor long jump, both for men and women. In total, we construct four datasets for these four sports. For each sport, the website presents the all time personal bests of the top athletes.
⁴ The five sports are, m running and outdoor long jump for both men and women, together with men's 4m running.
⁵ See

In addition, for indoor long jump (both men and women), the website provides the personal bests of the top athletes in each year from 1999 to 2016. Consequently, for each of these two sports, we combine the data across the aforementioned 19 lists. Similarly for men's outdoor long jump, we combine the data across 17 lists because the
records of two of the years are not available. For women's outdoor long jump, 18 lists are combined because the records of one year are missing. When combining the data for each sport, we keep only the best record for each athlete across the lists. Table 1 gives a summary of the number of athletes and the best and worst achievements for each of the four sports.

Since there are clusters of tied values in the present data, we smooth each dataset using the method suggested in Einmahl and Magnus (2008) as follows. For example, if $c$ athletes share the same personal best, $l = 8.47$ m, we smooth them by
$$l_i = 8.465 + 0.01\,\frac{2i-1}{2c}, \quad i = 1, \ldots, c.$$

Table 1: Data description
Columns: Long jump; men (Number, Best, Worst); women (Number, Best, Worst)
Rows: Outdoor; Indoor
Note: The table presents the descriptive information regarding the four datasets used in the application. The four datasets correspond to the indoor and outdoor long jump data for both men and women. Each dataset consists of top athletes' personal bests in the corresponding sport.

We start with estimating the extreme value index $\gamma$ by the PWM estimator. Figure 3 shows the estimates $\hat\gamma_{PWM}$ against various values of $k$. To balance the estimation bias and variance, we choose $k$ from the first stable region in each plot. Table 2 reports the chosen values of $k$ and the corresponding estimated $\gamma$ with its 95% confidence intervals. From the confidence intervals, we observe that for indoor long jump, $\gamma = 0$ is rejected for both men and women. Hence, the endpoints of these two distributions exist. By contrast, for outdoor long jump, we cannot reject $\gamma = 0$ for either men or women. These results agree with the finding in Einmahl and Magnus (2008) and support considering the outdoor data as contaminated by measurement errors.
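The smoothing of tied personal bests described earlier in this section can be sketched as follows. The sketch assumes that results are recorded to the nearest 0.01 m and that $c$ tied values are spread evenly over the interval of width 0.01 centred at the recorded value, which is our reading of the Einmahl and Magnus (2008) procedure; the function name `smooth_ties` and the parameter `step` are hypothetical.

```python
import numpy as np


def smooth_ties(values, step=0.01):
    """Spread tied records evenly over (l - step/2, l + step/2).

    For c tied values equal to l, the smoothed values are
    l - step/2 + step * (2i - 1) / (2c), i = 1, ..., c.
    A value that is not tied (c = 1) is left unchanged.
    """
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    # Group on integer multiples of the recording precision to avoid
    # relying on exact floating-point equality.
    keys = np.round(values / step).astype(int)
    for key in np.unique(keys):
        idx = np.flatnonzero(keys == key)
        c = idx.size
        i = np.arange(1, c + 1)
        out[idx] = key * step - step / 2.0 + step * (2 * i - 1) / (2.0 * c)
    return out
```

For two athletes tied at 8.47 m this gives 8.4675 and 8.4725, so the group mean is preserved while the order statistics become distinct.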
Next, we continue using the PWM method to estimate the endpoints for the indoor long jump data, while using our new estimator, $\hat\theta_g$, to estimate the endpoint for the outdoor long jump data. We plot the endpoint estimates against various values of $k$ in Figure 4. A technical difference between these two estimators concerns the sample size $n$. For the PWM estimator, the sample size $n$ is not used, whereas for our new method, it is necessary to know the sample size $n$ in advance. Obviously, a lower bound for $n$ is the current sample size, i.e. the number of top athletes included on the IAAF website. We are aware of the caveat that this number may underestimate the true number of top athletes who may potentially produce a similar performance. To address this caveat and test the sensitivity to $n$, we take an arbitrary value n = 3 which is much higher than the current sample sizes.

Table 2 presents the endpoint estimates with their 95% confidence intervals based on the selected values of $k$. Firstly, we observe that the values of $\hat\theta_g$ based on the outdoor long jump data are not sensitive to the sample size $n$, particularly after considering the estimation error reflected by the confidence intervals. Secondly, although the point estimates from applying our new estimator to the outdoor long jump data are consistently lower than those from applying the PWM method to the indoor long jump data, the two results agree with each other to a certain extent. The point estimates from applying our new estimator to the outdoor long jump data fall into the confidence interval based on applying the PWM method to the indoor long jump data, and vice versa.

Finally, we compare the results to the actual observations in the dataset.
Although the PWM estimator based on the indoor long jump data suggested that the endpoints for men's and women's long jump are at 8.79 and respectively, there are multiple observations in the outdoor long jump data that exceed those estimated endpoints: for men's long jump there are four observations higher than 8.79, while for women's long jump there are six observations higher than the estimated endpoint. Following our assumption, having an observation $Y$ above the endpoint of the distribution of $X$ must be due to a positive value of the measurement error $\varepsilon$. In the context of long jump, it implies that the wind should have helped in delivering these personal bests. The website of the IAAF provides the wind speed at the time when the long jump data were recorded:
for eight out of the nine cases, the recorded wind speeds were positive. The only exception is the 8.87 m set by Carl Lewis at the 1991 Tokyo World Championships, where the wind speed was recorded as -0.2 m/s. Aside from this exceptional case, the other eight cases support the view that the wind can be a potential factor causing measurement errors in the long jump performance.

Figure 3: Estimation of the extreme value indices. [Four panels: indoor long jump for men; indoor long jump for women; outdoor long jump for men; outdoor long jump for women. Vertical axis: $\hat\gamma_{PWM}$.]
Note: The plots show the estimated extreme value indices with the corresponding 95% confidence intervals for various values of $k$ using the PWM estimator.
Figure 4: Estimation of the endpoints. [Four panels: indoor long jump for men (vertical axis $\hat\theta_{PWM}$); indoor long jump for women (vertical axis $\hat\theta_{PWM}$); outdoor long jump for men, n = 776 (vertical axis $\hat\theta_g$); outdoor long jump for women, n = 76 (vertical axis $\hat\theta_g$).]
Note: The plots show the estimated endpoints with the corresponding 95% confidence intervals for various values of $k$. For the indoor long jump data, the endpoints are estimated by the PWM estimator $\hat\theta_{PWM}$. For the outdoor long jump data, the endpoints are estimated by the estimator $\hat\theta_g$.
Table 2: Estimation results: long jump data
Columns: men (n, Point, 95% C.I.); women (n, Point, 95% C.I.)
Outdoor, $\hat\gamma_{PWM}$: men [ -.28,.23 ]; women [ -.277,.9 ]
Outdoor, $\hat\theta_g$: men [ 7.8, 8.85 ]; women [ 6.97, ]
Outdoor, $\hat\theta_g$: men [ 7.528, ]; women [ 6.824, 7.48 ]
Indoor, $\hat\gamma_{PWM}$: men [ -.75, ]; women [ -.56, -. ]
Indoor, $\hat\theta_{PWM}$: men [ 8.226, ]; women [ 6.85, ]
Note: The table shows the estimation results based on the long jump data. For both indoor and outdoor long jump, the estimated extreme value indices using the PWM estimator are presented. For the outdoor long jump data, the endpoints are estimated using the estimator $\hat\theta_g$, setting the number of athletes n to either the current sample size or 3. For the indoor long jump data, the endpoints are estimated using the PWM estimator. For all estimates, the corresponding 95% confidence intervals are provided.
5 Conclusion
In this paper, we consider the estimation of the finite endpoint $\theta$ of a distribution function $F_X$. Instead of having observations drawn from $F_X$, we only observe a contaminated sample $Y_i = X_i + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $X_i$ follows the distribution $F_X$ and $\varepsilon_i$ is a measurement error following $N(0, \sigma^2)$. We start with proposing a class of estimators $\hat\sigma_g$ for $\sigma$, depending on an appropriate weighting function $g$ on $(0, 1]$. Then we suggest an estimator $\hat\theta_g = \max_{1\le i\le n} Y_i - \hat\sigma_g\sqrt{2\log n}$ for estimating the endpoint. Both the estimators $\hat\sigma_g$ and $\hat\theta_g$ possess asymptotic normality.

We demonstrate, by extensive simulation studies, the superior performance of our suggested estimator relative to that of the PWM estimator when the presence of measurement errors is ignored. In addition, we apply our suggested estimator to resolve the difficulty encountered by Einmahl and Magnus (2008): the estimated extreme value index is close to, and not significantly different from, zero. By assuming the presence of measurement errors stemming from the wind, we apply our suggested endpoint estimator to the outdoor long jump data. The results are comparable with those from applying the PWM estimator to the indoor long jump data, for which the impact of wind is negligible.

6 Proofs
Proof of Proposition 1. Write
$$\bar F_Z(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{0} e^{-\frac{(x-t)^2}{2\sigma^2}}\,\bar F_X(\theta+t)\,dt = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}} \int_{-\infty}^{0}\exp\left(\frac{tx}{\sigma^2} - \frac{t^2}{2\sigma^2}\right)\bar F_X(\theta+t)\,dt.$$
The proof is split into two parts. First, we show that as $x \to \infty$, the integral above will be dominated by only integrating in the neighborhood of zero, i.e., for some $\epsilon > 0$,
$$\bar F_Z(x) = \frac{e^{-\frac{x^2}{2\sigma^2}}}{\sqrt{2\pi}\,\sigma}\int_{-\epsilon}^{0}\exp\left(\frac{tx}{\sigma^2} - \frac{t^2}{2\sigma^2}\right)\bar F_X(\theta+t)\,dt\,\left[1 + O\left(x^{\alpha+1}e^{-\epsilon x/\sigma^2}\right)\right]. \qquad (8)$$
Then, we will calculate the integral on the right-hand side of (8).
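We first handle equation (8). Since
$$\int_{-\infty}^{-\epsilon}\exp\left(\frac{tx}{\sigma^2} - \frac{t^2}{2\sigma^2}\right)\bar F_X(\theta+t)\,dt \le \exp\left(-\frac{\epsilon x}{\sigma^2}\right)\int_{-\infty}^{-\epsilon}\exp\left(-\frac{t^2}{2\sigma^2}\right)dt \le \sqrt{2\pi}\,\sigma\exp\left(-\frac{\epsilon x}{\sigma^2}\right),$$
and
$$\int_{-\epsilon}^{0}\exp\left(\frac{tx}{\sigma^2} - \frac{t^2}{2\sigma^2}\right)\bar F_X(\theta+t)\,dt \ge e^{-\frac{\epsilon^2}{2\sigma^2}}\int_{-\epsilon}^{0}\exp\left(\frac{tx}{\sigma^2}\right)\bar F_X(\theta+t)\,dt,$$
(8) is proved by showing that, as $x \to \infty$,
$$\left\{\int_{-\epsilon}^{0}\exp\left(\frac{tx}{\sigma^2}\right)\bar F_X(\theta+t)\,dt\right\}^{-1} = O\left(x^{\alpha+1}\right). \qquad (9)$$
Denote $G(t) := \bar F_X(\theta - 1/t)$. Then as $t \to \infty$, $G(t) = Lt^{-\alpha} + dt^{-\alpha-\beta} + o\left(t^{-\alpha-\beta}\right)$. The integral in (9) is then calculated as
$$\int_{-\epsilon}^{0}\exp\left(\frac{tx}{\sigma^2}\right)\bar F_X(\theta+t)\,dt = \frac{\sigma^2}{x}\int_{\sigma^2/(\epsilon x)}^{\infty} e^{-1/t}\,t^{-2}\,G\left(\frac{tx}{\sigma^2}\right)dt$$
$$= \frac{\sigma^2}{x}\,G\left(\frac{x}{\sigma^2}\right)\left[\int_{\sigma^2/(\epsilon x)}^{\infty} e^{-1/t}\,t^{-2}\left(\frac{G(tx/\sigma^2)}{G(x/\sigma^2)} - t^{-\alpha}\right)dt + \int_{\sigma^2/(\epsilon x)}^{\infty} e^{-1/t}\,t^{-\alpha-2}\,dt\right] =: \frac{\sigma^2}{x}\,G\left(\frac{x}{\sigma^2}\right)\left[I_1 + I_2\right].$$
To deal with $I_1$, we use the fact that the function $G$ is regularly varying, i.e. $\lim_{t\to\infty} G(tu)/G(t) = u^{-\alpha}$. From Proposition B.1.10 in de Haan and Ferreira (2006), for any $\eta > 0$ there exists $x_0 = x_0(\eta)$ such that if $x/\sigma^2 \ge x_0$ and $tx/\sigma^2 \ge x_0$,
$$\left|\frac{G(tx/\sigma^2)}{G(x/\sigma^2)} - t^{-\alpha}\right| \le \eta\max\left(t^{-\alpha+\eta},\, t^{-\alpha-\eta}\right).$$
By choosing $\epsilon \le 1/x_0$, we have that for all $t > \sigma^2/(\epsilon x)$ and sufficiently large $x$ such that $x > \sigma^2 x_0$, the two required conditions hold. Therefore, we can apply the inequality above to obtain that
$$\int_{\sigma^2/(\epsilon x)}^{\infty} e^{-1/t}\,t^{-2}\left|\frac{G(tx/\sigma^2)}{G(x/\sigma^2)} - t^{-\alpha}\right|dt \le \eta\int_{0}^{\infty} e^{-1/t}\,t^{-2-\alpha}\left(t^{\eta} + t^{-\eta}\right)dt < \infty.$$
Thus we can apply the dominated convergence theorem to get that $I_1 \to 0$ as $x \to \infty$.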
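By verifying that $I_2 \to \Gamma(\alpha+1)$ as $x \to \infty$ and $\frac{\sigma^2}{x}G\left(\frac{x}{\sigma^2}\right) = O\left(x^{-\alpha-1}\right)$, we proved (9) and consequently (8), for any $\epsilon \le 1/x_0$.

Next we calculate the integral on the right-hand side of (8). We first perform a variable transformation such that the term in the exponential part, $\frac{tx}{\sigma^2} - \frac{t^2}{2\sigma^2}$, is replaced by a new term $-s$, i.e. we define $s = \frac{t^2}{2\sigma^2} - \frac{tx}{\sigma^2}$. This transformation is one to one for $t \in [-\epsilon, 0]$ with the inverse transformation
$$t = x - \sqrt{x^2 + 2s\sigma^2} =: -\frac{s\sigma^2}{x}\,\phi(s; x),$$
where
$$\phi(s; x) = 1 - \frac{s\sigma^2}{2x^2}\,(1 + o(1)), \qquad (10)$$
as $x \to \infty$, and the $o(1)$ term is uniform for all $0 \le s \le \epsilon(x+\epsilon/2)/\sigma^2$. Write
$$\int_{-\epsilon}^{0}\exp\left(\frac{tx}{\sigma^2} - \frac{t^2}{2\sigma^2}\right)\bar F_X(\theta+t)\,dt = \frac{\sigma^2}{x}\int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}\left(1 + \frac{2\sigma^2}{x^2}s\right)^{-1/2} G\left(\frac{x}{\sigma^2 s\phi(s; x)}\right)ds.$$
To calculate this integral, we again need a comparison between $G\left(\frac{x}{\sigma^2 s\phi(s;x)}\right)$ and $G\left(\frac{x}{\sigma^2}\right)$. However, here we need a second order expansion. Notice that the function $G$ satisfies the second order regular variation condition
$$\lim_{w\to\infty}\frac{\frac{G(wu)}{G(w)} - u^{-\alpha}}{-\frac{d\beta}{L}\,w^{-\beta}} = u^{-\alpha}\,\frac{u^{-\beta} - 1}{-\beta}.$$
By Theorem B.2.18 in de Haan and Ferreira (2006), for all $\eta > 0$, there exists $x_0 = x_0(\eta)$, such that for $w, wu \ge x_0$,
$$\left|\frac{\frac{G(wu)}{G(w)} - u^{-\alpha}}{A(w)} - u^{-\alpha}\,\frac{u^{-\beta} - 1}{-\beta}\right| \le \eta\max\left(u^{-\alpha-\beta+\eta},\, u^{-\alpha-\beta-\eta}\right),$$
where $A(w) := -\frac{d\beta}{L}\,w^{-\beta}$. We intend to apply this inequality with $w = \frac{x}{\sigma^2}$ and $wu = \frac{x}{\sigma^2 s\phi(s;x)}$. For sufficiently large $x$, $w = \frac{x}{\sigma^2} > x_0$. For $wu$, notice that $\frac{x}{\sigma^2 s\phi(s;x)} = -1/t \ge 1/\epsilon$. We get the required condition by choosing $\epsilon$ such that $\epsilon < 1/x_0$. Hence we consider the quantity
$$Z(s; x) := \frac{1}{A\left(\frac{x}{\sigma^2}\right)}\left[\frac{G\left(\frac{x}{\sigma^2 s\phi(s;x)}\right)}{G\left(\frac{x}{\sigma^2}\right)} - \left(s\phi(s; x)\right)^{\alpha}\right] - \left(s\phi(s; x)\right)^{\alpha}\,\frac{\left(s\phi(s; x)\right)^{\beta} - 1}{-\beta},$$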
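which, by the above inequality, satisfies
$$\left|Z(s; x)\right| \le \eta\max\left(\left(s\phi(s; x)\right)^{\alpha+\beta-\eta},\, \left(s\phi(s; x)\right)^{\alpha+\beta+\eta}\right). \qquad (11)$$
This inequality allows us to write the integral on the right-hand side of (8) as
$$\frac{\sigma^2}{x}\int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}\left(1 + \frac{2\sigma^2}{x^2}s\right)^{-1/2} G\left(\frac{x}{\sigma^2 s\phi(s; x)}\right)ds$$
$$= \frac{\sigma^2}{x}\,G\left(\frac{x}{\sigma^2}\right)\int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}\left(1 + \frac{2\sigma^2}{x^2}s\right)^{-1/2}\left(s\phi(s; x)\right)^{\alpha}ds$$
$$+ \frac{\sigma^2}{x}\,A\left(\frac{x}{\sigma^2}\right)G\left(\frac{x}{\sigma^2}\right)\int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}\left(1 + \frac{2\sigma^2}{x^2}s\right)^{-1/2}\left(s\phi(s; x)\right)^{\alpha}\,\frac{\left(s\phi(s; x)\right)^{\beta} - 1}{-\beta}\,ds$$
$$+ \frac{\sigma^2}{x}\,A\left(\frac{x}{\sigma^2}\right)G\left(\frac{x}{\sigma^2}\right)\int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}\left(1 + \frac{2\sigma^2}{x^2}s\right)^{-1/2} Z(s; x)\,ds$$
$$=: J_1 + J_2 + J_3.$$
In all three terms, we need to deal with integrals of the form
$$I(x; \nu) := \int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}\left(1 + \frac{2\sigma^2}{x^2}s\right)^{-1/2}\left(s\phi(s; x)\right)^{\nu}ds,$$
for $\nu > 0$. Notice that $\frac{2\sigma^2}{x^2}s \to 0$ and $\phi(s; x) \to 1$ as $x \to \infty$ hold uniformly for $0 \le s \le \epsilon(x+\epsilon/2)/\sigma^2$. By using the dominated convergence theorem, we get that as $x \to \infty$,
$$I(x; \nu) \to \int_{0}^{\infty} e^{-s}s^{\nu}\,ds = \Gamma(\nu+1). \qquad (12)$$
Further write $\left(1 + \frac{2\sigma^2}{x^2}s\right)^{-1/2} = 1 - \frac{\sigma^2}{x^2}s\,(1 + o(1))$, as $x \to \infty$, where the term $o(1)$ is again uniform for all $0 \le s \le \epsilon(x+\epsilon/2)/\sigma^2$. Then,
$$I(x; \nu) = \int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}\left(s\phi(s; x)\right)^{\nu}ds - \frac{\sigma^2}{x^2}\int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}\,s\left(s\phi(s; x)\right)^{\nu}ds\,(1 + o(1)) =: II_1 - II_2.$$
By applying the dominated convergence theorem again, we obtain that as $x \to \infty$, $x^2\, II_2 \to \sigma^2\,\Gamma(\nu+2)$. For $II_1$, we use the expansion of $\phi(s; x)$ in (10) to get that
$$II_1 = \int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}s^{\nu}\,ds - \frac{\nu\sigma^2}{2x^2}\int_{0}^{\epsilon(x+\epsilon/2)/\sigma^2} e^{-s}s^{\nu+1}\,ds\,(1 + o(1)),$$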
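that is,
$$II_1 = \Gamma(\nu+1) + o\left(x^{-2}\right) - \frac{\nu\sigma^2}{2x^2}\,\Gamma(\nu+2)\,(1 + o(1)) = \Gamma(\nu+1) - \frac{\nu\sigma^2}{2x^2}\,\Gamma(\nu+2)\,(1 + o(1)).$$
By combining the two terms we get a second order expansion of $I(x; \nu)$: as $x \to \infty$,
$$I(x; \nu) = \Gamma(\nu+1) - \frac{\sigma^2}{x^2}\left(\frac{\nu}{2} + 1\right)\Gamma(\nu+2)\,(1 + o(1)). \qquad (13)$$
By applying (13) with $\nu = \alpha$, we have that as $x \to \infty$,
$$J_1 = \frac{\sigma^2}{x}\,G\left(\frac{x}{\sigma^2}\right)\left\{\Gamma(\alpha+1) - \frac{\sigma^2}{x^2}\left(\frac{\alpha}{2} + 1\right)\Gamma(\alpha+2)\,(1 + o(1))\right\}.$$
By applying (12) with $\nu = \alpha$ and $\nu = \alpha+\beta$, we get that as $x \to \infty$,
$$J_2 = \frac{\sigma^2}{x}\,G\left(\frac{x}{\sigma^2}\right)A\left(\frac{x}{\sigma^2}\right)\frac{\Gamma(\alpha+\beta+1) - \Gamma(\alpha+1)}{-\beta}\,(1 + o(1)).$$
Lastly, based on the inequality (11) and (12), we get that $J_3 = o(J_2)$ as $x \to \infty$.

By combining all three terms, we obtain that as $x \to \infty$,
$$\int_{-\epsilon}^{0}\exp\left(\frac{tx}{\sigma^2} - \frac{t^2}{2\sigma^2}\right)\bar F_X(\theta+t)\,dt$$
$$= \frac{\sigma^2}{x}\,G\left(\frac{x}{\sigma^2}\right)\left\{\Gamma(\alpha+1) - \frac{\sigma^2}{x^2}\left(\frac{\alpha}{2} + 1\right)\Gamma(\alpha+2)\,(1 + o(1)) + A\left(\frac{x}{\sigma^2}\right)\frac{\Gamma(\alpha+\beta+1) - \Gamma(\alpha+1)}{-\beta}\,(1 + o(1))\right\}$$
$$= \frac{\sigma^2}{x}\,G\left(\frac{x}{\sigma^2}\right)\left\{\Gamma(\alpha+1) + \frac{\sqrt{2\pi}\,c_2}{L\sigma^{2\alpha+1}}\,x^{-\tilde\beta} + o\left(x^{-\tilde\beta}\right) - \frac{d}{L}\,\Gamma(\alpha+1)\,\sigma^{2\beta}x^{-\beta}\right\}$$
$$= Lx^{-\alpha-1}\sigma^{2\alpha+2}\left(1 + \frac{d}{L}\,\sigma^{2\beta}x^{-\beta} + o\left(x^{-\beta}\right)\right)\left\{\Gamma(\alpha+1) + \frac{\sqrt{2\pi}\,c_2}{L\sigma^{2\alpha+1}}\,x^{-\tilde\beta} + o\left(x^{-\tilde\beta}\right) - \frac{d}{L}\,\Gamma(\alpha+1)\,\sigma^{2\beta}x^{-\beta}\right\}$$
$$= \sqrt{2\pi}\,\sigma\,x^{-\alpha-1}\left(c_1 + c_2\,x^{-\tilde\beta} + o\left(x^{-\tilde\beta}\right)\right),$$
where $c_1$, $\tilde\beta$ and $c_2$ are defined as in the proposition. Finally, the proposition is proved by substituting this relation into (8).

Next we prove Theorem 2. Write the estimator $\hat\sigma_g$ in (2) as
$$\frac{\hat\sigma_g}{\sigma} = \log(n/k)\int_{1/k}^{1} g\left(\frac{[ks]}{k}\right)\frac{Z_{n-[ks],n} - Z_{n-k,n}}{\sigma\sqrt{2\log(n/k)}}\,ds.$$
We first establish the asymptotic properties of the tail quantile process $\{Z_{n-[ks],n},\ 1/k \le s \le 1\}$ as follows.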
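Proposition 3. Assume the same conditions as in Proposition 1. Then there exists a sequence of standard Brownian motions $\{W_n(s) : s \ge 0\}$ such that as $n \to \infty$,
$$\frac{Z_{n-[ks],n}}{\sigma\sqrt{2\log(n/k)}} = \psi_{1,n} + \psi_{2,n}(s) + \frac{k^{-1/2}}{2\log(n/k)}\left(\frac{W_n(s)}{s} + s^{-1/2-\delta}\,o_p(1)\right) + \frac{q(n/k)}{\log(n/k)}\,(1 + o_p(1)),$$
where
$$\psi_{1,n} = 1 - \frac{(\alpha+1)\log\log(n/k)}{4\log(n/k)} + \frac{c_3}{\log(n/k)}, \qquad c_3 = \frac{1}{2}\left[\log c_1 - (\alpha+1)\log\left(\sigma\sqrt{2}\right)\right],$$
$$\psi_{2,n}(s) = -\frac{\log s}{2\log(n/k)}\left(1 + \frac{(\alpha+1)\log\log(n/k)}{4\log(n/k)}\,(1 + o(1))\right),$$
$$q(n/k) = \begin{cases}\dfrac{\sigma^{-\tilde\beta}\,c_2}{2^{1+\tilde\beta/2}\,c_1}\left(\log(n/k)\right)^{-\tilde\beta/2}, & \tilde\beta < 2,\\[2ex] -\dfrac{(\alpha+1)^2}{32}\,\dfrac{\left(\log\log(n/k)\right)^2}{\log(n/k)}, & \tilde\beta = 2,\end{cases}$$
and all terms $o_p$ are uniform for $s \in [1/k, 1]$.

Proof. Denote $U = \left(-\log\bar F_Z\right)^{\leftarrow}$, where $\leftarrow$ denotes the left continuous inverse function. Write $Z_i = U(E_i)$ where $\{E_i\}_{i=1}^n$ is a sample of i.i.d. standard exponentially distributed random variables. Let $E_{1,n} \le E_{2,n} \le \cdots \le E_{n,n}$ be the corresponding order statistics. To obtain the asymptotic properties of $\{Z_{n-[ks],n},\ 1/k \le s \le 1\}$, we derive the expansion of $U(t)$ and the asymptotic properties of $\{E_{n-[ks],n},\ 1/k \le s \le 1\}$.

From the tail expansion of $-\log\bar F_Z$ in Proposition 1, we get that as $t \to \infty$,
$$t = \frac{U(t)^2}{2\sigma^2} + (\alpha+1)\log U(t) - \log c_1 - \frac{c_2}{c_1}\,U(t)^{-\tilde\beta}\,(1 + o(1)). \qquad (14)$$
Since $U(t) \to \infty$ as $t \to \infty$, it implies that $U(t) \sim \sigma\sqrt{2t}$. Write $U(t) = \sigma\sqrt{2t} + r(t)$, where $r(t) = o\left(\sqrt{t}\right)$. We intend to obtain a further explicit expression for the $r(t)$ term. Substituting $U(t)$ in (14) by $\sigma\sqrt{2t} + r(t)$ yields that, as $t \to \infty$,
$$0 = \frac{r^2(t) + 2\sigma\sqrt{2t}\,r(t)}{2\sigma^2} + \frac{\alpha+1}{2}\log t + (\alpha+1)\log\left(\sigma\sqrt{2}\right) - \log c_1 + (\alpha+1)\log\left(1 + \frac{r(t)}{\sigma\sqrt{2t}}\right) - \frac{c_2}{c_1}\,\sigma^{-\tilde\beta}(2t)^{-\tilde\beta/2}\,(1 + o(1)).$$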
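Since $r(t) = o(\sqrt{t})$ as $t \to \infty$, we can solve the equation obtained from this substitution to get

$$r(t) = -\frac{\sigma(\alpha+1)}{2\sqrt{2}}\,t^{-1/2}\log t + \frac{\sigma}{\sqrt{2}}\Bigl(\log c_1 - (\alpha+1)\log(\sigma\sqrt{2})\Bigr)t^{-1/2} + t^{-1/2}\,\tilde q(t),$$

where $\tilde q(t) = o(1)$ as $t \to \infty$. By reiterating this procedure, we eventually obtain that, as $t \to \infty$,

$$U(t) = \sigma\sqrt{2t} - \frac{\sigma(\alpha+1)}{2\sqrt{2}}\,t^{-1/2}\log t + \sigma\sqrt{2}\,c_3\,t^{-1/2} + t^{-1/2}\,\tilde q(t)\,(1+o(1)), \tag{15}$$

where

$$\tilde q(t) = \begin{cases} \dfrac{\sigma^{1-\beta}c_2}{2^{(1+\beta)/2}\,c_1}\,t^{-\beta/2}, & \beta < 2, \\[2ex] -\dfrac{\sigma(\alpha+1)^2}{16\sqrt{2}}\,t^{-1}\log^2 t, & \beta \ge 2. \end{cases}$$

Note that $\tilde q$ is related to the function $q$ defined in this proposition by $q(t) = \tilde q(\log t)/(\sqrt{2}\,\sigma)$.

Next we derive the asymptotic properties of $\{E_{n-[ks],n},\ 0 \le s \le 1\}$ from a theorem in de Haan and Ferreira (2006) as follows. For any $\delta > 0$, there exists a sequence of standard Brownian motions $\{W_n(s)\}_{s \ge 0}$ such that, as $n \to \infty$,

$$\sup_{0 \le s \le 1} s^{1/2+\delta}\left|\sqrt{k}\bigl(E_{n-[ks],n} - \log(n/k) + \log s\bigr) - s^{-1}W_n(s)\right| \xrightarrow{p} 0.$$

When plugging the asymptotic expansion of $E_{n-[ks],n}$ into $Z_{n-[ks],n} = U(E_{n-[ks],n})$, we observe that the asymptotic property of $Z_{n-[ks],n}/\sqrt{\log(n/k)}$ is mainly driven by $\bigl(E_{n-[ks],n}/\log(n/k)\bigr)^{1/2}$. In order to make a smooth substitution, we first derive the asymptotic behavior of $\bigl(E_{n-[ks],n}/\log(n/k)\bigr)^{\gamma}$ for any given $\gamma \in \mathbb{R}$. Write

$$\left(\frac{E_{n-[ks],n}}{\log(n/k)}\right)^{\gamma} = \left[1 + \frac{E_{n-[ks],n} - \log(n/k)}{\log(n/k)}\right]^{\gamma} = \left[1 + \frac{-\log s + k^{-1/2}\bigl(s^{-1}W_n(s) + o_p(1)\,s^{-1/2-\delta}\bigr)}{\log(n/k)}\right]^{\gamma}$$

$$= 1 + \gamma\,\frac{-\log s + k^{-1/2}\bigl(s^{-1}W_n(s) + o_p(1)\,s^{-1/2-\delta}\bigr)}{\log(n/k)} + \theta\left(\frac{-\log s + k^{-1/2}\bigl(s^{-1}W_n(s) + o_p(1)\,s^{-1/2-\delta}\bigr)}{\log(n/k)}\right)^{2},$$

where $\theta = \frac{1}{2}\,\frac{\partial^2}{\partial x^2}(1+x)^{\gamma}\big|_{x=\xi}$ for some $\xi$ between $0$ and $\frac{-\log s + k^{-1/2}(s^{-1}W_n(s) + o_p(1)\,s^{-1/2-\delta})}{\log(n/k)}$. Since, as $n \to \infty$, uniformly for all $1/k \le s \le 1$,

$$\frac{-\log s + k^{-1/2}\bigl(s^{-1}W_n(s) + o_p(1)\,s^{-1/2-\delta}\bigr)}{\log(n/k)} \xrightarrow{p} 0$$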
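and $\frac{\partial^2}{\partial x^2}(1+x)^{\gamma}$ is bounded in a neighborhood of zero, we get that, with probability tending to 1, $\theta$ is bounded uniformly for all $1/k \le s \le 1$. By verifying that the quadratic term

$$\left(\frac{-\log s + k^{-1/2}\bigl(s^{-1}W_n(s) + o_p(1)\,s^{-1/2-\delta}\bigr)}{\log(n/k)}\right)^{2}$$

can be uniformly written as $\frac{k^{-1/2}}{\log(n/k)}\,s^{-1/2-\delta}\,o_p(1)$, we get that, as $n \to \infty$,

$$\left(\frac{E_{n-[ks],n}}{\log(n/k)}\right)^{\gamma} = 1 + \gamma\,\frac{-\log s + k^{-1/2}\bigl(s^{-1}W_n(s) + o_p(1)\,s^{-1/2-\delta}\bigr)}{\log(n/k)},$$

where the $o_p$ term is uniform for all $1/k \le s \le 1$. For $\gamma = 0$, the relation should be read as

$$\log E_{n-[ks],n} - \log\log(n/k) = \frac{-\log s + k^{-1/2}\bigl(s^{-1}W_n(s) + o_p(1)\,s^{-1/2-\delta}\bigr)}{\log(n/k)}.$$

Finally, by plugging $E_{n-[ks],n}$ into $Z_{n-[ks],n} = U(E_{n-[ks],n})$, while using the expansion of $U$ in (15) and the asymptotic expansion of $\bigl(E_{n-[ks],n}/\log(n/k)\bigr)^{\gamma}$ for $\gamma = 1/2$, $-1/2$, and $-(1+\beta)/2$, we obtain the result of Proposition 3.

Remark. In the expansion of the tail quantile process, $\psi_{1,n}$ is a deterministic term not depending on $s$, $\psi_{2,n}$ is a deterministic term depending on $s$, the third component gives a random term, and finally $\frac{q(n/k)}{\log(n/k)}(1+o_p(1))$ has a uniform approximation independent of $s$. It will become clear in the proof of Theorem 2 that such a detailed expansion is necessary for achieving the intended asymptotic results.

We use Proposition 3 to prove the asymptotic normality of $\hat\sigma_g$.

Proof of Theorem 2. Since $[ks] = 1$ for $s \in [1/k, 2/k)$, we split the integral in the representation of $\hat\sigma_g$ given before Proposition 3 at $s = 2/k$ and write it as $I_1 + I_2$, where

$$I_1 := \frac{\log(n/k)}{\sqrt{k}}\;g(1/k)\;\frac{Z_{n-1,n} - Z_{n-k,n}}{\sigma\sqrt{2\log(n/k)}}, \qquad I_2 := \sqrt{k}\,\log(n/k)\int_{2/k}^{1} g([ks]/k)\,\frac{Z_{n-[ks],n} - Z_{n-k,n}}{\sigma\sqrt{2\log(n/k)}}\,ds.$$

From Proposition 3, we get that, as $n \to \infty$,

$$\frac{Z_{n-[ks],n} - Z_{n-k,n}}{\sigma\sqrt{2\log(n/k)}} = \frac{-\log s + k^{-1/2}\bigl(s^{-1}W_n(s) - W_n(1)\bigr) + k^{-1/2}\,s^{-1/2-\delta}\,o_p(1)}{2\log(n/k)}$$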
25 holds uniformly for / s. α + log logn/ log s + o + qn/ 4 logn/ 2 logn/ logn/ o p, 6 First, for I, replacing s by in 6, yields that as n, I = logn/ log + /2 W n / + δ o p log log logn/ /2 g / + 2 logn/ logn/ 2 O + qn/ logn/ o p. Using the modulus of continuity for W n / and note that from the condition 5, g / = o /2 ɛ for any δ < ɛ, as n, we then get from the above equation that I = /2 g/ δ O p = δ ɛ o p = o p. Next, we deal with I 2. By multiplying both sides of 6 with logn/gs and taing an integral on the interval [2/, ], we get that as n, I2 = logn/ gs Z n [s],n Z n,n 2/ σ ds 2 logn/ = gs log s ds + 2 2/ 2 gs s W n s W n ds + o p gss /2 δ ds 2/ 2/ α + log logn/ + gs log s ds + o + qn/o p 4 logn/ 2 2/ = + 2 gs s W n s W n ds + o p + α + log logn/ + o + qn/o p. 4 logn/ 2/ gs ds. To obtain the last equality, we used the condition 5 to derive the following facts. First, both gss /2 δ and gs are integrable on [,] for any δ < ɛ. Second, as n, 2/ gs s W n s W n 2/ ds s /2+ɛ s W n s W n ds = o p. Lastly, as, 2/ gs log s ds 2/ s /2+ɛ log s ds = 25 log log 2 + e /2 ɛ t t de t
26 2 + e ɛt t dt = o. log log 2 Finally, the condition 4 implies that log logn// logn/ and qn/ is bounded as n, which leads to the asymptotic property of I2 : as n, I 2 gs s W n s W n ds p. 2 We remar that the difference between I 2 and I 2 is of an order /2 o p. This is shown by using the condition 6 as follows. Write I2 I2 logn/ g[s]/ gs Z n [s],n Z n,n 2/ σ ds 2 logn/ g[s]/ gs log s + /2 O p s /2 δ ds 2/ g[s]/ 2/ log[s]/ gs log s log log s s + /2 O p s /2 δ ds + log[s]/ log s g[s]/ 2/ log[s]/ sup gs s t /,/ s,t log s gt log t log s 2/ + log[s]/s g[s]/ 2/ log[s]/ log s ds + O p log[s]/s g[s]/ log[s]/ s /2 δ ds 2/ =: I 2 + I 22 + I 23. The condition 6 implies that as n, I 2 = o p. Next, for I 22, notice that log[s]/s = log {s} s the condition 5 implies that there exists c such that gs < c s /2+ɛ applying the condition 6 again, we get that as n. log s + /2 O p s /2 δ ds log s + /2 O p s /2 δ ds c s < for s 2/. In addition, I 22 log[s]/s gs ds + o log[s]/s log sds 2/ 2/ 26 for all s, ]. By
27 c 2/ c c 2 /2 ɛ /2 c /2 s s /2+ɛ ds + o ɛ /2 2/ 2/ c log sds s s +ɛ /2 ds + o c 2. Lastly, for I 23, notice that for all 2/ s <, log[s]/ max log s, log/ and s 2 < [s]/ < s implies that g[s]/ < c [s]/ /2+ɛ < c 2 s /2+ɛ. Hence, as n c I 23 O p 2/ s c 2s +ɛ δ max log s, log/ ds / c O p s c 2s +ɛ δ log s ds + log/ 2/ Since as n, log/, / s 2+ɛ δ ds and / c s c 2s +ɛ δ ds. lim / 2/ s 2+ɛ δ log s ds [ / 2+ɛ δ = lim log / 2 + 2/ 2+ɛ δ log2/ ] 2 2 =, we obtain that I 23 as n. Combining the three terms, we have shown that as n, I2 I 2 = o p, which leads to I2 2 gs s W ns W n ds p. The theorem is thus proved by combining the two parts I and I 2 and further calculating the asymptotic variance. Proof of Corollary. Write ˆθg θ = σ Zn,n 2 log n σ 2 log n σ ˆσg 2 σ =: I I 2. Theorem 2 gives the asymptotic property of I 2. Hence we only need to show that I n. Since as n, E n,n log n exp{ e x }, we get that for γ, γ En,n log n + Op γ = = + γ log logn/ logn/ logn/ + o p. d 27 p, as
Plugging this expansion into the function $U$ given in (15) yields that

$$\frac{Z_{n,n}}{\sigma\sqrt{2\log(n/k)}} = \frac{U(E_{n,n})}{\sigma\sqrt{2\log(n/k)}} = \psi_{1,n} + \frac{\log k}{2\log(n/k)}\,(1+o_p(1)),$$

as $n \to \infty$. The corollary is proved by verifying that $\psi_{1,n} \to 1$ and $\log k/\log(n/k) \to 0$ as $n \to \infty$, which are implied by condition (6).

References

[1] Aarssen, K. and L. de Haan (1994). On the maximal life span of humans. Mathematical Population Studies 4.
[2] Athreya, K.B. and J. Fukuchi (1997). Confidence intervals for endpoints of a c.d.f. via bootstrap. Journal of Statistical Planning and Inference 58.
[3] Carroll, R.J. and P. Hall (1988). Optimal rates of convergence for deconvolving a density. Journal of the American Statistical Association 83.
[4] Cazals, C., J.P. Florens and L. Simar (2002). Nonparametric frontier estimation: a robust approach. Journal of Econometrics 106, 1-25.
[5] Einmahl, J.H. and J.R. Magnus (2008). Records in athletics through extreme-value theory. Journal of the American Statistical Association 103.
[6] Girard, S., A. Guillou and G. Stupfler (2012). Estimating an endpoint with high-order moments. Test 21.
[7] Goldenshluger, A. and A. Tsybakov (2004). Estimating the endpoint of a distribution in the presence of additive observation errors. Statistics and Probability Letters 68.
[8] de Haan, L. and A. Ferreira (2006). Extreme Value Theory: An Introduction. Springer Series in Operations Research and Financial Engineering. New York: Springer.
[9] Hall, P. (1982). On estimating the endpoint of a distribution. The Annals of Statistics 10.
[10] Hall, P. and L. Simar (2002). Estimating a changepoint, boundary, or frontier in the presence of observation error. Journal of the American Statistical Association 97.
[11] Hall, P. and J.Z. Wang (1999). Estimating the end-point of a probability distribution using minimum-distance methods. Bernoulli 5.
[12] Hall, P. and J.Z. Wang (2005). Bayesian likelihood methods for estimating the endpoint of a distribution. Journal of the Royal Statistical Society: Series B 67(5).
[13] Kneip, A., L. Simar and I. Van Keilegom (2015). Frontier estimation in the presence of measurement error with unknown variance. Journal of Econometrics 184.
[14] Meister, A. (2006). Density estimation with normal measurement error with unknown variance. Statistica Sinica 16.
[15] Meister, A. and M.H. Neumann (2010). Deconvolution from non-standard error densities under replicated measurements. Statistica Sinica 20.
[16] Stefanski, L. and R.J. Carroll (1990). Deconvoluting kernel density estimators. Statistics 21.
The high order moments method in endpoint estimation: an overview
1/ 33 The high order moments method in endpoint estimation: an overview Gilles STUPFLER (Aix Marseille Université) Joint work with Stéphane GIRARD (INRIA Rhône-Alpes) and Armelle GUILLOU (Université de
More informationAN ASYMPTOTICALLY UNBIASED MOMENT ESTIMATOR OF A NEGATIVE EXTREME VALUE INDEX. Departamento de Matemática. Abstract
AN ASYMPTOTICALLY UNBIASED ENT ESTIMATOR OF A NEGATIVE EXTREME VALUE INDEX Frederico Caeiro Departamento de Matemática Faculdade de Ciências e Tecnologia Universidade Nova de Lisboa 2829 516 Caparica,
More informationJournal of Statistical Planning and Inference
Journal of Statistical Planning Inference 39 9 336 -- 3376 Contents lists available at ScienceDirect Journal of Statistical Planning Inference journal homepage: www.elsevier.com/locate/jspi Maximum lielihood
More informationDiscussion on Human life is unlimited but short by Holger Rootzén and Dmitrii Zholud
Extremes (2018) 21:405 410 https://doi.org/10.1007/s10687-018-0322-z Discussion on Human life is unlimited but short by Holger Rootzén and Dmitrii Zholud Chen Zhou 1 Received: 17 April 2018 / Accepted:
More informationEstimation de mesures de risques à partir des L p -quantiles
1/ 42 Estimation de mesures de risques à partir des L p -quantiles extrêmes Stéphane GIRARD (Inria Grenoble Rhône-Alpes) collaboration avec Abdelaati DAOUIA (Toulouse School of Economics), & Gilles STUPFLER
More informationA PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS
Statistica Sinica 20 2010, 365-378 A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS Liang Peng Georgia Institute of Technology Abstract: Estimating tail dependence functions is important for applications
More informationOn estimating extreme tail. probabilities of the integral of. a stochastic process
On estimating extreme tail probabilities of the integral of a stochastic process Ana Ferreira Instituto uperior de Agronomia, UTL and CEAUL Laurens de Haan University of Tilburg, Erasmus University Rotterdam
More informationExtreme L p quantiles as risk measures
1/ 27 Extreme L p quantiles as risk measures Stéphane GIRARD (Inria Grenoble Rhône-Alpes) joint work Abdelaati DAOUIA (Toulouse School of Economics), & Gilles STUPFLER (University of Nottingham) December
More informationPitfalls in Using Weibull Tailed Distributions
Pitfalls in Using Weibull Tailed Distributions Alexandru V. Asimit, Deyuan Li & Liang Peng First version: 18 December 2009 Research Report No. 27, 2009, Probability and Statistics Group School of Mathematics,
More informationSequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process
Applied Mathematical Sciences, Vol. 4, 2010, no. 62, 3083-3093 Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Julia Bondarenko Helmut-Schmidt University Hamburg University
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationExceedance probability of the integral of a stochastic process
Exceedance probability of the integral of a stochastic process Ana Ferreira IA, Universidade Técnica de Lisboa and CEAUL Laurens de Haan University of Tilburg, Erasmus University Rotterdam and CEAUL Chen
More informationInference on distributions and quantiles using a finite-sample Dirichlet process
Dirichlet IDEAL Theory/methods Simulations Inference on distributions and quantiles using a finite-sample Dirichlet process David M. Kaplan University of Missouri Matt Goldman UC San Diego Midwest Econometrics
More informationAnalysis methods of heavy-tailed data
Institute of Control Sciences Russian Academy of Sciences, Moscow, Russia February, 13-18, 2006, Bamberg, Germany June, 19-23, 2006, Brest, France May, 14-19, 2007, Trondheim, Norway PhD course Chapter
More informationA Note on the Scale Efficiency Test of Simar and Wilson
International Journal of Business Social Science Vol. No. 4 [Special Issue December 0] Abstract A Note on the Scale Efficiency Test of Simar Wilson Hédi Essid Institut Supérieur de Gestion Université de
More informationExtreme Value Theory and Applications
Extreme Value Theory and Deauville - 04/10/2013 Extreme Value Theory and Introduction Asymptotic behavior of the Sum Extreme (from Latin exter, exterus, being on the outside) : Exceeding the ordinary,
More informationEstimation of the functional Weibull-tail coefficient
1/ 29 Estimation of the functional Weibull-tail coefficient Stéphane Girard Inria Grenoble Rhône-Alpes & LJK, France http://mistis.inrialpes.fr/people/girard/ June 2016 joint work with Laurent Gardes,
More informationDoes k-th Moment Exist?
Does k-th Moment Exist? Hitomi, K. 1 and Y. Nishiyama 2 1 Kyoto Institute of Technology, Japan 2 Institute of Economic Research, Kyoto University, Japan Email: hitomi@kit.ac.jp Keywords: Existence of moments,
More informationA NOTE ON SECOND ORDER CONDITIONS IN EXTREME VALUE THEORY: LINKING GENERAL AND HEAVY TAIL CONDITIONS
REVSTAT Statistical Journal Volume 5, Number 3, November 2007, 285 304 A NOTE ON SECOND ORDER CONDITIONS IN EXTREME VALUE THEORY: LINKING GENERAL AND HEAVY TAIL CONDITIONS Authors: M. Isabel Fraga Alves
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationThe Convergence Rate for the Normal Approximation of Extreme Sums
The Convergence Rate for the Normal Approximation of Extreme Sums Yongcheng Qi University of Minnesota Duluth WCNA 2008, Orlando, July 2-9, 2008 This talk is based on a joint work with Professor Shihong
More informationResearch Article Strong Convergence Bound of the Pareto Index Estimator under Right Censoring
Hindawi Publishing Corporation Journal of Inequalities and Applications Volume 200, Article ID 20956, 8 pages doi:0.55/200/20956 Research Article Strong Convergence Bound of the Pareto Index Estimator
More informationMethod of Conditional Moments Based on Incomplete Data
, ISSN 0974-570X (Online, ISSN 0974-5718 (Print, Vol. 20; Issue No. 3; Year 2013, Copyright 2013 by CESER Publications Method of Conditional Moments Based on Incomplete Data Yan Lu 1 and Naisheng Wang
More informationSemi-parametric tail inference through Probability-Weighted Moments
Semi-parametric tail inference through Probability-Weighted Moments Frederico Caeiro New University of Lisbon and CMA fac@fct.unl.pt and M. Ivette Gomes University of Lisbon, DEIO, CEAUL and FCUL ivette.gomes@fc.ul.pt
More informationON EXTREME VALUE ANALYSIS OF A SPATIAL PROCESS
REVSTAT Statistical Journal Volume 6, Number 1, March 008, 71 81 ON EXTREME VALUE ANALYSIS OF A SPATIAL PROCESS Authors: Laurens de Haan Erasmus University Rotterdam and University Lisbon, The Netherlands
More informationVariable inspection plans for continuous populations with unknown short tail distributions
Variable inspection plans for continuous populations with unknown short tail distributions Wolfgang Kössler Abstract The ordinary variable inspection plans are sensitive to deviations from the normality
More informationHigh Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data
High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department
More informationMath 494: Mathematical Statistics
Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/
More informationQualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama
Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Instructions This exam has 7 pages in total, numbered 1 to 7. Make sure your exam has all the pages. This exam will be 2 hours
More informationj=1 r 1 x 1 x n. r m r j (x) r j r j (x) r j (x). r j x k
Maria Cameron Nonlinear Least Squares Problem The nonlinear least squares problem arises when one needs to find optimal set of parameters for a nonlinear model given a large set of data The variables x,,
More informationFrontier estimation based on extreme risk measures
Frontier estimation based on extreme risk measures by Jonathan EL METHNI in collaboration with Ste phane GIRARD & Laurent GARDES CMStatistics 2016 University of Seville December 2016 1 Risk measures 2
More informationSMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES
Statistica Sinica 19 (2009), 71-81 SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Song Xi Chen 1,2 and Chiu Min Wong 3 1 Iowa State University, 2 Peking University and
More informationOptimal Jackknife for Unit Root Models
Optimal Jackknife for Unit Root Models Ye Chen and Jun Yu Singapore Management University October 19, 2014 Abstract A new jackknife method is introduced to remove the first order bias in the discrete time
More informationTail negative dependence and its applications for aggregate loss modeling
Tail negative dependence and its applications for aggregate loss modeling Lei Hua Division of Statistics Oct 20, 2014, ISU L. Hua (NIU) 1/35 1 Motivation 2 Tail order Elliptical copula Extreme value copula
More informationSmooth nonparametric estimation of a quantile function under right censoring using beta kernels
Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Chanseok Park 1 Department of Mathematical Sciences, Clemson University, Clemson, SC 29634 Short Title: Smooth
More informationOverview of Extreme Value Theory. Dr. Sawsan Hilal space
Overview of Extreme Value Theory Dr. Sawsan Hilal space Maths Department - University of Bahrain space November 2010 Outline Part-1: Univariate Extremes Motivation Threshold Exceedances Part-2: Bivariate
More information5 Introduction to the Theory of Order Statistics and Rank Statistics
5 Introduction to the Theory of Order Statistics and Rank Statistics This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order
More informationChange Point Analysis of Extreme Values
Change Point Analysis of Extreme Values TIES 2008 p. 1/? Change Point Analysis of Extreme Values Goedele Dierckx Economische Hogeschool Sint Aloysius, Brussels, Belgium e-mail: goedele.dierckx@hubrussel.be
More informationESTIMATING BIVARIATE TAIL
Elena DI BERNARDINO b joint work with Clémentine PRIEUR a and Véronique MAUME-DESCHAMPS b a LJK, Université Joseph Fourier, Grenoble 1 b Laboratoire SAF, ISFA, Université Lyon 1 Framework Goal: estimating
More informationNonlinear Error Correction Model and Multiple-Threshold Cointegration May 23, / 31
Nonlinear Error Correction Model and Multiple-Threshold Cointegration Man Wang Dong Hua University, China Joint work with N.H.Chan May 23, 2014 Nonlinear Error Correction Model and Multiple-Threshold Cointegration
More informationSmooth simultaneous confidence bands for cumulative distribution functions
Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationLecture 3. Inference about multivariate normal distribution
Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates
More informationInequalities Relating Addition and Replacement Type Finite Sample Breakdown Points
Inequalities Relating Addition and Replacement Type Finite Sample Breadown Points Robert Serfling Department of Mathematical Sciences University of Texas at Dallas Richardson, Texas 75083-0688, USA Email:
More informationHIERARCHICAL MODELS IN EXTREME VALUE THEORY
HIERARCHICAL MODELS IN EXTREME VALUE THEORY Richard L. Smith Department of Statistics and Operations Research, University of North Carolina, Chapel Hill and Statistical and Applied Mathematical Sciences
More informationRefining the Central Limit Theorem Approximation via Extreme Value Theory
Refining the Central Limit Theorem Approximation via Extreme Value Theory Ulrich K. Müller Economics Department Princeton University February 2018 Abstract We suggest approximating the distribution of
More informationThe circular law. Lewis Memorial Lecture / DIMACS minicourse March 19, Terence Tao (UCLA)
The circular law Lewis Memorial Lecture / DIMACS minicourse March 19, 2008 Terence Tao (UCLA) 1 Eigenvalue distributions Let M = (a ij ) 1 i n;1 j n be a square matrix. Then one has n (generalised) eigenvalues
More informationEstimating a frontier function using a high-order moments method
1/ 16 Estimating a frontier function using a high-order moments method Gilles STUPFLER (University of Nottingham) Joint work with Stéphane GIRARD (INRIA Rhône-Alpes) and Armelle GUILLOU (Université de
More information1 Hypothesis Testing and Model Selection
A Short Course on Bayesian Inference (based on An Introduction to Bayesian Analysis: Theory and Methods by Ghosh, Delampady and Samanta) Module 6: From Chapter 6 of GDS 1 Hypothesis Testing and Model Selection
More informationCan we do statistical inference in a non-asymptotic way? 1
Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.
More informationNon-parametric Inference and Resampling
Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing
More informationNontrivial Solutions for Boundary Value Problems of Nonlinear Differential Equation
Advances in Dynamical Systems and Applications ISSN 973-532, Volume 6, Number 2, pp. 24 254 (2 http://campus.mst.edu/adsa Nontrivial Solutions for Boundary Value Problems of Nonlinear Differential Equation
More informationBias Reduction in the Estimation of a Shape Second-order Parameter of a Heavy Right Tail Model
Bias Reduction in the Estimation of a Shape Second-order Parameter of a Heavy Right Tail Model Frederico Caeiro Universidade Nova de Lisboa, FCT and CMA M. Ivette Gomes Universidade de Lisboa, DEIO, CEAUL
More informationMathematics Qualifying Examination January 2015 STAT Mathematical Statistics
Mathematics Qualifying Examination January 2015 STAT 52800 - Mathematical Statistics NOTE: Answer all questions completely and justify your derivations and steps. A calculator and statistical tables (normal,
More informationSTAT 512 sp 2018 Summary Sheet
STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}
More informationThe Goodness-of-fit Test for Gumbel Distribution: A Comparative Study
MATEMATIKA, 2012, Volume 28, Number 1, 35 48 c Department of Mathematics, UTM. The Goodness-of-fit Test for Gumbel Distribution: A Comparative Study 1 Nahdiya Zainal Abidin, 2 Mohd Bakri Adam and 3 Habshah
More informationParametric Techniques Lecture 3
Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to
More informationUPPER DEVIATIONS FOR SPLIT TIMES OF BRANCHING PROCESSES
Applied Probability Trust 7 May 22 UPPER DEVIATIONS FOR SPLIT TIMES OF BRANCHING PROCESSES HAMED AMINI, AND MARC LELARGE, ENS-INRIA Abstract Upper deviation results are obtained for the split time of a
More informationLecture 8: Information Theory and Statistics
Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang
More informationA Bayesian perspective on GMM and IV
A Bayesian perspective on GMM and IV Christopher A. Sims Princeton University sims@princeton.edu November 26, 2013 What is a Bayesian perspective? A Bayesian perspective on scientific reporting views all
More informationA Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints
Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note
More informationMFM Practitioner Module: Quantitiative Risk Management. John Dodson. October 14, 2015
MFM Practitioner Module: Quantitiative Risk Management October 14, 2015 The n-block maxima 1 is a random variable defined as M n max (X 1,..., X n ) for i.i.d. random variables X i with distribution function
More informationWorst case analysis for a general class of on-line lot-sizing heuristics
Worst case analysis for a general class of on-line lot-sizing heuristics Wilco van den Heuvel a, Albert P.M. Wagelmans a a Econometric Institute and Erasmus Research Institute of Management, Erasmus University
More informationON THE TAIL INDEX ESTIMATION OF AN AUTOREGRESSIVE PARETO PROCESS
Discussiones Mathematicae Probability and Statistics 33 (2013) 65 77 doi:10.7151/dmps.1149 ON THE TAIL INDEX ESTIMATION OF AN AUTOREGRESSIVE PARETO PROCESS Marta Ferreira Center of Mathematics of Minho
More informationIEOR E4703: Monte-Carlo Simulation
IEOR E4703: Monte-Carlo Simulation Output Analysis for Monte-Carlo Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com Output Analysis
More informationSTAT Sample Problem: General Asymptotic Results
STAT331 1-Sample Problem: General Asymptotic Results In this unit we will consider the 1-sample problem and prove the consistency and asymptotic normality of the Nelson-Aalen estimator of the cumulative
More informationSingle Index Quantile Regression for Heteroscedastic Data
Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University SMAC, November 6, 2015 E. Christou, M. G. Akritas (PSU) SIQR
More informationA Note on Tail Behaviour of Distributions. the max domain of attraction of the Frechét / Weibull law under power normalization
ProbStat Forum, Volume 03, January 2010, Pages 01-10 ISSN 0974-3235 A Note on Tail Behaviour of Distributions in the Max Domain of Attraction of the Frechét/ Weibull Law under Power Normalization S.Ravi
More informationShape of the return probability density function and extreme value statistics
Shape of the return probability density function and extreme value statistics 13/09/03 Int. Workshop on Risk and Regulation, Budapest Overview I aim to elucidate a relation between one field of research
More informationEconomics 583: Econometric Theory I A Primer on Asymptotics
Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:
More informationMathematics Ph.D. Qualifying Examination Stat Probability, January 2018
Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from
More informationA Conditional Approach to Modeling Multivariate Extremes
A Approach to ing Multivariate Extremes By Heffernan & Tawn Department of Statistics Purdue University s April 30, 2014 Outline s s Multivariate Extremes s A central aim of multivariate extremes is trying
More informationGoodness-of-fit tests for the cure rate in a mixture cure model
Biometrika (217), 13, 1, pp. 1 7 Printed in Great Britain Advance Access publication on 31 July 216 Goodness-of-fit tests for the cure rate in a mixture cure model BY U.U. MÜLLER Department of Statistics,
More informationNOTES ON EXISTENCE AND UNIQUENESS THEOREMS FOR ODES
NOTES ON EXISTENCE AND UNIQUENESS THEOREMS FOR ODES JONATHAN LUK These notes discuss theorems on the existence, uniqueness and extension of solutions for ODEs. None of these results are original. The proofs
More informationA Closer Look at the Hill Estimator: Edgeworth Expansions and Confidence Intervals
A Closer Look at the Hill Estimator: Edgeworth Expansions and Confidence Intervals Erich HAEUSLER University of Giessen http://www.uni-giessen.de Johan SEGERS Tilburg University http://www.center.nl EVA
More informationExperience Rating in General Insurance by Credibility Estimation
Experience Rating in General Insurance by Credibility Estimation Xian Zhou Department of Applied Finance and Actuarial Studies Macquarie University, Sydney, Australia Abstract This work presents a new
More informationNonparametric estimation of tail risk measures from heavy-tailed distributions
Nonparametric estimation of tail risk measures from heavy-tailed distributions Jonthan El Methni, Laurent Gardes & Stéphane Girard 1 Tail risk measures Let Y R be a real random loss variable. The Value-at-Risk