Estimating a frontier function using a high-order moments method

1/ 16
Estimating a frontier function using a high-order moments method
Gilles STUPFLER (University of Nottingham)
Joint work with Stéphane GIRARD (INRIA Rhône-Alpes) and Armelle GUILLOU (Université de Strasbourg)
EMS 2017 Conference, Helsinki, Finland, 24th July 2017

2/ 16 Outline
- Context
- Constructing the estimator
- Asymptotic results
- Finite-sample results
- Discussion

3/ 16 Context
We assume that Y is a univariate random variable recorded along with a finite-dimensional covariate X. Suppose that Y given X = x has a finite right endpoint g(x):
$$g(x) := \sup\{y \in \mathbb{R} : P(Y \le y \mid X = x) < 1\} < \infty.$$
We address the problem of estimating the frontier function $x \mapsto g(x)$.
Practical relevance:
- Temperature/wind speed as a function of 2D/3D coordinates.
- Performance in athletics as a function of age.
- Life span as a function of socioeconomic status.
- Production level as a function of input.

4/ 16
Specifically, assume that the distribution of (X, Y) has support
$$S = \{(x, y) \in \Omega \times \mathbb{R} : 0 \le y \le g(x)\}$$
where
- X has a pdf f on the compact subset $\Omega$ of $\mathbb{R}^d$ having nonempty interior $\mathrm{int}(\Omega)$;
- g is a positive Borel measurable function on $\Omega$.
We consider pointwise estimation of the function g on $\mathrm{int}(\Omega)$, given an n-sample of i.i.d. replications of (X, Y).

5/ 16
Figure 1: Frontier g (solid line), data points (+). The sample size is n = 200.

6/ 16 Some existing methods
- Extreme-value based estimators: Geffroy (1964), Gardes (2002), Girard and Jacob (2003a, 2003b, 2004), Girard and Menneteau (2005), Menneteau (2008).
- Optimization methods and linear programming: Bouchard et al. (2004, 2005), Girard et al. (2005), Nazin and Girard (2014).
- Piecewise polynomial estimators: Korostelev and Tsybakov (1993), Korostelev et al. (1995), Härdle et al. (1995).
- Projection estimators: Jacob and Suquet (1995).
If g is nondecreasing and concave:
- DEA/FDH estimators and improvements: Deprins et al. (1984), Farrell (1957), Gijbels et al. (1999).
- Robust estimators: Aragon et al. (2005), Cazals et al. (2002), Daouia and Simar (2005), Daouia et al. (2012).
- Local MLE (with random noise): Aigner et al. (1976), Fan et al. (1996), Kumbhakar et al. (2007), Simar and Zelenyuk (2011).

7/ 16 Constructing the estimator
Assume first that Y is a positive random variable with finite right endpoint θ. Denote $\mu_p := E(Y^p)$.
Proposition 1: It holds that $\mu_p / \mu_{p+1} \to 1/\theta$ as $p \to \infty$.
Given data points $Y_1, \ldots, Y_n$, this opens a number of ways to estimate θ from the class of empirical high-order moments
$$\hat\mu_p = \frac{1}{n} \sum_{k=1}^{n} Y_k^p \quad \text{with } p = p_n.$$
However, the most direct of such estimators, namely $\hat\theta_n = \hat\mu_{p_n+1} / \hat\mu_{p_n}$, is in practice too biased to be used.
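
As a rough numerical illustration (a minimal sketch with made-up names, not code from the authors), the naive moment ratio can be computed as follows; the toy sample is drawn from a distribution with survival function $(1 - y/\theta)^\alpha$, so its right endpoint is θ.

```python
import numpy as np

def naive_endpoint_estimate(y, p):
    """Naive high-order moments endpoint estimate mu_hat_{p+1} / mu_hat_p.

    Motivated by Proposition 1 (mu_p / mu_{p+1} -> 1/theta as p -> infinity),
    but typically too biased to be used directly.
    """
    y = np.asarray(y, dtype=float)
    return np.mean(y ** (p + 1)) / np.mean(y ** p)

# Toy illustration: Y = theta * (1 - U**(1/alpha)) has survival function
# P(Y > y) = (1 - y/theta)**alpha on [0, theta], hence right endpoint theta.
rng = np.random.default_rng(0)
theta, alpha = 2.0, 3.0
y = theta * (1.0 - rng.uniform(size=1000) ** (1.0 / alpha))
print(naive_endpoint_estimate(y, p=25))  # noticeably below theta = 2
```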

8/ 16
A better alternative is given by, for some tuning constant a > 0,
$$\frac{1}{\hat\theta_n} = \frac{1}{a p_n} \left[ ((a+1)p_n + 1) \frac{\hat\mu_{(a+1)p_n}}{\hat\mu_{(a+1)p_n+1}} - (p_n + 1) \frac{\hat\mu_{p_n}}{\hat\mu_{p_n+1}} \right].$$
This estimator is motivated by the elimination of the bias term when the survival function of Y is
$$\forall y \in [0, \theta], \quad \bar F(y) := P(Y > y) = \left(1 - \frac{y}{\theta}\right)^{\alpha}.$$
High-order moments make it possible to control the bias brought by general survival functions with polynomial decay near the endpoint.
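
A sketch of this bias-reduced version, under the same illustrative naming conventions as above (again, not the authors' code):

```python
import numpy as np

def hom_endpoint_estimate(y, p, a=1.0):
    """Bias-reduced high-order moments endpoint estimate with tuning constant a > 0.

    Implements
      1/theta_hat = (1/(a p)) [ ((a+1)p + 1) mu_hat_{(a+1)p} / mu_hat_{(a+1)p+1}
                                - (p + 1) mu_hat_p / mu_hat_{p+1} ].
    With population moments and survival function (1 - y/theta)**alpha,
    the bracket equals a*p/theta exactly, which is the bias-elimination idea.
    """
    y = np.asarray(y, dtype=float)
    mu = lambda q: np.mean(y ** q)  # empirical q-th order moment
    bracket = (((a + 1) * p + 1) * mu((a + 1) * p) / mu((a + 1) * p + 1)
               - (p + 1) * mu(p) / mu(p + 1))
    return a * p / bracket
```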

9/ 16 High-order moments frontier estimator
Our previous construction suggests the following estimator of g(x):
$$\frac{1}{\hat g_n(x)} = \frac{1}{a p_n} \left[ ((a+1)p_n + 1) \frac{\hat\mu_{(a+1)p_n}(x)}{\hat\mu_{(a+1)p_n+1}(x)} - (p_n + 1) \frac{\hat\mu_{p_n}(x)}{\hat\mu_{p_n+1}(x)} \right]$$
where $\hat\mu_p(x)$ is a well-behaved estimator of the conditional pth order moment $\mu_p(x) := E(Y^p \mid X = x)$. We choose this estimator to be the smoothed estimator
$$\hat\mu_{p,h_n}(x) := \frac{1}{n h_n^d} \sum_{k=1}^{n} Y_k^p \, K\!\left(\frac{x - X_k}{h_n}\right).$$
Here, K is a kernel function, i.e. a bounded pdf on $\mathbb{R}^d$ with support included in the unit Euclidean ball $B \subset \mathbb{R}^d$, and $h_n > 0$ is a bandwidth sequence that converges to 0.

10/ 16
Our (kernel) estimator $\hat g_n(x)$ of g(x) is then defined by
$$\hat g_n(x) = a p_n \left[ ((a+1)p_n + 1) \frac{\hat\mu_{(a+1)p_n, h_n}(x)}{\hat\mu_{(a+1)p_n+1, h_n}(x)} - (p_n + 1) \frac{\hat\mu_{p_n, h_n}(x)}{\hat\mu_{p_n+1, h_n}(x)} \right]^{-1}.$$
For ease of exposition, assume that we work in the parametric setting
(P) $\forall y \in [0, g(x)], \quad \bar F(y \mid x) = (1 - y/g(x))^{-1/\gamma(x)}$, with $\gamma(x) < 0$;
(A) f and g are positive, and f, g and γ are Hölder continuous on $\Omega$ with respective exponents $\eta_f$, $\eta_g$ and $\eta_\gamma$.
A departure from (P) is actually allowed (if we stay within the Hall class).
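
A minimal one-dimensional sketch of the resulting kernel estimator (illustrative function names, a biweight kernel and crude fixed choices of p and h; none of this is prescribed by the slides):

```python
import numpy as np

def kernel_moment(x, X, Y, p, h, kernel=None):
    """Kernel-smoothed conditional moment estimate mu_hat_{p,h}(x), for d = 1."""
    if kernel is None:
        # Biweight kernel: a bounded pdf supported on the unit ball [-1, 1]
        kernel = lambda u: np.where(np.abs(u) <= 1, 0.9375 * (1 - u ** 2) ** 2, 0.0)
    w = kernel((x - X) / h)
    return np.mean(w * Y ** p) / h

def frontier_estimate(x, X, Y, p, h, a=1.0):
    """High-order moments frontier estimate g_hat_n(x) with kernel-smoothed moments."""
    m = lambda q: kernel_moment(x, X, Y, q, h)
    bracket = (((a + 1) * p + 1) * m((a + 1) * p) / m((a + 1) * p + 1)
               - (p + 1) * m(p) / m(p + 1))
    return a * p / bracket

# Example use on simulated 1-D data (X, Y) from a frontier model:
# x_grid = np.linspace(0.05, 0.95, 50)
# g_hat = np.array([frontier_estimate(x, X, Y, p=10, h=0.1) for x in x_grid])
```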

11/ 16 Asymptotic results
Theorem 1 (Pointwise consistency, frontier estimator): If $n p_n^{1/\gamma(x)} h_n^d \to \infty$ and $p_n h_n^{\eta_g} \to 0$, then $\hat g_n(x) \xrightarrow{P} g(x)$.
Theorem 2 (Asymptotic normality, frontier estimator): If $n p_n^{1/\gamma(x)} h_n^d \to \infty$, $n p_n^{2+1/\gamma(x)} h_n^{d+2\eta_g} \to 0$ and $n p_n^{1/\gamma(x)} h_n^{d+2\eta_\gamma} \to 0$, then
$$\sqrt{n p_n^{2+1/\gamma(x)} h_n^d} \left( \frac{\hat g_n(x)}{g(x)} - 1 \right) \xrightarrow{d} \mathcal{N}\!\left(0, \frac{\int_B K^2(u)\,du}{f(x)} \, V(\gamma(x), a)\right),$$
where $V(\gamma, a)$ is explicitly known.
Uniform consistency results on compact subsets $E \subset \mathrm{int}(\Omega)$ are also available.

12/ 16 Finite-sample results
We examined the finite-sample performance of the estimator depending on the value of:
- the frontier function g (smooth or not),
- the extreme-value index function γ (constant or not),
- the dimension d = 1 or 2.
We also checked for robustness against a violation of model (P), and we compared the estimator to:
- the block maxima estimator of Geffroy (1964),
- a primitive version of the high-order moments method, constructed for a conditional uniform model by Girard and Jacob (2008).
In general, the estimator $\hat g_n$ significantly outperformed these competitors w.r.t. the $L^1$ metric, even though the choices of a, $p_n$ and $h_n$ were crude.
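
To give an idea of how such an experiment can be set up, here is a small self-contained simulation sketch; the index function γ matches the one reported in the figures below, while the frontier g and all names are purely illustrative assumptions, not the design actually used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)

def g(x):
    """An illustrative smooth frontier (not the one used in the slides)."""
    return 1.5 + 0.5 * np.sin(2 * np.pi * x)

def gamma(x):
    """Index gamma(x) = -2 / (2.5 + cos(2*pi*x)), negative as required by model (P)."""
    return -2.0 / (2.5 + np.cos(2 * np.pi * x))

def simulate(n):
    """Draw (X, Y) from model (P): P(Y > y | X = x) = (1 - y/g(x))**(-1/gamma(x))."""
    X = rng.uniform(size=n)
    U = rng.uniform(size=n)
    # Inverting the conditional survival function gives Y = g(X) * (1 - U**(-gamma(X)))
    Y = g(X) * (1.0 - U ** (-gamma(X)))
    return X, Y

def l1_error(g_hat, grid):
    """Approximate L1 distance between an estimate g_hat and the true frontier g."""
    return np.mean(np.abs(g_hat(grid) - g(grid)))

# Example (using frontier_estimate from the previous sketch):
# X, Y = simulate(500)
# grid = np.linspace(0.05, 0.95, 50)
# est = lambda t: np.array([frontier_estimate(x, X, Y, p=10, h=0.1) for x in t])
# print(l1_error(est, grid))
```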

13/ 16
Figure 2: Case $\gamma(x) = -2[2.5 + \cos(2\pi x)]^{-1}$: frontier function g (solid line), high-order moments estimate $\hat g_n$ (dotted line) corresponding to the best result among 500 replications of a sample of size 500.

14/ 16
Figure 3: Case $\gamma(x) = -2[2.5 + \cos(2\pi x)]^{-1}$: frontier function g (solid line), high-order moments estimate $\hat g_n$ (dotted line) corresponding to the worst result among 500 replications of a sample of size 500.

15/ 16 Discussion
Conclusions:
- High-order moments provide a class of interesting devices when it comes to estimating an endpoint/frontier function.
- The order $p_n$ is a substitute for the effective sample size $k_n$ of extreme-value methods.
- The presented estimator has satisfactory finite-sample performance even with fairly simple choices of tuning parameters.
Some ideas for further studies:
- Development of data-driven procedures for choosing the tuning parameters;
- Construction of outlier-resistant high-order moments procedures;
- Building estimators adapted to high-dimensional data sets.

16/ 16 References
High-order moments estimator for a frontier function:
Girard, S., Guillou, A., Stupfler, G. (2013). Frontier estimation with kernel regression on high order moments, Journal of Multivariate Analysis 116: 172-189.
Uniform versions of the asymptotic results:
Girard, S., Guillou, A., Stupfler, G. (2014). Uniform strong consistency of a frontier estimator using kernel regression on high order moments, ESAIM: Probability and Statistics 18: 642-666.
Thanks for listening!