Wolfgang K. Härdle Vladimir Spokoiny Weining Wang Ladislaus von Bortkiewicz Chair of Statistics C.A.S.E. - Center for Applied Statistics and Economics Humboldt-Universität zu Berlin http://ise.wiwi.hu-berlin.de http://www.case.hu-berlin.de
Sea urchin growth modeled using quantile regression D (mm) 6 5 D (mm) 5 6 7 t (years) Figure : Individual growth over time in a population, Grosjean, PH, Ch. Spirlet M. Jangoux,
logratio -.8 -.6 -. -......6.8. rangesca Figure : observations, a light detection and ranging (LIDAR) experiment, X: range distance travelled before the light is reflected back to its source, Y: logarithm of the ratio of received light from two laser sources.
.5.5.5.5.5.5 Motivation - Quantile Estimation 8 x 5 6 6 8 5 Figure : Forecasts of Value at Risk
.5.5.5.5.5.5 Motivation - Conditional Quantile Regression Median regression = mean regression (symmetric) Describes the conditional behavior of a response Y in a broader spectrum... Example Financial Market & Econometrics VaR (Value at Risk) tool Detect conditional heteroskedasticity...
.5.5.5.5.5.5 Motivation - Example Labor Market Analyze income of football players w.r.t. different ages, years, and countries, etc. To detect discrimination effects, need split other effects at first. log (Income) = A(year, age, education, etc) +β B(gender, nationality, union status, etc) + ε...
.5.5.5.5.5.5 Motivation - Parametric e.g. polynomial model Interior point method, Koenker and Bassett (978), Koenker and Park (996), Portnoy and Koenker (996). Nonparametric model Check function approach, Fan et al. (99), Yu and Jones (997, 998). Double-kernel technique, Fan et al. (996). Weighted NW estimation, Hall et al. (999), Cai ().
.5.5.5.5.5.5 Motivation -5 Global Smoothing Parameter is not sensible to local variations of quantile curve Adaptive approach to choose local smoothing parameter Local changing point method for Volatility model, Spokoiny (9). Local model selection method, Katkovnik and Spokoiny (7).
Outline. Motivation. Basic Setup. Parametric Exponential Bounds. Calibration by "Propagation" Condition 5. "Small Modeling Bias" and Propagation Properties 6. "Stability" and "Oracle" Property 7. Simulation 8. Application 9. Further work
.5.5.5.5.5.5 Basic Setup - Basic Setup {(X i, Y i )} n i= i.i.d. rv s, pdf f (x, y), f (y x), cdf F (x, y), F (y x), and marginal f X, f Y, x J = [, ] R d, y R l(x) = F Y x (τ) τ-quantile regression curve l n (x) quantile-smoother
.5.5.5.5.5.5 Basic Setup - Quantile & ALD Location Model Y i F (y), l τ = F (τ) = θ = Asymmetric Laplace Distribution (ALD) Y i = θ + ε i, ε i ALD τ (, ) p(u, ) = τ( τ) exp{ ρ τ (u)} l(u) = log p(u, ) = log{τ( τ)} ρ τ (u) ρ τ (u) def = τu{u (, )} ( τ)u{u (, )}
.5.5.5.5.5.5 Basic Setup - Check Function.5.5 - - Figure : Check function for τ=.9, τ=.5 and weight function in conditional mean regression
.5.5.5.5.5.5 Basic Setup - Quasi-likelihood Approach θ = arg min E θ,λ (.) ρ τ (Y i θ) θ θ = n arg min L(θ) def = arg min { log τ( τ) + ρ τ (Y i θ)} θ θ i= λ + (y) = (τ y) log{τ P(Y > y)}, y > λ (y) = {( τ) y} log{( τ) P(Y < y)}, y < where we assume λ +( ) (y) as y (heavy tail distribution), λ (.) tail function of true P.
.5.5.5.5.5.5 Basic Setup -5 l(x) minimizes (w.r.t. θ) L(θ) = E l(.),λ (.){ρ τ (Y θ) X = x} l n (x) minimizes (w.r.t. θ) n L n (θ) = n ρ τ (Y i θ)k h(x) (x X i ) () i= K h(x) (u) = h(x) K{u/h(x)} is a kernel with bandwidth h(x), which is determined by a local adaptive algorithm.
.5.5.5.5.5.5 Parametric Exponential Bounds - First Step to Happiness
.5.5.5.5.5.5 Parametric Exponential Bounds - Parametric Exponential Bounds Define rate function: M(θ, θ ) def = log E θ,λ(.) exp{µ(θ)l(θ, θ )} () Let θ =, and θ >, we can derive, for µ(θ) = τ λ+ (θ): M(θ, θ ) log{ + θλ + (θ)} + τ θλ + (θ) () τ
.5.5.5.5.5.5 Parametric Exponential Bounds - Parametric Exponential Bounds Golubev Y., Spokoiny V. (9) V (θ) H(λ, γ, θ) def = E θ ζ(θ)[ ζ(θ)] def = log E θ,λ(.) exp{λ γ ζ(θ) γ V (θ)γ }, where ζ(θ) def = l(θ, θ ) E θ,λ(.) l(θ, θ ).
.5.5.5.5.5.5 Parametric Exponential Bounds - Parametric Exponential Bounds (ED) There exists λ > such that for some ν uniformly in θ Θ sup sup λ H(λ, γ, θ) ν γ S p λ λ Theorem Under (ED), we have, for prescribed ρ <, s <, E θ,λ(.) exp[ρ{µ( θ)l( θ, θ )+sm( θ, θ )}] C( ρ) / ( s) / with fixed constant C provided that n exceeds some minimal sample size.
.5.5.5.5.5.5 Parametric Exponential Bounds -5 Parametric Exponential Bounds Obtain exponential risk bound for the likelihood function: E θ,λ(.) exp{ρµ( θ)l( θ, θ )} C( ρ).5 L( θ, θ ) = n τ ( θ θ ) + n { Y i θ Y i θ } i=
.5.5.5.5.5.5 Parametric Exponential Bounds -6 Parametric Exponential Bounds This further implies: E θ,λ(.) L( θ, θ ) r τ r
.5.5.5.5.5.5 Calibration by "Propagation" Condition - Adaptation Scale Fix x, we have a sequence of ordered weights W (k) = (w (k), w (k),..., w n (k) ). Define w (k) i = K hk (x X i ), (h < h <... < h K ). LPA: l(x i ) l θ (X i ) θ k = argmax θ def = argmax θ n i= l{y i, l θ (X i )}w (k) i L(W (k), θ)
.5.5.5.5.5.5 Calibration by "Propagation" Condition - LMS Procedure for Quantile Construct an estimate ˆθ = ˆθ(x), on the base of θ, θ,..., θ K. Start with ˆθ = θ. For k, θ k is accepted and ˆθ k = θ k if θ k was accepted and L(W (l), θ l, θ k ) z l, l =,..., k ˆθ k is the the latest accepted estimate after the first k steps.
.5.5.5.5.5.5 Local Model Selection procedure Calibration by "Propagation" Condition - LMS procedure LMS: Illustration Illustration CS θ k + θ θ θ θ k * k * k + Stop
.5.5.5.5.5.5 Calibration by "Propagation" Condition - "Propagation" Condition Theorem Suppose (W (k), θ) for k k. Then E θ,λ(.) L(W (k), θ k, ˆθ k ) r τ r (W (k) ) α, where k, the "oracle" choice.
.5.5.5.5.5.5 Calibration by "Propagation" Condition -5 Sequential Choice of Critical Values Consider first only z letting z =... = z K =. Leads to the estimates ˆθ k (z ) for k =,..., K. The value z is selected as the minimal one for which L{W (k), θ k, ˆθ k (z )} r sup E θ,λ(.) θ τ r (W (k) ) α, k =,..., K. K Set z k+ =... = z K = and fix z k lead the set of parameters z,..., z k,,..., and the estimates ˆθ m (z,..., z k ) for m = k +,..., K. Select z k s.t. L{W (k), sup E θ m, ˆθ m (z, z,..., z k )} r θ,λ(.) θ τ r (W (k) ) m = k +,..., K. kα K,
.5.5.5.5.5.5 Calibration by "Propagation" Condition -6 xi[-] 6 8 6 8 Index Figure 5: Critical value with λ(.) corresponding to standard normal distribution, ρ =., r =.5, τ =.75.
.5.5.5.5.5.5 Calibration by "Propagation" Condition -7 Bounds for Critical Values a) For some constants < u, u <, it holds < u h k /h k u <. In addition, we assume the parameter set Θ satisfy the condition: b) For some constants a with < a <, and for any θ and θ Θ, a < θ /θ < a Theorem Suppose that r >, α >. Under the assumptions of probability bounds, there are a, a, a, s.t. the propagation condition is fulfilled with the choice of z k = a r log(ρ ) + a r log(h K /h k ) + a log(nh k ).
.5.5.5.5.5.5 "Small Modeling Bias" and Propagation Properties 5- "Small Modeling Bias" Condition λ(.),λ (.)(W (k), θ) = n i= K{P l(xi ),λ (.), P lθ (X i ),λ(.)}{w (k) i > } "Small Modeling Bias" Condition: λ,λ (.)(W (k), θ), k < k
.5.5.5.5.5.5 "Small Modeling Bias" and Propagation Properties 5- "Small Modeling Bias" Property Then, it holds for any estimate θ k and θ satisfies "SMB": E l(.),λ (.) log{ + L(W (k), θ k, θ) r /τ r (W (k) )} +,
.5.5.5.5.5.5 "Small Modeling Bias" and Propagation Properties 5- "Stability" Property Stability: the attained quality of estimation during "propagation" can not get lost at further steps. Theorem L(W (k ), θ k, ˆθˆk){ˆk > k } z k
.5.5.5.5.5.5 "Small Modeling Bias" and Propagation Properties 5- "Oracle" Property Combing the "propagation" and "stability" results yields Theorem Let (W (k), θ) for some θ Θ and k k. Then E l(.),λ (.) log { + L(W (k ), θ k, θ) r } τ r (W (k ) + ) E l(.),λ (.) log { + L(W (k ), θ k, ˆθ) r } τ r (W (k ) + α ) z k + log{ + τ r (W (k ) ) }
.5.5.5.5.5.5 Monte Carlo Simulation 6-..5.7.9 6 8-6 8 6 8 Figure 6: The bandwidth sequence (upper panel), the data with exp() noise (blue line), the adaptive estimation of.75 quantile (red line), the quantile smoother with fixed optimal bandwidth =.6 (yellow)
.5.5.5.5.5.5 Monte Carlo Simulation 6- - 5-5 Figure 7: The block residuals of the fixed bandwidth estimation (upper panel)and the adaptive estimation (lower panel)
.5.5.5.5.5.5 Monte Carlo Simulation 6-..5.7.9 6 8 - - 6 8 Figure 8: The bandwidth sequence (upper panel), the data with standard normal noise(blue line), the adaptive estimation of.75 quantile (red line)
.5.5.5.5.5.5 Monte Carlo Simulation 6-..5.7 6 8-5 5 6 8 Figure 9: The bandwidth sequence (upper panel), the data with t() noise (blue line), the adaptive estimation of.75 quantile (red line)
.5.5.5.5.5.5 Monte Carlo Simulation 6-5..5.7.9 6 8 5 6 8 Figure : The bandwidth sequence (upper panel), the data with exp() noise (blue line), the adaptive estimation of.75 quantile (red line)
.5.5.5.5.5.5 Application 7- -.5 -. -.5..5..5 5 6 7 8 9 5 7 9 Figure : Y: HK stock cheung kong returns, X: HK stock clpholdings returns, daily // /6/
.5.5.5.5.5.5 Application 7- q[ ].6.7.8.9 6 8 6 8 q.85.95.5.5 q.85.95.5.5 _...5.6....6.8..85.9.95..5..5 Figure : Plot of bandwidth sequence (upper left), plot of quantile smoother (upper left), scatter plot on scaled X (lower left), original scale (lower right)
.5.5.5.5.5.5 Application 7-.75.8.85.9 6 8 -.8 -.6 -. -......6.8. Figure : observations, a light detection and ranging (LIDAR) experiment, X: range distance travelled before the light is reflected back to its source, Y: logarithm of the ratio of received light from two laser sources.
.5.5.5.5.5.5 Application 7- References P. Bickel and M. Rosenblatt On some global measures of the deviation of density function estimatiors Ann. Statist., :7-95, 97. Z. W. Cai Regression quantiles for time series Econometric Theory, 8:69-9,. J. Fan and T. C. Hu and Y. K. Troung Robust nonparametric function estimation Scandinavian Journal of Statistics, :-6, 99.
.5.5.5.5.5.5 Application 7-5 References J. Fan and Q. Yao and H. Tong Estimation of conditional densities and sensitivity measures in nonlinear dynamical systems Biometrika, 8:89-6, 996. J. Franke and P. Mwita Nonparametric Estimates for conditional quantiles of time series Report in Wirtschaftsmathematik 87, University of Kaiserslautern,. P. Hall and R.C.L. Wolff and Q. Yao Methods for estimating a conditional distribution function J. Amer. Statist. Assoc., 9:5-6, 999.
.5.5.5.5.5.5 Application 7-6 References W. Härdle, P. Janssen and R. Serfling Strong uniform consistency rates for estimators of conditional functionals Ann. Statist., 6:8-9 988. W. Härdle and S. Luckhaus Uniform consistency of a class of regression function estimators Ann. Statist., :6-6 98. X. He and H. Liang Quantile regression estimates for a class of linear and partially linear errors-in-variables model Ann. Statist., :9-.
.5.5.5.5.5.5 Application 7-7 References K. Jeong and W. Härdle A Consistent Nonparametric Test for Causality in Quantile SFB 69 Discussion Paper, 8. G. Johnston Probabilities of maximal deviations of nonparametric regression function estimation J. Multivariate Anal., :-, 98. R. Koenker and G. W. Bassett Regression quantiles Econometrica, 6:-5, 978.
.5.5.5.5.5.5 Application 7-8 References R. Koenker and B. J. Park An interior point algorithm for nonlinear quantile regression Journal of Econometrics, 7:65-8 996. M. G. Lejeune and P. Sarda Quantile regression: A nonparametric approach Computational Statistics & Data Analysis, 6:9-9 988. K. Murphy and F. Welch Empirical age-earnings profiles Journal of Labor Economics, 8():-9 99.
.5.5.5.5.5.5 Application 7-9 References S. Portnoy and R. Koenker The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute -error estimatiors (with discussion) Statistical Sciences, :79-, 997. E. Rio Vitesses de convergence dans le principe d invariance faible pour la fonction de répartition empirique multivariée (Rates of convergence in the invariance principle for the multivariate empirical distribution function) C. r. Acad. sci., Sér., Math., ():69-7, 996.
.5.5.5.5.5.5 Application 7- References V. Spokoiny Local parametric estimation Heildelberg: Springer Verlag, 8 G. Tusnady A remark on the approximation of the sample distribution function in the multidimensional case Period. Math. Hungar., 8:5-55, 977. K. Yu and M. C. Jones A comparison of local constant and local linear regression quantile estimation Computational Statistics and Data Analysis, 5:59-66, 997.
.5.5.5.5.5.5 Application 7- References K. Yu and M. C. Jones Local linear quantile regression J. Amer. Statist. Assoc., 9:8-7, 998. K. Yu, Z. Lu and J. Stander Quantile regression: applications and current research areas J. Roy. Statistical Society: Series D (The Statistician), 5:-5,. L. X. Zhu, R. Q. Zhu and S. Song Diagnostic checking for multivariate regression models J. Multivariate Anal., accepted.