arxiv: v1 [stat.me] 25 Mar 2019

Size: px
Start display at page:

Download "arxiv: v1 [stat.me] 25 Mar 2019"

Transcription

1 -Divergence loss for the kernel density estimation with bias reduced Hamza Dhaker a,, El Hadji Deme b, and Youssou Ciss b a Département de mathématiques et statistique,université de Moncton, NB, Canada b LERSTAD,UFR SAT, Universite Gaston Berger, Saint-Louis, Senegal arxiv: v [stat.me] 5 Mar 09 Abstract Allthough nonparametric kernel density estimation with bias reduce is nowadays a standard technique in explorative data-analysis, there is still a big dispute on how to assess the quality of the estimate and which choice of bandwidth is optimal. This article examines the most important bandwidth selection methods for kernel density estimation with bias reduce, in particular, normal reference, least squares cross-validation, biased crossvalidation and -Divergence loss. Methods are described and expressions are presented. We will compare these various bandwidth selector on simulated data. As an example of real data, we will use econometric data sets CO per capita in example and the second data set consists of 07 eruption lengths in minutes for the Old Faithful geyser in Yellowstone National Park, USA. Keywords: -Divergence; Kernel Density Estimation; bandwidth.. ntroduction Selecting an appropriate bandwidth for a kernel density estimator is of crucial importance, and the purpose of the estimation may be an influential factor in the selection method. n many situations, it is sufficient to subjectively choose the smoothing parameter by looking at the density estimates produced by a range of bandwidths. A good overview on kernel density estimators is supplied by Silverman [8]; Scott [7]; Mugdadi and Ahmad [4]. Throughout this article, we use the following notation. Let X,..., X n, of size n from a density f where the Parzen [5] kernel estimator f n is defined by f n = n K( x X i ) nh h i= n general, K is the kernel function (e.g. normal density function) and h is the bandwidth. A large body of research is devoted to choosing h, essentially the amount of smoothing to apply. Usually, smoothing parameters can be chosen via cross validation or by minimizing a measure of error. An important recent paper in this area is Xie and Wu []. Xie and Wu studied an estimator which reduces the bias, the performance of which in both theory and simulations proved to be clearly superior to other methods currently popular in the literature. Xie and Wu [] presented an estimator which reduces the bias, defined by: n = f n Bias( f n ) () = f n h f n t K(t)dt () the bandwidth is the most dominant parameter in the kernel density estimator. This parameter controls the amount of smoothing, and is analogous to the bandwidth in a histogram. Even though the kernel estimator depends on the Corresponding author address: hamza.dhaker@umoncton.ca (Hamza Dhaker) Preprint submitted to Journal Name March 6, 09

2 kernel and the bandwidth in a rather complicated way, a graphical representation clearly illustrates the difference in importance between these two parameters, see figure.3 and.6a in Wand and Jones [0]. To explore the most relevant bandwidth selection methods in density estimation for complete data see the reviews of Turlach [9], Cao et al. [3], Jones et al. [8] or Heidenreich et al. (03), Mammen et al. ([] and []), and the recent work on -Divergence for Bandwidth Selection by Dhaker and al. [5]. Our aim in this paper is to propose and compare several bandwidth selection procedures for the kernel density estimators introduced by Xie and Wu []. The procedures we study are bandwidth selector based on the criterion of -divergence with different beta values. A simulation study is then carried out to assess the finite sample behavior of these bandwidth selectors. The remainder of the paper is organised as follows. n Section, we state our main results: presents the method proposed for bandwidth selector based D. Section 3 estimation of the optimal bandwidth Section 4 is devoted to our simulation results. Section 5 applies the methods to real data. We conclude the paper in Section 6,. Bandwidth selection based -divergence The -divergence [Cichocki [4], Basu [] and Eguchi [6]] is a general framework of similarity measures induced from various statistical models, such as Poisson, Gamma, Gaussian, nverse Gaussian and compound Poisson distribution. For the connection between the -divergence and the various statistical distributions, see Jorgensen [9]. Beta divergence was proposed in Basu [] and Minami and Eguchi [3] and is defined as dissimilarity between the density function and its estimator D ( ˆ f n, f ) = n dx n f dx ( ) f dx. n the case =, D ( n, f ) = S E( n ) = ( n f ) dx. The following theorem allows us to give the analytical value of bandwidth which minimizes the mean D ( ˆ f n, f ). Theorem. Let the following conditions on f be satisfied: (F) f is compactly supported on. (F) f is four times continuously differentiable on. (F3) f (4) f dx <. the bandwidth h ED that minimizes the mean -divergence between a kernel estimator f h and density f is: Proposition. Under (F) (F3) we have ED ( n, f ) = ( ) h8 t 4 K(t)dt 576 K(t) h = h ED = 7 dt f dx [ t4 K(t)dt ] f ( f (4) ) dx /9 f ( f (4) ) dx K (t)dt nh n /9 (3) f dx O p(n c ) O(h 6 ) (4) and AED ( n, f ) = ( ) h8 t 4 K(t)dt 576 f ( f (4) ) dx K (t)dt nh f dx (5) where 0 < c < 8

3 Corollary. Assuming that the assumptions in Theorem hold with =. Then, we have ED ( f h, f ) = MS E( f h, f ) in that case Proof. R(g) = g(t) dt and µ 4 (K) = x 4 Kdx AED ( f h, f ) = AMS E( f h, f ) { } /9 9 h AMS E( f h, f ) = R(K) n /9 (6) (µ 4 (K)) R( f (4) ) f n = ( f n Bias( f ) With a random variable ξ = O p () whose expectation is 0 and variance, we can write f n as (see Kanazawa [0]) f n = f h f () f t K(t)dt h4 f (4) 4 f t 4 K(t)dt O(h 6 ) K(t) dt nh f / ξ O p (n / ) (7) Using the result of the corollary.6 [7], we have lim sup n x n c f (r) n f (r) = 0 with 0 < c < r 4 f n = f n Bias( f n ) = f n h f n () t K(t)dt = f n h f () t K(t)dt O(n c ) = f h f () t K(t)dt h4 f (4) t 4 K(t)dt O(h 6 ) K(t) / dt ξ O f 4 f nh f p (n / ) h f () t K(t)dt O(n c ) = f h4 f (4) t 4 K(t)dt O(h 6 ) K(t) / dt ξ O 4 f nh f p (n / ) O(n c ) Where the O(h 6 ) terms depend upon x. Using ( z) = z ( ) z O(z 3 ) f n = f h4 f (4) 4 f = f h 4 f (4) 4 f O p (n c ) O(h 6 ) ], t 4 K(t)dt O(h 6 ) K(t) dt nh f t 4 K(t)dt K(t) / dt ξ nh f 3 / ξ O p (n / ) O(n c ) ( ) h8 576 ( f (4) ) f ( t 4 K(t)dt) K(t) dt ξ nh f

4 and f n = f h4 f (4) t 4 K(t)dt O(h 6 ) K(t) / dt ξ O 4 f nh f p (n / ) O(n c ) = f h 4 f (4) [ ( ) t 4 K(t)dt K(t) / dt ξ 4 f nh f ( )( ) h8 ( f (4) ) ( t 4 K(t)dt) K(t) dt ξ 576 f nh f O p (n c ) O(h 6 )] D ( n, f ) = n dx h 4 f (4) f = f [ 4 ( ) h8 ( f (4) ) 576 f O p (n c ) O(h 6 )]dx f h 4 [ ( ) 4 ( )( ) h8 ( f (4) ) 576 f O p (n c ) O(h 6 )]dx f dx ( ) = f ( ) [ h8 576 n f dx ( ) t 4 K(t)dt K(t) dt nh f ( t 4 K(t)dt) f (4) f ( t 4 K(t)dt) K(t) dt ξ nh f t 4 K(t)dt ( t 4 K(t)dt) f dx / ξ K(t) dt nh f K(t) dt ξ nh f ( f (4) ) K(t) dt ξ f nh f O p(n c ) O(h 6 )]dx f ( )( ) [ h8 ( f (4) ) ( t 4 K(t)dt) K(t) dt ξ 576 f nh f O p(n c ) O(h 6 )]dx = f [( ) h8 ( f (4) ) ( t 4 K(t)dt) K(t) dt ξ 576 f nh f O p(n c ) O(h 6 )]dx = [ h ( t 4 K(t)dt) f ( f (4)) ] dx K (t)dt f dxξ O p (n c ) O(h 6 ) nh / ξ ED ( n, f ) = ( ) h8 t 4 K(t)dt 576 f ( f (4)) dx K (t)dt nh f dx O p(n c ) O(h 6 ) 4

5 3. The choice of the bandwidth h n this section, we describe bandwidth selection methods for the density estimator defined in (). These methods consist of adaptations of common automatic selectors for kernel density estimation. We propose two selection methods a Normal reference and the cross-validation method. The Normal reference bandwidth is based on estimating the infeasible optimal expression (6), in which the unknown element is R( f (4) ). 3.. Rule-of-thumb for bandwidth selection This method is based on the rule-of-thumb, Silverman [8], for complete data.the idea is to assume that the underlying distribution is normal, N(µ, σ), and in this situation so f (4) = ( f (4) ) x m = σ 0 e ( σ (9 ) 36 π f = σ π e ( x m σ ), σ 5 π e ( x m σ ) (3 6 ( x m σ ( x m σ ) 30 ( x m σ ) ( x m ) 4 ), σ ) 4 8 ( x m σ ) 6 ( x m ) 8 ), σ f ( f (4) ) dx = σ 7 (π) ( ). and f dx = (π) n that case the asymptotically optimal bandwidth h in Equation (3) becomes the normal reference bandwidth. h = h ED = = 7 R(K) f dx µ 4 (K) f ( f (4) ) dx 7 /9 R(K) (π) µ 4 (K) σ 7 (π) n /9 ( 9 ) /9 n /9 with σ being the standard deviation of f. For the Gaussian kernel, µ 4 (K) = 3 and R(K) = (4π) so that 4 4 h NR = π n /9 σ (8) for = 6 h NR = 86 π n /9 σ. (9) The standard deviation σ can be estimated by the sample standard deviation s or by the standardized interquartile range QR/.34 for robustness against outliers (.34 = Φ (3/4) Φ (/4)), but a better rule of thumb is (e.g., Silverman, 986, pp. 4547; Hrdle, 99, p. 9). 5

6 6 ĥ NR = 86 π n /9 σ, (0) with σ = min(s, QR/.34) 3.. Cross-validation The method previously defined is based on minimising estimations of the MS E, more precisely of the AMS E. This procedure relies on the minimisation of the S E (integrated squared error), the methodology is the same as in Rudemo [6] and Bowman [] applied to (). Let write: D ( n, f ) = n dx n f dx ( ) f dx, Note that ( ) f dx dz does not depend on h, so the minimisation of the SE is equivalent to minimise the following function: L(h) = D ( n, f ) f dx, ( ) = n dx n f dx, = n dx E ( n ). The principle of the least squares cross-validation method is to find an estimate of L(h) from the data and minimize it over h. Consider the estimator LS CV(h) = n dx n n h(i) (X i), with h(i) (X i) = h(n ) 4. Simulation nj i K ( X i X j ) h. n this section we evaluate the performance of the bandwidth selection procedures presented in Section. To this goal we have carried out a simulation study including rule-of-thumb (h NR ), cross-validation bandwidth (h LS CV ) and h for minimizing criterion -divergence (with =.5,.and.9). We consider various sets of experiments in which data are generated from the mixture of a Normal N(0, ) and Normal N(µ, σ) distributions. Hence, the DGP (Data Generating Process) is generated from m(µ, σ ) with the density i= m(µ, σ ) = 0.5N(0, ) 0.5N(µ, σ) () where µ = 0,, 5 and σ =, 0.5, 0.. One thousand Monte Carlo samples of size n are generated from the normal mixture model in Equation () for each combination of n = 50, 00, 700. The results of our different sets of experiments are presented in Tables -3. Table give the exhibits simulated relative efficiency RE(ĥ) = MS E( fĥms E )/MS E( fĥ) of the kernel estimator, using bandwidths ĥ NR, ĥ LS CV and ĥ, it is lower than, because the optimal bandwidth h MS E minimize MS E. Each bandwidth, mean E(ĥ) and mean relation error E(ĥ/h MS E ) are obtained, these values are given by respectively, Tables and 3. 6

7 - For all situations, each relative efficiency RE(ĥ) < because the optimal bandwidth h MS E minimizes the MS E. - The normal reference bandwidth hnr performswell if the true density is not very far from normal, such as the cases of (µ, σ) = (0, ), (0, 0.5), (, ), and (, 0.5). Otherwise, it usually has the smallest RE(ĥ) and largest E(ĥ), tending to oversmooth its kernel density estimate the most. - We have to remark that in table, the LSCV bandwidth hlscv needs a large sample size in order to be competitive. Note also that in Table, it is seen that E(ĥ LS CV ) is close to the optimal ĥ MS E, but the corresponding E(ĥ LS CV /ĥ MS E ) is large, which means that the bias of ĥ LS CV is small but its variation is large in Table The bandwidth ĥ seems to be the best existing bandwidth selectors. n most situations, it is indeed one of the best bandwidth selectors, However, it behaves very poorly for small σ (the true density curve is sharp). Table : RE ( ĥ ) for normal mixture f = 0.5φ 0.5φ σ (x µ) n ĥ NR ĥ LS CV ĥ D. CV ĥ D.5 CV ĥ D.9 CV µ = 0 σ = µ = 0 σ = µ = 0 σ = µ = σ = µ = σ = µ = σ = µ = 5 σ = µ = 5 σ = µ = 5 σ =

8 Table : E ( ĥ ) for normal mixture f = 0.5φ 0.5φ σ (x µ) n ĥ NR ĥ LS CV ĥ D. CV ĥ D.5 CV ĥ D.9 CV h MS E µ = 0 σ = µ = 0 σ = µ = 0 σ = µ = σ = µ = σ = µ = σ = µ = 5 σ = µ = 5 σ = µ = 5 σ =

9 Table 3: E ĥ/h MS E for normal mixture f = 0.5φ 0.5φ σ (x µ) n ĥ NR ĥ LS CV ĥ D. CV ĥ D.5 CV ĥ D.9 CV µ = 0 σ = µ = 0 σ = µ = 0 σ = µ = σ = µ = σ = µ = σ = µ = 5 σ = µ = 5 σ = µ = 5 σ =

10 Figure : Boxplots of the relative values Re forthe bandwidth selectors forestimation of the densities µ = 0,, 5 and σ =, 0.5, 0.. The sample size varies from 00 to 000. Figure compare, for densities with (µ = 0,, 5 and σ =,.5,.), the results of the five bandwidth selection NR, LS CV and D (discussed in Section 3), relatively to the results obtained by using the MS E optimal bandwidth (h MS E ). These figures present boxplots of the ratio RE(ĥ) = MS E( fĥms E )/MS E( fĥ) for each bandwidths ĥ NR, ĥ LS CV and ĥ D (with =.,.5 and.9). We see the LS CV and D (with =.5) methods gave overall the bests ratios across all simulations, and that this ratio was rather large in general. 5. llustration with real data Two examples are provided to demonstrate the performance of kernel density estimation with different bandwidths, where the Gaussian kernel is used. All of them are classical examples of unimodal and bimodal distributions, respectively. Example The first data set comprises the CO per capita in the year of 04. The data set can be downloaded from the world bank website. Figure shows the estimated density of CO per capita in the year of 008,, we using bandwidths ĥ NR =.38, ĥ LS CV = 0.439, ĥ.5 = 0.83, ĥ. = 0.93 and ĥ.9 = 0.54 The data set that the estimated density that was computed with the ĥ LS CV = and ĥ.9 bandwidths captures the peak that characterizes the mode, while the estimated density with the bandwidths that ĥ NR, ĥ.5 and ĥ. smoothes out this peak. This happens because the outliers at the tail of the distribution contribute to ĥ NR, ĥ.5 and ĥ. be larger than the than other bandwidths.. Example we use the time between eruptions set for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA (07 sample data, source: Silvermanl []). Figure 3 plot the data points and the kernel density estimates for old faithful geyser data, we using bandwidths ĥ NR = 0.44, ĥ LS CV = 0.6, ĥ.5 = 0.76, ĥ. = 0.8 and ĥ.9 = 0.0. An important point to note that the density curve for eruption length is similar to bimodal normal density (normal mixture). From our example we see that the h NB is always larger than the others bandwidths, he heavily oversmoothes its kernel density curve, underestimating the two peaks of the curve but overestimating the valley between them. About h LS CV, ĥ.5 and ĥ.9 seems to undersmooth the curve too much, overestimating the two peaks but underestimating for 0

11 Figure : Estimated density of CO per capita in 008 using the different bandwidths. D. (solid line); D.9 (dashed line); D.5 (dotted line); LS CV, least squares cross-validation (dotdash line) and NR, normal reference (longdash line). the valley. However ĥ. is proper bandwidth for their density estimate to be able to capture the feature of the true density curve. 6. Conclusions This paper proposed the method for bandwidth selection of bias reduction kernel density estimator, given in (). A various bandwidth selection strategies have been proposed such as normal reference h NR, least squares cross-validation h LS CV and h for minimizing criterion -divergence (with =.5,. and.9). The normal reference bandwidth h NR method is a simple and quick selector, but limited the practical use., since they are restricted to situations where a pre-specified family of densities is correctly selected. The LS CV do not provide a smooth density estimation, although asymptotically optimal, the finite sample behavior of h LS CV is disappointing for its variability and undersmoothing. We have attempted to evaluate choice the optimal bandwidth h LS CV and h NR, using -divergence. Compared to traditional bandwidth selection methods designed for kernel density estimation, our proposed D bandwidth selection method is always one of the best for having large RE(ĥ) and small E(ĥ/ĥ MS E ). Simulation studies showed that our proposed optimal bandwidth D method designed for kernel density estimation adapts to different situations, and out-performs other bandwidths. we conclude that the choice of the bandwidth based on the real data is consistent with the one based on simulations which is the D =. and.5 ) method gives us a smoother density estimation. References [] Basu, A., Harris,. R., Hjort N. L. and Jones M. C. (998). Robust and efficient estimation by minimising a density power divergence, Biometrika 85(3), [] Bowman A. W., (984). An alternative method of cross-validation for the smoothing of density estimates, Biometrika 7(), [3] Cao, R., Cuevas, A., and Gonalez-Manteiga, W. (994), A comparative study of several smoothing methods in density estimation?, Computational Statistics and Data Analysis, 7, 53?76. [4] Cichocki, A.; Zdunek, R.; Amari, S. (006) Csiszar?s divergences for nonnegative matrix factorization: Family of new algorithms. n Lecture Notes in Computer Science; Springer: Charleston, SC, USA, 3889, 3?39.

12 Figure 3: Estimated density of Old Faithful geyser using the different bandwidths. D. (solid line); D.9 (dashed line); D.5 (dotted line); LS CV, least squares cross-validation (dotdash line) and NR, normal reference (longdash line). [5] Dhaker H., Ngom P., Deme Elhadji and Mbodj M. New approach for bandwidth selection in the kernel density estimation based on - divergence, Journal of Mathematical Sciences: Advances and Applications. 5, 08, [6] Eguchi S. and Kano Y. (00) Robustifying maximum likelihood estimation. Technical report, nstitute of Statistical Mathematics, June. [7] Eugene F. S, (969), Estimation of a probability density function and its derivatives, The annals of Mathematical Statistics, 40(4), [8] Jones, M.C., Marron, J.S., and Sheather, S.J. (996), A brief survey of bandwidth selection for density estimation, Journal of the American Statistical Association, 9, [9] Jorgensen B. (997) The Theory of Dispersion Models. Chapman Hall/CRC Monographs on Statistics and Applied Probability. [0] Kanazawa Y. (993) Hellinger distance and Kullback-Leibler loss for the kernel density estimator, Statistics and Probability Letters 8(4), [] Mammen, E., Martinez-Miranda, M.D., Nielsen, J.P., and Sperlich, S. (0), Do-validation for kernel density estimatio?, Journal of the American Statistical Association, 06, [] Mammen, E., Martinez-Miranda, M.D., Nielsen, J.P., and Sperlich, S. (04), Further theoretical and practical insight to the do-validated bandwidth selector, Journal of the Korean Statistical Society, 43, [3] Minami, M.; Eguchi, S. (00) Robust blind source separation by Beta-divergence. Neural Comput. 4, [4] Mugdadi, A. R. and brahim A. A. (004), A bandwidth selection for kernel density estimation of functions of random variables, Computational Statistics and Data Analysis, 47(), 49-6 [5] Parzen, E. (96), On estimation of a probability density function and mode, Annals of Mathematical Statistics 33, [6] Rudemo M. (98) Empirical choice of histograms and kernel density estimators, Scandinavia Journal of Statistics 9(), [7] Scott, W. D (99), Multivariate density estimation theory, practice, and visualization, Wiley, New York. [8] Silverman, B. W. (986), Density estimation for statistics and data analysis, Chapman and Hall, London. [9] Turlach, B.A. (993), Bandwidth selection in kernel density estimation: A review, Technical report, Universite catholique de Louvain. [0] Wand, M. P. and M. C. Jones (995), Kernel Smoothing, Chapman and Hall, London, UK. [] Xie X. and Wu J. (04), Some mprovement on Convergence Rates of Kernel Density Estimator, Applied Mathematics, 04, 5,

O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O

O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O O Combining cross-validation and plug-in methods - for kernel density selection O Carlos Tenreiro CMUC and DMUC, University of Coimbra PhD Program UC UP February 18, 2011 1 Overview The nonparametric problem

More information

1 Introduction A central problem in kernel density estimation is the data-driven selection of smoothing parameters. During the recent years, many dier

1 Introduction A central problem in kernel density estimation is the data-driven selection of smoothing parameters. During the recent years, many dier Bias corrected bootstrap bandwidth selection Birgit Grund and Jorg Polzehl y January 1996 School of Statistics, University of Minnesota Technical Report No. 611 Abstract Current bandwidth selectors for

More information

J. Cwik and J. Koronacki. Institute of Computer Science, Polish Academy of Sciences. to appear in. Computational Statistics and Data Analysis

J. Cwik and J. Koronacki. Institute of Computer Science, Polish Academy of Sciences. to appear in. Computational Statistics and Data Analysis A Combined Adaptive-Mixtures/Plug-In Estimator of Multivariate Probability Densities 1 J. Cwik and J. Koronacki Institute of Computer Science, Polish Academy of Sciences Ordona 21, 01-237 Warsaw, Poland

More information

A NOTE ON THE CHOICE OF THE SMOOTHING PARAMETER IN THE KERNEL DENSITY ESTIMATE

A NOTE ON THE CHOICE OF THE SMOOTHING PARAMETER IN THE KERNEL DENSITY ESTIMATE BRAC University Journal, vol. V1, no. 1, 2009, pp. 59-68 A NOTE ON THE CHOICE OF THE SMOOTHING PARAMETER IN THE KERNEL DENSITY ESTIMATE Daniel F. Froelich Minnesota State University, Mankato, USA and Mezbahur

More information

Curve Fitting Re-visited, Bishop1.2.5

Curve Fitting Re-visited, Bishop1.2.5 Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the

More information

Boosting kernel density estimates: A bias reduction technique?

Boosting kernel density estimates: A bias reduction technique? Biometrika (2004), 91, 1, pp. 226 233 2004 Biometrika Trust Printed in Great Britain Boosting kernel density estimates: A bias reduction technique? BY MARCO DI MARZIO Dipartimento di Metodi Quantitativi

More information

ON SOME TWO-STEP DENSITY ESTIMATION METHOD

ON SOME TWO-STEP DENSITY ESTIMATION METHOD UNIVESITATIS IAGELLONICAE ACTA MATHEMATICA, FASCICULUS XLIII 2005 ON SOME TWO-STEP DENSITY ESTIMATION METHOD by Jolanta Jarnicka Abstract. We introduce a new two-step kernel density estimation method,

More information

Computation of an efficient and robust estimator in a semiparametric mixture model

Computation of an efficient and robust estimator in a semiparametric mixture model Journal of Statistical Computation and Simulation ISSN: 0094-9655 (Print) 1563-5163 (Online) Journal homepage: http://www.tandfonline.com/loi/gscs20 Computation of an efficient and robust estimator in

More information

ESTIMATORS IN THE CONTEXT OF ACTUARIAL LOSS MODEL A COMPARISON OF TWO NONPARAMETRIC DENSITY MENGJUE TANG A THESIS MATHEMATICS AND STATISTICS

ESTIMATORS IN THE CONTEXT OF ACTUARIAL LOSS MODEL A COMPARISON OF TWO NONPARAMETRIC DENSITY MENGJUE TANG A THESIS MATHEMATICS AND STATISTICS A COMPARISON OF TWO NONPARAMETRIC DENSITY ESTIMATORS IN THE CONTEXT OF ACTUARIAL LOSS MODEL MENGJUE TANG A THESIS IN THE DEPARTMENT OF MATHEMATICS AND STATISTICS PRESENTED IN PARTIAL FULFILLMENT OF THE

More information

Statistica Sinica Preprint No: SS

Statistica Sinica Preprint No: SS Statistica Sinica Preprint No: SS-017-0013 Title A Bootstrap Method for Constructing Pointwise and Uniform Confidence Bands for Conditional Quantile Functions Manuscript ID SS-017-0013 URL http://wwwstatsinicaedutw/statistica/

More information

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract Journal of Data Science,17(1). P. 145-160,2019 DOI:10.6339/JDS.201901_17(1).0007 WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION Wei Xiong *, Maozai Tian 2 1 School of Statistics, University of

More information

Robust Stochastic Frontier Analysis: a Minimum Density Power Divergence Approach

Robust Stochastic Frontier Analysis: a Minimum Density Power Divergence Approach Robust Stochastic Frontier Analysis: a Minimum Density Power Divergence Approach Federico Belotti Giuseppe Ilardi CEIS, University of Rome Tor Vergata Bank of Italy Workshop on the Econometrics and Statistics

More information

Confidence intervals for kernel density estimation

Confidence intervals for kernel density estimation Stata User Group - 9th UK meeting - 19/20 May 2003 Confidence intervals for kernel density estimation Carlo Fiorio c.fiorio@lse.ac.uk London School of Economics and STICERD Stata User Group - 9th UK meeting

More information

Nonparametric Methods

Nonparametric Methods Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania July 28, 2009 Michael R. Roberts Nonparametric Methods 1/42 Overview Great for data analysis

More information

Log-transform kernel density estimation of income distribution

Log-transform kernel density estimation of income distribution Log-transform kernel density estimation of income distribution by Arthur Charpentier Université du Québec à Montréal CREM & GERAD Département de Mathématiques 201, av. Président-Kennedy, PK-5151, Montréal,

More information

Nonparametric Modal Regression

Nonparametric Modal Regression Nonparametric Modal Regression Summary In this article, we propose a new nonparametric modal regression model, which aims to estimate the mode of the conditional density of Y given predictors X. The nonparametric

More information

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Han Liu John Lafferty Larry Wasserman Statistics Department Computer Science Department Machine Learning Department Carnegie Mellon

More information

Assignment # 1. STA 355 (K. Knight) February 8, (a) by the Central Limit Theorem. n(g( Xn ) g(µ)) by the delta method ( ) (b) i.

Assignment # 1. STA 355 (K. Knight) February 8, (a) by the Central Limit Theorem. n(g( Xn ) g(µ)) by the delta method ( ) (b) i. STA 355 (K. Knight) February 8, 2014 Assignment # 1 Saša Milić 997 410 626 1. (a) n( n ) d N (0, σ 2 ()) by the Central Limit Theorem n(g( n ) g()) d g (θ)n (0, σ 2 ()) by the delta method σ 2 ()g (θ)

More information

Nonparametric Density Estimation. October 1, 2018

Nonparametric Density Estimation. October 1, 2018 Nonparametric Density Estimation October 1, 2018 Introduction If we can t fit a distribution to our data, then we use nonparametric density estimation. Start with a histogram. But there are problems with

More information

Local linear multiple regression with variable. bandwidth in the presence of heteroscedasticity

Local linear multiple regression with variable. bandwidth in the presence of heteroscedasticity Local linear multiple regression with variable bandwidth in the presence of heteroscedasticity Azhong Ye 1 Rob J Hyndman 2 Zinai Li 3 23 January 2006 Abstract: We present local linear estimator with variable

More information

Local Polynomial Wavelet Regression with Missing at Random

Local Polynomial Wavelet Regression with Missing at Random Applied Mathematical Sciences, Vol. 6, 2012, no. 57, 2805-2819 Local Polynomial Wavelet Regression with Missing at Random Alsaidi M. Altaher School of Mathematical Sciences Universiti Sains Malaysia 11800

More information

Teruko Takada Department of Economics, University of Illinois. Abstract

Teruko Takada Department of Economics, University of Illinois. Abstract Nonparametric density estimation: A comparative study Teruko Takada Department of Economics, University of Illinois Abstract Motivated by finance applications, the objective of this paper is to assess

More information

Applications of nonparametric methods in economic and political science

Applications of nonparametric methods in economic and political science Applications of nonparametric methods in economic and political science Dissertation presented for the degree of Doctor rerum politicarum at the Faculty of Economic Sciences of the Georg-August-Universität

More information

Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta

Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta November 29, 2007 Outline Overview of Kernel Density

More information

Local Polynomial Modelling and Its Applications

Local Polynomial Modelling and Its Applications Local Polynomial Modelling and Its Applications J. Fan Department of Statistics University of North Carolina Chapel Hill, USA and I. Gijbels Institute of Statistics Catholic University oflouvain Louvain-la-Neuve,

More information

Local Polynomial Regression

Local Polynomial Regression VI Local Polynomial Regression (1) Global polynomial regression We observe random pairs (X 1, Y 1 ),, (X n, Y n ) where (X 1, Y 1 ),, (X n, Y n ) iid (X, Y ). We want to estimate m(x) = E(Y X = x) based

More information

Akaike Information Criterion to Select the Parametric Detection Function for Kernel Estimator Using Line Transect Data

Akaike Information Criterion to Select the Parametric Detection Function for Kernel Estimator Using Line Transect Data Journal of Modern Applied Statistical Methods Volume 12 Issue 2 Article 21 11-1-2013 Akaike Information Criterion to Select the Parametric Detection Function for Kernel Estimator Using Line Transect Data

More information

A Novel Nonparametric Density Estimator

A Novel Nonparametric Density Estimator A Novel Nonparametric Density Estimator Z. I. Botev The University of Queensland Australia Abstract We present a novel nonparametric density estimator and a new data-driven bandwidth selection method with

More information

Discussion Paper No. 28

Discussion Paper No. 28 Discussion Paper No. 28 Asymptotic Property of Wrapped Cauchy Kernel Density Estimation on the Circle Yasuhito Tsuruta Masahiko Sagae Asymptotic Property of Wrapped Cauchy Kernel Density Estimation on

More information

Preface. 1 Nonparametric Density Estimation and Testing. 1.1 Introduction. 1.2 Univariate Density Estimation

Preface. 1 Nonparametric Density Estimation and Testing. 1.1 Introduction. 1.2 Univariate Density Estimation Preface Nonparametric econometrics has become one of the most important sub-fields in modern econometrics. The primary goal of this lecture note is to introduce various nonparametric and semiparametric

More information

A Bayesian approach to parameter estimation for kernel density estimation via transformations

A Bayesian approach to parameter estimation for kernel density estimation via transformations A Bayesian approach to parameter estimation for kernel density estimation via transformations Qing Liu,, David Pitt 2, Xibin Zhang 3, Xueyuan Wu Centre for Actuarial Studies, Faculty of Business and Economics,

More information

Kullback-Leibler Designs

Kullback-Leibler Designs Kullback-Leibler Designs Astrid JOURDAN Jessica FRANCO Contents Contents Introduction Kullback-Leibler divergence Estimation by a Monte-Carlo method Design comparison Conclusion 2 Introduction Computer

More information

MEASUREMENT UNCERTAINTY AND SUMMARISING MONTE CARLO SAMPLES

MEASUREMENT UNCERTAINTY AND SUMMARISING MONTE CARLO SAMPLES XX IMEKO World Congress Metrology for Green Growth September 9 14, 212, Busan, Republic of Korea MEASUREMENT UNCERTAINTY AND SUMMARISING MONTE CARLO SAMPLES A B Forbes National Physical Laboratory, Teddington,

More information

Data-Based Choice of Histogram Bin Width. M. P. Wand. Australian Graduate School of Management. University of New South Wales.

Data-Based Choice of Histogram Bin Width. M. P. Wand. Australian Graduate School of Management. University of New South Wales. Data-Based Choice of Histogram Bin Width M. P. Wand Australian Graduate School of Management University of New South Wales 13th May, 199 Abstract The most important parameter of a histogram is the bin

More information

A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers

A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers 6th St.Petersburg Workshop on Simulation (2009) 1-3 A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers Ansgar Steland 1 Abstract Sequential kernel smoothers form a class of procedures

More information

Log-Transform Kernel Density Estimation of Income Distribution

Log-Transform Kernel Density Estimation of Income Distribution Log-Transform Kernel Density Estimation of Income Distribution Arthur Charpentier, Emmanuel Flachaire To cite this version: Arthur Charpentier, Emmanuel Flachaire. Log-Transform Kernel Density Estimation

More information

Kernel Density Estimation

Kernel Density Estimation Kernel Density Estimation and Application in Discriminant Analysis Thomas Ledl Universität Wien Contents: Aspects of Application observations: 0 Which distribution? 0?? 0.0 0. 0. 0. 0.0 0. 0. 0 0 0.0

More information

Bandwidth Selection in Nonparametric Kernel Estimation

Bandwidth Selection in Nonparametric Kernel Estimation Bandwidth Selection in Nonparametric Kernel Estimation Dissertation presented for the degree of Doctor rerum politicarum at the Faculty of Economic Sciences of the Georg-August-Universität Göttingen by

More information

Bandwidth selection for kernel conditional density

Bandwidth selection for kernel conditional density Bandwidth selection for kernel conditional density estimation David M Bashtannyk and Rob J Hyndman 1 10 August 1999 Abstract: We consider bandwidth selection for the kernel estimator of conditional density

More information

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Outline in High Dimensions Using the Rodeo Han Liu 1,2 John Lafferty 2,3 Larry Wasserman 1,2 1 Statistics Department, 2 Machine Learning Department, 3 Computer Science Department, Carnegie Mellon University

More information

Nonparametric Density Estimation

Nonparametric Density Estimation Nonparametric Density Estimation Econ 690 Purdue University Justin L. Tobias (Purdue) Nonparametric Density Estimation 1 / 29 Density Estimation Suppose that you had some data, say on wages, and you wanted

More information

Nonparametric Inference via Bootstrapping the Debiased Estimator

Nonparametric Inference via Bootstrapping the Debiased Estimator Nonparametric Inference via Bootstrapping the Debiased Estimator Yen-Chi Chen Department of Statistics, University of Washington ICSA-Canada Chapter Symposium 2017 1 / 21 Problem Setup Let X 1,, X n be

More information

A Perturbation Technique for Sample Moment Matching in Kernel Density Estimation

A Perturbation Technique for Sample Moment Matching in Kernel Density Estimation A Perturbation Technique for Sample Moment Matching in Kernel Density Estimation Arnab Maity 1 and Debapriya Sengupta 2 Abstract The fundamental idea of kernel smoothing technique can be recognized as

More information

Positive data kernel density estimation via the logkde package for R

Positive data kernel density estimation via the logkde package for R Positive data kernel density estimation via the logkde package for R Andrew T. Jones 1, Hien D. Nguyen 2, and Geoffrey J. McLachlan 1 which is constructed from the sample { i } n i=1. Here, K (x) is a

More information

Optimal Ridge Detection using Coverage Risk

Optimal Ridge Detection using Coverage Risk Optimal Ridge Detection using Coverage Risk Yen-Chi Chen Department of Statistics Carnegie Mellon University yenchic@andrew.cmu.edu Shirley Ho Department of Physics Carnegie Mellon University shirleyh@andrew.cmu.edu

More information

Econ 582 Nonparametric Regression

Econ 582 Nonparametric Regression Econ 582 Nonparametric Regression Eric Zivot May 28, 2013 Nonparametric Regression Sofarwehaveonlyconsideredlinearregressionmodels = x 0 β + [ x ]=0 [ x = x] =x 0 β = [ x = x] [ x = x] x = β The assume

More information

Double Kernel Method Using Line Transect Sampling

Double Kernel Method Using Line Transect Sampling AUSTRIAN JOURNAL OF STATISTICS Volume 41 (2012), Number 2, 95 103 Double ernel Method Using Line Transect Sampling Omar Eidous and M.. Shakhatreh Department of Statistics, Yarmouk University, Irbid, Jordan

More information

Kernel density estimation

Kernel density estimation Kernel density estimation Patrick Breheny October 18 Patrick Breheny STA 621: Nonparametric Statistics 1/34 Introduction Kernel Density Estimation We ve looked at one method for estimating density: histograms

More information

Maximum likelihood kernel density estimation: on the potential of convolution sieves

Maximum likelihood kernel density estimation: on the potential of convolution sieves Maximum likelihood kernel density estimation: on the potential of convolution sieves M.C. Jones a, and D.A. Henderson b a Department of Mathematics and Statistics, The Open University, Walton Hall, Milton

More information

Kernel Density Estimation: Theory and Application in Discriminant Analysis

Kernel Density Estimation: Theory and Application in Discriminant Analysis AUSTRIAN JOURNAL OF STATISTICS Volume 33 (2004), Number 3, 267-279 Kernel Density Estimation: Theory and Application in Discriminant Analysis Thomas Ledl Department of Statistics and Decision Support Systems,

More information

A PROBABILITY DENSITY FUNCTION ESTIMATION USING F-TRANSFORM

A PROBABILITY DENSITY FUNCTION ESTIMATION USING F-TRANSFORM K Y BERNETIKA VOLUM E 46 ( 2010), NUMBER 3, P AGES 447 458 A PROBABILITY DENSITY FUNCTION ESTIMATION USING F-TRANSFORM Michal Holčapek and Tomaš Tichý The aim of this paper is to propose a new approach

More information

Eruptions of the Old Faithful geyser

Eruptions of the Old Faithful geyser Eruptions of the Old Faithful geyser geyser2.ps Topics covered: Bimodal distribution. Data summary. Examination of univariate data. Identifying subgroups. Prediction. Key words: Boxplot. Histogram. Mean.

More information

CV-NP BAYESIANISM BY MCMC. Cross Validated Non Parametric Bayesianism by Markov Chain Monte Carlo CARLOS C. RODRIGUEZ

CV-NP BAYESIANISM BY MCMC. Cross Validated Non Parametric Bayesianism by Markov Chain Monte Carlo CARLOS C. RODRIGUEZ CV-NP BAYESIANISM BY MCMC Cross Validated Non Parametric Bayesianism by Markov Chain Monte Carlo CARLOS C. RODRIGUE Department of Mathematics and Statistics University at Albany, SUNY Albany NY 1, USA

More information

Nonparametric confidence intervals. for receiver operating characteristic curves

Nonparametric confidence intervals. for receiver operating characteristic curves Nonparametric confidence intervals for receiver operating characteristic curves Peter G. Hall 1, Rob J. Hyndman 2, and Yanan Fan 3 5 December 2003 Abstract: We study methods for constructing confidence

More information

arxiv: v2 [stat.me] 13 Sep 2007

arxiv: v2 [stat.me] 13 Sep 2007 Electronic Journal of Statistics Vol. 0 (0000) ISSN: 1935-7524 DOI: 10.1214/154957804100000000 Bandwidth Selection for Weighted Kernel Density Estimation arxiv:0709.1616v2 [stat.me] 13 Sep 2007 Bin Wang

More information

Smooth nonparametric estimation of a quantile function under right censoring using beta kernels

Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Chanseok Park 1 Department of Mathematical Sciences, Clemson University, Clemson, SC 29634 Short Title: Smooth

More information

Bayesian Semiparametric GARCH Models

Bayesian Semiparametric GARCH Models Bayesian Semiparametric GARCH Models Xibin (Bill) Zhang and Maxwell L. King Department of Econometrics and Business Statistics Faculty of Business and Economics xibin.zhang@monash.edu Quantitative Methods

More information

Kernel density estimation with doubly truncated data

Kernel density estimation with doubly truncated data Kernel density estimation with doubly truncated data Jacobo de Uña-Álvarez jacobo@uvigo.es http://webs.uvigo.es/jacobo (joint with Carla Moreira) Department of Statistics and OR University of Vigo Spain

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.1 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

Estimation of cumulative distribution function with spline functions

Estimation of cumulative distribution function with spline functions INTERNATIONAL JOURNAL OF ECONOMICS AND STATISTICS Volume 5, 017 Estimation of cumulative distribution function with functions Akhlitdin Nizamitdinov, Aladdin Shamilov Abstract The estimation of the cumulative

More information

A NOTE ON TESTING THE SHOULDER CONDITION IN LINE TRANSECT SAMPLING

A NOTE ON TESTING THE SHOULDER CONDITION IN LINE TRANSECT SAMPLING A NOTE ON TESTING THE SHOULDER CONDITION IN LINE TRANSECT SAMPLING Shunpu Zhang Department of Mathematical Sciences University of Alaska, Fairbanks, Alaska, U.S.A. 99775 ABSTRACT We propose a new method

More information

Kernel density estimation for heavy-tailed distributions using the champernowne transformation

Kernel density estimation for heavy-tailed distributions using the champernowne transformation Statistics, Vol. 39, No. 6, December 2005, 503 518 Kernel density estimation for heavy-tailed distributions using the champernowne transformation TINE BUCH-LARSEN, JENS PERCH NIELSEN, MONTSERRAT GUILLÉN*

More information

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Robert V. Breunig Centre for Economic Policy Research, Research School of Social Sciences and School of

More information

unit interval Gery Geenens School of Mathematics and Statistics, University of New South Wales, Sydney, Australia Abstract

unit interval Gery Geenens School of Mathematics and Statistics, University of New South Wales, Sydney, Australia Abstract Probit transformation for kernel density estimation on the unit interval Gery Geenens School of Mathematics and Statistics, ariv:1303.4121v1 [stat.me] 17 Mar 2013 University of New South Wales, Sydney,

More information

Nonparametric Density Estimation

Nonparametric Density Estimation Nonparametric Density Estimation Advanced Econometrics Douglas G. Steigerwald UC Santa Barbara D. Steigerwald (UCSB) Density Estimation 1 / 20 Overview Question of interest has wage inequality among women

More information

Smooth simultaneous confidence bands for cumulative distribution functions

Smooth simultaneous confidence bands for cumulative distribution functions Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang

More information

Density Estimation (II)

Density Estimation (II) Density Estimation (II) Yesterday Overview & Issues Histogram Kernel estimators Ideogram Today Further development of optimization Estimating variance and bias Adaptive kernels Multivariate kernel estimation

More information

A New Method for Varying Adaptive Bandwidth Selection

A New Method for Varying Adaptive Bandwidth Selection IEEE TRASACTIOS O SIGAL PROCESSIG, VOL. 47, O. 9, SEPTEMBER 1999 2567 TABLE I SQUARE ROOT MEA SQUARED ERRORS (SRMSE) OF ESTIMATIO USIG THE LPA AD VARIOUS WAVELET METHODS A ew Method for Varying Adaptive

More information

Classification via kernel regression based on univariate product density estimators

Classification via kernel regression based on univariate product density estimators Classification via kernel regression based on univariate product density estimators Bezza Hafidi 1, Abdelkarim Merbouha 2, and Abdallah Mkhadri 1 1 Department of Mathematics, Cadi Ayyad University, BP

More information

STATISTICAL COMPUTING AND GRAPHICS

STATISTICAL COMPUTING AND GRAPHICS STATISTICAL COMPUTING AND GRAPHICS Data-Based Choice of Histogram Bin Width The most important parameter of a histogram is the bin width because it controls the tradeoff between presenting a picture with

More information

Log-Density Estimation with Application to Approximate Likelihood Inference

Log-Density Estimation with Application to Approximate Likelihood Inference Log-Density Estimation with Application to Approximate Likelihood Inference Martin Hazelton 1 Institute of Fundamental Sciences Massey University 19 November 2015 1 Email: m.hazelton@massey.ac.nz WWPMS,

More information

Regression Analysis for Data Containing Outliers and High Leverage Points

Regression Analysis for Data Containing Outliers and High Leverage Points Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain

More information

LQ-Moments for Statistical Analysis of Extreme Events

LQ-Moments for Statistical Analysis of Extreme Events Journal of Modern Applied Statistical Methods Volume 6 Issue Article 5--007 LQ-Moments for Statistical Analysis of Extreme Events Ani Shabri Universiti Teknologi Malaysia Abdul Aziz Jemain Universiti Kebangsaan

More information

Bayesian Semiparametric GARCH Models

Bayesian Semiparametric GARCH Models Bayesian Semiparametric GARCH Models Xibin (Bill) Zhang and Maxwell L. King Department of Econometrics and Business Statistics Faculty of Business and Economics xibin.zhang@monash.edu Quantitative Methods

More information

A Note on Auxiliary Particle Filters

A Note on Auxiliary Particle Filters A Note on Auxiliary Particle Filters Adam M. Johansen a,, Arnaud Doucet b a Department of Mathematics, University of Bristol, UK b Departments of Statistics & Computer Science, University of British Columbia,

More information

Package OSCV. March 18, 2017

Package OSCV. March 18, 2017 Title One-Sided Cross-Validation Version 1.0 Author Olga Savchuk Package OSCV March 18, 2017 Maintainer Olga Savchuk Functions for implementing different versions of the OSCV

More information

Department of Statistics, School of Mathematical Sciences, Ferdowsi University of Mashhad, Iran.

Department of Statistics, School of Mathematical Sciences, Ferdowsi University of Mashhad, Iran. JIRSS (2012) Vol. 11, No. 2, pp 191-202 A Goodness of Fit Test For Exponentiality Based on Lin-Wong Information M. Abbasnejad, N. R. Arghami, M. Tavakoli Department of Statistics, School of Mathematical

More information

Modelling Non-linear and Non-stationary Time Series

Modelling Non-linear and Non-stationary Time Series Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series Analysis September 206 Henrik Madsen (02427 Adv. TS Analysis) Lecture Notes September

More information

Smooth functions and local extreme values

Smooth functions and local extreme values Smooth functions and local extreme values A. Kovac 1 Department of Mathematics University of Bristol Abstract Given a sample of n observations y 1,..., y n at time points t 1,..., t n we consider the problem

More information

Mathematical Foundations of the Generalization of t-sne and SNE for Arbitrary Divergences

Mathematical Foundations of the Generalization of t-sne and SNE for Arbitrary Divergences MACHINE LEARNING REPORTS Mathematical Foundations of the Generalization of t-sne and SNE for Arbitrary Report 02/2010 Submitted: 01.04.2010 Published:26.04.2010 T. Villmann and S. Haase University of Applied

More information

Function of Longitudinal Data

Function of Longitudinal Data New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data Weixin Yao and Runze Li Abstract This paper develops a new estimation of nonparametric regression functions for

More information

Likelihood Cross-Validation Versus Least Squares Cross- Validation for Choosing the Smoothing Parameter in Kernel Home-Range Analysis

Likelihood Cross-Validation Versus Least Squares Cross- Validation for Choosing the Smoothing Parameter in Kernel Home-Range Analysis Research Article Likelihood Cross-Validation Versus Least Squares Cross- Validation for Choosing the Smoothing Parameter in Kernel Home-Range Analysis JON S. HORNE, 1 University of Idaho, Department of

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

Independent and conditionally independent counterfactual distributions

Independent and conditionally independent counterfactual distributions Independent and conditionally independent counterfactual distributions Marcin Wolski European Investment Bank M.Wolski@eib.org Society for Nonlinear Dynamics and Econometrics Tokyo March 19, 2018 Views

More information

Average Reward Parameters

Average Reward Parameters Simulation-Based Optimization of Markov Reward Processes: Implementation Issues Peter Marbach 2 John N. Tsitsiklis 3 Abstract We consider discrete time, nite state space Markov reward processes which depend

More information

Midwest Big Data Summer School: Introduction to Statistics. Kris De Brabanter

Midwest Big Data Summer School: Introduction to Statistics. Kris De Brabanter Midwest Big Data Summer School: Introduction to Statistics Kris De Brabanter kbrabant@iastate.edu Iowa State University Department of Statistics Department of Computer Science June 20, 2016 1/27 Outline

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

Integral approximation by kernel smoothing

Integral approximation by kernel smoothing Integral approximation by kernel smoothing François Portier Université catholique de Louvain - ISBA August, 29 2014 In collaboration with Bernard Delyon Topic of the talk: Given ϕ : R d R, estimation of

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

arxiv: v1 [stat.me] 11 Sep 2007

arxiv: v1 [stat.me] 11 Sep 2007 Electronic Journal of Statistics Vol. 0 (0000) ISSN: 1935-7524 arxiv:0709.1616v1 [stat.me] 11 Sep 2007 Bandwidth Selection for Weighted Kernel Density Estimation Bin Wang Mathematics and Statistics Department,

More information

Local Modal Regression

Local Modal Regression Local Modal Regression Weixin Yao Department of Statistics, Kansas State University, Manhattan, Kansas 66506, U.S.A. wxyao@ksu.edu Bruce G. Lindsay and Runze Li Department of Statistics, The Pennsylvania

More information

Adaptive Kernel Estimation of The Hazard Rate Function

Adaptive Kernel Estimation of The Hazard Rate Function Adaptive Kernel Estimation of The Hazard Rate Function Raid Salha Department of Mathematics, Islamic University of Gaza, Palestine, e-mail: rbsalha@mail.iugaza.edu Abstract In this paper, we generalized

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning

More information

Web-based Supplementary Material for. Dependence Calibration in Conditional Copulas: A Nonparametric Approach

Web-based Supplementary Material for. Dependence Calibration in Conditional Copulas: A Nonparametric Approach 1 Web-based Supplementary Material for Dependence Calibration in Conditional Copulas: A Nonparametric Approach Elif F. Acar, Radu V. Craiu, and Fang Yao Web Appendix A: Technical Details The score and

More information

On variable bandwidth kernel density estimation

On variable bandwidth kernel density estimation JSM 04 - Section on Nonparametric Statistics On variable bandwidth kernel density estimation Janet Nakarmi Hailin Sang Abstract In this paper we study the ideal variable bandwidth kernel estimator introduced

More information

Chapter 1: Introduction. Material from Devore s book (Ed 8), and Cengagebrain.com

Chapter 1: Introduction. Material from Devore s book (Ed 8), and Cengagebrain.com 1 Chapter 1: Introduction Material from Devore s book (Ed 8), and Cengagebrain.com Populations and Samples An investigation of some characteristic of a population of interest. Example: Say you want to

More information

Empirical Density Estimation for Interval Censored Data

Empirical Density Estimation for Interval Censored Data AUSTRIAN JOURNAL OF STATISTICS Volume 37 (2008), Number 1, 119 128 Empirical Density Estimation for Interval Censored Data Eugenia Stoimenova Bulgarian Academy of Sciences, Sofia Abstract: This paper is

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

University, Tempe, Arizona, USA b Department of Mathematics and Statistics, University of New. Mexico, Albuquerque, New Mexico, USA

University, Tempe, Arizona, USA b Department of Mathematics and Statistics, University of New. Mexico, Albuquerque, New Mexico, USA This article was downloaded by: [University of New Mexico] On: 27 September 2012, At: 22:13 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered

More information

OPTIMAL POINTWISE ADAPTIVE METHODS IN NONPARAMETRIC ESTIMATION 1

OPTIMAL POINTWISE ADAPTIVE METHODS IN NONPARAMETRIC ESTIMATION 1 The Annals of Statistics 1997, Vol. 25, No. 6, 2512 2546 OPTIMAL POINTWISE ADAPTIVE METHODS IN NONPARAMETRIC ESTIMATION 1 By O. V. Lepski and V. G. Spokoiny Humboldt University and Weierstrass Institute

More information