A new method of nonparametric density estimation

Size: px
Start display at page:

Download "A new method of nonparametric density estimation"

Transcription

1 A new method of nonparametric density estimation Andrey Pepelyshev Cardi December 7, /32 A. Pepelyshev A new method of nonparametric density estimation

2 Contents Introduction A new density estimate Selection of a smoothing parameter Application in quality control 2/32 A. Pepelyshev A new method of nonparametric density estimation

3 Introduction Let (x 1,..., x m ) be a sample from a distribution F (x) with density p(x) = F (x). The problem is to estimate p(x) and F (x). 3/32 A. Pepelyshev A new method of nonparametric density estimation

4 Empirical distribution function The EDF F m (x) is the minimum-variance unbiased nonparametric estimate of F (x). F m (x) = 1 m m 1 [xi, )(x) i=1 4/32 A. Pepelyshev A new method of nonparametric density estimation

5 Kernel estimation The kernel density estimate is ˆp m,h (x) = 1 mh m i=1 K( x x i h ) where h is the bandwidth and the kernel K is a continuous function, K(x)dx = 1. The popular choice of K(x) is the Gaussian kernel K(x) = 1 2π e x2 /2. 5/32 A. Pepelyshev A new method of nonparametric density estimation

6 Kernel estimation The corresponding estimate of F (x) have the form ˆF (x) = (K F m )(x) = K(x z)f m (z)dz since the equality x K(y z)dy = z K(y x)dy holds for all x and z if K(x) is a symmetric kernel. 6/32 A. Pepelyshev A new method of nonparametric density estimation

7 Bandwidth selection An optimal bandwidth minimizes a risk function, e.g. MISE(h) = E (p(x) ˆp m,h (x)) 2 dx. The LSCV-bandwidth minimizes Ψ(h) = ˆp m,h (x) 2 dx 2 m m i=1 ˆp [x 1,...,x i 1,x i+1,...,x n ] m 1,h (x i ) 7/32 A. Pepelyshev A new method of nonparametric density estimation

8 A new estimate A new estimate p(x) = d dx F (x) The aim is to obtain a smoothed EDF which can be dierentiated. 8/32 A. Pepelyshev A new method of nonparametric density estimation

9 A new estimate A new estimate Consider values of the EDF at equidistant points as a time series. Apply Singular Spectrum Analysis (SSA) for its smoothing. Golyandina N., Nekrutkin V., Zhigljavsky A. (2001) Analysis of Time Series Structure: SSA and Related Techniques. London: Chapman & Hall/CRC. 9/32 A. Pepelyshev A new method of nonparametric density estimation

10 A new estimate The SSA procedure for a CDF series Transform a series (f 1,..., f N ) to a matrix X as X=(x i,j )=(f i+j 1 ), i=1,..., L, j =1,..., N L+1. Make the SVD decomposition X = L i=1 λi U i V T i, λ 1 λ X = f 1 f 2 f 3 f N L+1 f 2 f 3 f 4 f N L+2 f 3 f 4 f 5 f N L+3... f L f L+1 f L+2 f N. 10/32 A. Pepelyshev A new method of nonparametric density estimation

11 A new estimate The SSA procedure for a CDF series Transform a series (f 1,..., f N ) to a matrix X as X=(x i,j )=(f i+j 1 ), i=1,..., L, j =1,..., N L+1. Make the SVD decomposition X = L λi U i V T i, λ 1 λ i=1 Extract the r leading components X (r) = ( x (r) i,j ) = r i=1 λi U i V T i. Transform X (r) back to the series ˆf j = ˆf j (L, r) = 0 1 j < L, 1 L x (r) k,j k+1 L L j K, k=1 1 K < j N, 11/32 A. Pepelyshev A new method of nonparametric density estimation

12 A new estimate Representation of the SSA procedure ˆf j = L i=1 L (u 1,i u 1,l u r,i u r,l )f j+i l /L l=1 for j = L,..., K, where (u l,1,..., u l,l ) T = U l. This is a data-adaptive lter. L is half of the length of the lter. r is a complexity of the lter. L controls the smoothness. 12/32 A. Pepelyshev A new method of nonparametric density estimation

13 A new estimate Selection of points t j for time series Choose δ such that and dene t j as f j = F m (t j ), t j+1 t j = δ δ 1 m max x p(x) t j = δ(j 2L) + min x i, i=1,...,m j = 1,..., N, N = 2L max + 4L, maxi x i min i x i L max =. 2δ 13/32 A. Pepelyshev A new method of nonparametric density estimation

14 A new estimate Inuence of the parameter L Measurements of Lean Body Mass from the Australian Institute of Sport Data The SSA 1c estimate (i.e. with r = 1) of the distribution function and the density for L=20,..., 60 14/32 A. Pepelyshev A new method of nonparametric density estimation

15 A new estimate Inuence of the parameter L Measurements of Lean Body Mass from the Australian Institute of Sport Data The SSA 1c estimate (i.e. with r = 1) of the distribution function and the density for L=20,..., 60 15/32 A. Pepelyshev A new method of nonparametric density estimation

16 A new estimate Specic SSA estimates SSA 1c : the SSA estimate with r = 1 SSA 2c : the SSA estimate with r = 2 SSA b : the SSA estimate with r = 1 and bias correction 16/32 A. Pepelyshev A new method of nonparametric density estimation

17 A new estimate Validity of the SSA procedure The estimated series ( ˆf 1,..., ˆf N ) can be transformed to the SSA estimate of F (x) as follows ˆF m (x) = N ( j=2 ˆf j 1 + ( ˆf j ˆf j 1 ) x t ) j 1 1 [tj 1,t t j t j )(x) j 1 +1 [tn, )(x), Lemma. The SSA 1c estimate is a valid distribution function. 17/32 A. Pepelyshev A new method of nonparametric density estimation

18 A new estimate The SSA b estimate Let S(F ) be a lter (weighted moving average). Our case: S(F ) is the SSA 1c estimate. The problem: this lter has a bias. The idea is to correct S(F ) as an estimate F by use of S(S(F )) and S(S(S(F ))). S b (F) = 3S(F) 3S 2 (F) + S 3 (F) 18/32 A. Pepelyshev A new method of nonparametric density estimation

19 A new estimate Simulation results for dierent L The performance of SSA estimates for samples of size 100, the replication number is ED ISE ED KS ED H model 0.4N(0, 1) + 0.6N(5, 2 2 ) Kernel est. with h LSCV SSA 1c est. with L= SSA 1c est. with L= SSA 1c est. with L= SSA 2c est. with L= SSA 2c est. with L= SSA 2c est. with L= SSA b est. with L= SSA b est. with L= SSA b est. with L= /32 A. Pepelyshev A new method of nonparametric density estimation

20 The automatic choice of a smoothing parameter How to choose L automatically A law of the iterated logarithm. Glivenko-Cantelli result The empirical distribution function F m (x) satises almost surely, where R m = lim sup R Fm m (x) F (x) 1 m 2 m 2 ln ln m. If an estimate is far from the EDF, then this estimate is not good. 20/32 A. Pepelyshev A new method of nonparametric density estimation

21 The automatic choice of a smoothing parameter How to choose L automatically 1. Compute L=max { L : F m ˆF m (l) } 1/R m l {1,..., L}. 2. Compute the sequence M 1,..., M L, where M j is the number of modes of estimated density for L = j. This sequence has decrasing tendency. 3. Compute ˇM j = min{m 1, M 2,..., M j }. 21/32 A. Pepelyshev A new method of nonparametric density estimation

22 The automatic choice of a smoothing parameter How to choose L automatically 4. Divide the set {1, 2,..., L} into groups such that the values ˇM j are equal to each other within each group, i.e. dene a 1, b 1,..., a k, b k and k such that 1 = a 1 b 1 <... < a k b k = L, a i+1 = b i + 1 and ˇM j = ˇM j for all i, j {a l,..., b l }, l {1,..., k}. 22/32 A. Pepelyshev A new method of nonparametric density estimation

23 The automatic choice of a smoothing parameter How to choose L automatically 5. Finally, compute L a (a 'best' value of L) as an average with weight coecients, which are proportional to the sizes of these k groups, namely k L a = c i w i, where i=1 c i = γ i a i + (1 γ i )b i, w i = b i a i k j=1 b j a j, γ i = 1/2 if ˇMai = 1 and γ i = 0.9 otherwise. 23/32 A. Pepelyshev A new method of nonparametric density estimation

24 The automatic choice of a smoothing parameter How to choose the bandwidth 1. Compute h=max { h : Fm ˆF m ( ) } 1/R m (0, h). 2. Dene a dense set {h 1,..., h n } (0, h) and compute the sequence M 1,..., M n, where M j be the number of modes of estimated density for h = h j. 3. Compute ˇM j = min{m 1, M 2,..., M j }. 4. Divide the set {h 1,..., h n } into groups as earlier. 5. Compute h a = k i=1 c iw i, where c i and w i are dened in the same manner. 24/32 A. Pepelyshev A new method of nonparametric density estimation

25 The automatic choice of a smoothing parameter Consistency Lemma. The kernel estimate with h a and any SSA estimate with L a of the distribution function are consistent. 25/32 A. Pepelyshev A new method of nonparametric density estimation

26 The automatic choice of a smoothing parameter Simulation study Consider several models of the density p(x) and simulate samples of size 100. D ISE (ˆp, p) = (ˆp(x) p(x) ) 2dx D KS ( ˆF, F ) = ˆF F = max ˆF (x) F (x) x ( ) 2dx D H (ˆp, p) = ˆp(x) p(x) 26/32 A. Pepelyshev A new method of nonparametric density estimation

27 The automatic choice of a smoothing parameter Results on bandwidth selection ED ISE ED KS ED H model N(0, 1) Kernel est. with h LSCV Kernel est. with h SJPI Kernel est. with h ICV Kernel est. with h a model 0.4N(0, 1) + 0.6N(5, 2 2 ) Kernel est. with h LSCV Kernel est. with h SJPI Kernel est. with h ICV Kernel est. with h a model 0.3N(0, 1) + 0.7N(15, 4 2 ) Kernel est. with h LSCV Kernel est. with h SJPI Kernel est. with h ICV Kernel est. with h a /32 A. Pepelyshev A new method of nonparametric density estimation

28 The automatic choice of a smoothing parameter Results on the SSA estimates ED ISE ED KS ED H EL a model N(0, 1) Kernel est. with h a SSA 1c est SSA 2c est SSA b est model 0.4N(0, 1) + 0.6N(5, 2 2 ) Kernel est. with h a SSA 1c est SSA 2c est SSA b est model 0.3N(0, 1) + 0.7N(15, 4 2 ) Kernel est. with h a SSA 1c est SSA 2c est SSA 3c est SSA b est /32 A. Pepelyshev A new method of nonparametric density estimation

29 Application Application in photovoltaics In quality control, acceptance sampling is used to determine whether to accept or reject a large production lot by checking a small number of items. A sampling plan consists of the number n of items to be measured and the critical value c such that a lot is rejected if an appropriate statistic is larger than c. 29/32 A. Pepelyshev A new method of nonparametric density estimation

30 Application in photovoltaics Sampling plans for quality control Let measurements be distributed according to a distribution G(x) with mean a and variance σ 2. The asymptotically optimal sampling plan (n, c) is ( Φ 1 (α) Φ 1 (1 β) ) 2 n = ( F 1 (AQL) F 1 (RQL) ) 2, n ( c = F 1 (AQL) + F 1 (RQL) ), 2 where F (x) = G((x a)/σ), AQL is the acceptable quality level, RQL is the rejectable quality level, α is the producer risk and β is the consumer risk. 30/32 A. Pepelyshev A new method of nonparametric density estimation

31 Application in photovoltaics Comparison of sampling plans Means and standard deviations of sampling plan size using the empirical distribution function, the kernel density estimate and the SSA 1c, SSA b and SSA 2c estimates for samples of size m from 0.4N(0, 1) + 0.6N(5, 2 2 ). The true sampling plan size is 392. m = 250 m = 500 m = 1000 EDF 469.7(468.9) 462.7(278.9) 422.0(160.1) Kernel h LSCV 322.3(102.8) 324.0(79.7) 337.8(59.2) Kernel h a 302.1(77.2) 321.2(65.2) 339.1(52.8) SSA 1c est (71.8) 316.3(61.3) 329.1(46.9) SSA b est (112.9) 408.9(88.8) 403.6(67.7) SSA 2c est (107.8) 393.1(81.4) 392.8(62.1) 31/32 A. Pepelyshev A new method of nonparametric density estimation

32 Application in photovoltaics Thank you for your attention! 32/32 A. Pepelyshev A new method of nonparametric density estimation

Estimation of the quantile function using Bernstein-Durrmeyer polynomials

Estimation of the quantile function using Bernstein-Durrmeyer polynomials Estimation of the quantile function using Bernstein-Durrmeyer polynomials Andrey Pepelyshev Ansgar Steland Ewaryst Rafaj lowic The work is supported by the German Federal Ministry of the Environment, Nature

More information

1 Introduction A central problem in kernel density estimation is the data-driven selection of smoothing parameters. During the recent years, many dier

1 Introduction A central problem in kernel density estimation is the data-driven selection of smoothing parameters. During the recent years, many dier Bias corrected bootstrap bandwidth selection Birgit Grund and Jorg Polzehl y January 1996 School of Statistics, University of Minnesota Technical Report No. 611 Abstract Current bandwidth selectors for

More information

Density and Distribution Estimation

Density and Distribution Estimation Density and Distribution Estimation Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Density

More information

A Novel Nonparametric Density Estimator

A Novel Nonparametric Density Estimator A Novel Nonparametric Density Estimator Z. I. Botev The University of Queensland Australia Abstract We present a novel nonparametric density estimator and a new data-driven bandwidth selection method with

More information

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring STAT 6385 Survey of Nonparametric Statistics Order Statistics, EDF and Censoring Quantile Function A quantile (or a percentile) of a distribution is that value of X such that a specific percentage of the

More information

2 Tikhonov Regularization and ERM

2 Tikhonov Regularization and ERM Introduction Here we discusses how a class of regularization methods originally designed to solve ill-posed inverse problems give rise to regularized learning algorithms. These algorithms are kernel methods

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Nonparametric Inference via Bootstrapping the Debiased Estimator

Nonparametric Inference via Bootstrapping the Debiased Estimator Nonparametric Inference via Bootstrapping the Debiased Estimator Yen-Chi Chen Department of Statistics, University of Washington ICSA-Canada Chapter Symposium 2017 1 / 21 Problem Setup Let X 1,, X n be

More information

A tailor made nonparametric density estimate

A tailor made nonparametric density estimate A tailor made nonparametric density estimate Daniel Carando 1, Ricardo Fraiman 2 and Pablo Groisman 1 1 Universidad de Buenos Aires 2 Universidad de San Andrés School and Workshop on Probability Theory

More information

Introduction to Curve Estimation

Introduction to Curve Estimation Introduction to Curve Estimation Density 0.000 0.002 0.004 0.006 700 800 900 1000 1100 1200 1300 Wilcoxon score Michael E. Tarter & Micheal D. Lock Model-Free Curve Estimation Monographs on Statistics

More information

Advanced Statistics II: Non Parametric Tests

Advanced Statistics II: Non Parametric Tests Advanced Statistics II: Non Parametric Tests Aurélien Garivier ParisTech February 27, 2011 Outline Fitting a distribution Rank Tests for the comparison of two samples Two unrelated samples: Mann-Whitney

More information

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy Computation Of Asymptotic Distribution For Semiparametric GMM Estimators Hidehiko Ichimura Graduate School of Public Policy and Graduate School of Economics University of Tokyo A Conference in honor of

More information

NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION

NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION R. HASHEMI, S. REZAEI AND L. AMIRI Department of Statistics, Faculty of Science, Razi University, 67149, Kermanshah, Iran. ABSTRACT

More information

Nonparametric Density Estimation. October 1, 2018

Nonparametric Density Estimation. October 1, 2018 Nonparametric Density Estimation October 1, 2018 Introduction If we can t fit a distribution to our data, then we use nonparametric density estimation. Start with a histogram. But there are problems with

More information

Maximum Mean Discrepancy

Maximum Mean Discrepancy Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia

More information

Automatic algorithms

Automatic algorithms Automatic algorithms Function approximation using its series decomposition in a basis Lluís Antoni Jiménez Rugama May 30, 2013 Index 1 Introduction Environment Linking H in Example and H out 2 Our algorithms

More information

Modelling Non-linear and Non-stationary Time Series

Modelling Non-linear and Non-stationary Time Series Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series Analysis September 206 Henrik Madsen (02427 Adv. TS Analysis) Lecture Notes September

More information

COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017

COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017 COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University PRINCIPAL COMPONENT ANALYSIS DIMENSIONALITY

More information

Goodness-of-fit tests for the cure rate in a mixture cure model

Goodness-of-fit tests for the cure rate in a mixture cure model Biometrika (217), 13, 1, pp. 1 7 Printed in Great Britain Advance Access publication on 31 July 216 Goodness-of-fit tests for the cure rate in a mixture cure model BY U.U. MÜLLER Department of Statistics,

More information

Automatic Singular Spectrum Analysis and Forecasting Michele Trovero, Michael Leonard, and Bruce Elsheimer SAS Institute Inc.

Automatic Singular Spectrum Analysis and Forecasting Michele Trovero, Michael Leonard, and Bruce Elsheimer SAS Institute Inc. ABSTRACT Automatic Singular Spectrum Analysis and Forecasting Michele Trovero, Michael Leonard, and Bruce Elsheimer SAS Institute Inc., Cary, NC, USA The singular spectrum analysis (SSA) method of time

More information

Computer simulation on homogeneity testing for weighted data sets used in HEP

Computer simulation on homogeneity testing for weighted data sets used in HEP Computer simulation on homogeneity testing for weighted data sets used in HEP Petr Bouř and Václav Kůs Department of Mathematics, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University

More information

Lecture 2: CDF and EDF

Lecture 2: CDF and EDF STAT 425: Introduction to Nonparametric Statistics Winter 2018 Instructor: Yen-Chi Chen Lecture 2: CDF and EDF 2.1 CDF: Cumulative Distribution Function For a random variable X, its CDF F () contains all

More information

. Consider the linear system dx= =! = " a b # x y! : (a) For what values of a and b do solutions oscillate (i.e., do both x(t) and y(t) pass through z

. Consider the linear system dx= =! =  a b # x y! : (a) For what values of a and b do solutions oscillate (i.e., do both x(t) and y(t) pass through z Preliminary Exam { 1999 Morning Part Instructions: No calculators or crib sheets are allowed. Do as many problems as you can. Justify your answers as much as you can but very briey. 1. For positive real

More information

SSA analysis and forecasting of records for Earth temperature and ice extents

SSA analysis and forecasting of records for Earth temperature and ice extents SSA analysis and forecasting of records for Earth temperature and ice extents V. Kornikov, A. Pepelyshev, A. Zhigljavsky December 19, 2016 St.Petersburg State University, Cardiff University Abstract In

More information

Kernel Methods. Lecture 4: Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton

Kernel Methods. Lecture 4: Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton Kernel Methods Lecture 4: Maximum Mean Discrepancy Thanks to Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, Jiayuan Huang, Arthur Gretton Alexander J. Smola Statistical Machine Learning Program Canberra,

More information

Nonparametric Methods

Nonparametric Methods Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania July 28, 2009 Michael R. Roberts Nonparametric Methods 1/42 Overview Great for data analysis

More information

The EM Algorithm for the Finite Mixture of Exponential Distribution Models

The EM Algorithm for the Finite Mixture of Exponential Distribution Models Int. J. Contemp. Math. Sciences, Vol. 9, 2014, no. 2, 57-64 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ijcms.2014.312133 The EM Algorithm for the Finite Mixture of Exponential Distribution

More information

12 - Nonparametric Density Estimation

12 - Nonparametric Density Estimation ST 697 Fall 2017 1/49 12 - Nonparametric Density Estimation ST 697 Fall 2017 University of Alabama Density Review ST 697 Fall 2017 2/49 Continuous Random Variables ST 697 Fall 2017 3/49 1.0 0.8 F(x) 0.6

More information

Lecture 8: The Metropolis-Hastings Algorithm

Lecture 8: The Metropolis-Hastings Algorithm 30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:

More information

OHSU OGI Class ECE-580-DOE :Statistical Process Control and Design of Experiments Steve Brainerd Basic Statistics Sample size?

OHSU OGI Class ECE-580-DOE :Statistical Process Control and Design of Experiments Steve Brainerd Basic Statistics Sample size? ECE-580-DOE :Statistical Process Control and Design of Experiments Steve Basic Statistics Sample size? Sample size determination: text section 2-4-2 Page 41 section 3-7 Page 107 Website::http://www.stat.uiowa.edu/~rlenth/Power/

More information

Confidence intervals for kernel density estimation

Confidence intervals for kernel density estimation Stata User Group - 9th UK meeting - 19/20 May 2003 Confidence intervals for kernel density estimation Carlo Fiorio c.fiorio@lse.ac.uk London School of Economics and STICERD Stata User Group - 9th UK meeting

More information

Lecture Notes 15 Prediction Chapters 13, 22, 20.4.

Lecture Notes 15 Prediction Chapters 13, 22, 20.4. Lecture Notes 15 Prediction Chapters 13, 22, 20.4. 1 Introduction Prediction is covered in detail in 36-707, 36-701, 36-715, 10/36-702. Here, we will just give an introduction. We observe training data

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

Non-parametric Inference and Resampling

Non-parametric Inference and Resampling Non-parametric Inference and Resampling Exercises by David Wozabal (Last update 3. Juni 2013) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend

More information

Lecture 21. Hypothesis Testing II

Lecture 21. Hypothesis Testing II Lecture 21. Hypothesis Testing II December 7, 2011 In the previous lecture, we dened a few key concepts of hypothesis testing and introduced the framework for parametric hypothesis testing. In the parametric

More information

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Outline in High Dimensions Using the Rodeo Han Liu 1,2 John Lafferty 2,3 Larry Wasserman 1,2 1 Statistics Department, 2 Machine Learning Department, 3 Computer Science Department, Carnegie Mellon University

More information

Ulam Quaterly { Volume 2, Number 4, with Convergence Rate for Bilinear. Omar Zane. University of Kansas. Department of Mathematics

Ulam Quaterly { Volume 2, Number 4, with Convergence Rate for Bilinear. Omar Zane. University of Kansas. Department of Mathematics Ulam Quaterly { Volume 2, Number 4, 1994 Identication of Parameters with Convergence Rate for Bilinear Stochastic Dierential Equations Omar Zane University of Kansas Department of Mathematics Lawrence,

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

On variable bandwidth kernel density estimation

On variable bandwidth kernel density estimation JSM 04 - Section on Nonparametric Statistics On variable bandwidth kernel density estimation Janet Nakarmi Hailin Sang Abstract In this paper we study the ideal variable bandwidth kernel estimator introduced

More information

8 Singular Integral Operators and L p -Regularity Theory

8 Singular Integral Operators and L p -Regularity Theory 8 Singular Integral Operators and L p -Regularity Theory 8. Motivation See hand-written notes! 8.2 Mikhlin Multiplier Theorem Recall that the Fourier transformation F and the inverse Fourier transformation

More information

ECON 721: Lecture Notes on Nonparametric Density and Regression Estimation. Petra E. Todd

ECON 721: Lecture Notes on Nonparametric Density and Regression Estimation. Petra E. Todd ECON 721: Lecture Notes on Nonparametric Density and Regression Estimation Petra E. Todd Fall, 2014 2 Contents 1 Review of Stochastic Order Symbols 1 2 Nonparametric Density Estimation 3 2.1 Histogram

More information

Spline Density Estimation and Inference with Model-Based Penalities

Spline Density Estimation and Inference with Model-Based Penalities Spline Density Estimation and Inference with Model-Based Penalities December 7, 016 Abstract In this paper we propose model-based penalties for smoothing spline density estimation and inference. These

More information

ON SOME TWO-STEP DENSITY ESTIMATION METHOD

ON SOME TWO-STEP DENSITY ESTIMATION METHOD UNIVESITATIS IAGELLONICAE ACTA MATHEMATICA, FASCICULUS XLIII 2005 ON SOME TWO-STEP DENSITY ESTIMATION METHOD by Jolanta Jarnicka Abstract. We introduce a new two-step kernel density estimation method,

More information

Kernel density estimation

Kernel density estimation Kernel density estimation Patrick Breheny October 18 Patrick Breheny STA 621: Nonparametric Statistics 1/34 Introduction Kernel Density Estimation We ve looked at one method for estimating density: histograms

More information

Mathematical Institute, University of Utrecht. The problem of estimating the mean of an observed Gaussian innite-dimensional vector

Mathematical Institute, University of Utrecht. The problem of estimating the mean of an observed Gaussian innite-dimensional vector On Minimax Filtering over Ellipsoids Eduard N. Belitser and Boris Y. Levit Mathematical Institute, University of Utrecht Budapestlaan 6, 3584 CD Utrecht, The Netherlands The problem of estimating the mean

More information

Nonparametric Function Estimation with Infinite-Order Kernels

Nonparametric Function Estimation with Infinite-Order Kernels Nonparametric Function Estimation with Infinite-Order Kernels Arthur Berg Department of Statistics, University of Florida March 15, 2008 Kernel Density Estimation (IID Case) Let X 1,..., X n iid density

More information

Introduction Wavelet shrinage methods have been very successful in nonparametric regression. But so far most of the wavelet regression methods have be

Introduction Wavelet shrinage methods have been very successful in nonparametric regression. But so far most of the wavelet regression methods have be Wavelet Estimation For Samples With Random Uniform Design T. Tony Cai Department of Statistics, Purdue University Lawrence D. Brown Department of Statistics, University of Pennsylvania Abstract We show

More information

Statistical inference on Lévy processes

Statistical inference on Lévy processes Alberto Coca Cabrero University of Cambridge - CCA Supervisors: Dr. Richard Nickl and Professor L.C.G.Rogers Funded by Fundación Mutua Madrileña and EPSRC MASDOC/CCA student workshop 2013 26th March Outline

More information

Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta

Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta Boundary Correction Methods in Kernel Density Estimation Tom Alberts C o u(r)a n (t) Institute joint work with R.J. Karunamuni University of Alberta November 29, 2007 Outline Overview of Kernel Density

More information

The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space

The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space Robert Jenssen, Deniz Erdogmus 2, Jose Principe 2, Torbjørn Eltoft Department of Physics, University of Tromsø, Norway

More information

The Bias-Variance dilemma of the Monte Carlo. method. Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel

The Bias-Variance dilemma of the Monte Carlo. method. Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel The Bias-Variance dilemma of the Monte Carlo method Zlochin Mark 1 and Yoram Baram 1 Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel fzmark,baramg@cs.technion.ac.il Abstract.

More information

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.

More information

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University The Bootstrap: Theory and Applications Biing-Shen Kuo National Chengchi University Motivation: Poor Asymptotic Approximation Most of statistical inference relies on asymptotic theory. Motivation: Poor

More information

Stochastic Simulation

Stochastic Simulation Stochastic Simulation APPM 7400 Lesson 3: Testing Random Number Generators Part II: Uniformity September 5, 2018 Lesson 3: Testing Random Number GeneratorsPart II: Uniformity Stochastic Simulation September

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

Chapter 1. Density Estimation

Chapter 1. Density Estimation Capter 1 Density Estimation Let X 1, X,..., X n be observations from a density f X x. Te aim is to use only tis data to obtain an estimate ˆf X x of f X x. Properties of f f X x x, Parametric metods f

More information

Bickel Rosenblatt test

Bickel Rosenblatt test University of Latvia 28.05.2011. A classical Let X 1,..., X n be i.i.d. random variables with a continuous probability density function f. Consider a simple hypothesis H 0 : f = f 0 with a significance

More information

Bootstrap, Jackknife and other resampling methods

Bootstrap, Jackknife and other resampling methods Bootstrap, Jackknife and other resampling methods Part III: Parametric Bootstrap Rozenn Dahyot Room 128, Department of Statistics Trinity College Dublin, Ireland dahyot@mee.tcd.ie 2005 R. Dahyot (TCD)

More information

Nonparametric Statistics

Nonparametric Statistics Nonparametric Statistics Jessi Cisewski Yale University Astrostatistics Summer School - XI Wednesday, June 3, 2015 1 Overview Many of the standard statistical inference procedures are based on assumptions

More information

Gaussian processes for inference in stochastic differential equations

Gaussian processes for inference in stochastic differential equations Gaussian processes for inference in stochastic differential equations Manfred Opper, AI group, TU Berlin November 6, 2017 Manfred Opper, AI group, TU Berlin (TU Berlin) inference in SDE November 6, 2017

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning 3. Instance Based Learning Alex Smola Carnegie Mellon University http://alex.smola.org/teaching/cmu2013-10-701 10-701 Outline Parzen Windows Kernels, algorithm Model selection

More information

Nonparametric Estimation of Luminosity Functions

Nonparametric Estimation of Luminosity Functions x x Nonparametric Estimation of Luminosity Functions Chad Schafer Department of Statistics, Carnegie Mellon University cschafer@stat.cmu.edu 1 Luminosity Functions The luminosity function gives the number

More information

Nonparametric Density Estimation

Nonparametric Density Estimation Nonparametric Density Estimation Econ 690 Purdue University Justin L. Tobias (Purdue) Nonparametric Density Estimation 1 / 29 Density Estimation Suppose that you had some data, say on wages, and you wanted

More information

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap Patrick Breheny December 6 Patrick Breheny BST 764: Applied Statistical Modeling 1/21 The empirical distribution function Suppose X F, where F (x) = Pr(X x) is a distribution function, and we wish to estimate

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative

More information

EE 302: Probabilistic Methods in Electrical Engineering

EE 302: Probabilistic Methods in Electrical Engineering EE : Probabilistic Methods in Electrical Engineering Print Name: Solution (//6 --sk) Test II : Chapters.5 4 //98, : PM Write down your name on each paper. Read every question carefully and solve each problem

More information

Kullback-Leibler Designs

Kullback-Leibler Designs Kullback-Leibler Designs Astrid JOURDAN Jessica FRANCO Contents Contents Introduction Kullback-Leibler divergence Estimation by a Monte-Carlo method Design comparison Conclusion 2 Introduction Computer

More information

RegML 2018 Class 2 Tikhonov regularization and kernels

RegML 2018 Class 2 Tikhonov regularization and kernels RegML 2018 Class 2 Tikhonov regularization and kernels Lorenzo Rosasco UNIGE-MIT-IIT June 17, 2018 Learning problem Problem For H {f f : X Y }, solve min E(f), f H dρ(x, y)l(f(x), y) given S n = (x i,

More information

1 Glivenko-Cantelli type theorems

1 Glivenko-Cantelli type theorems STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.1 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

Application of Bernstein Polynomials for smooth estimation of a distribution and density function

Application of Bernstein Polynomials for smooth estimation of a distribution and density function Journal of Statistical Planning and Inference 105 (2002) 377 392 www.elsevier.com/locate/jspi Application of Bernstein Polynomials for smooth estimation of a distribution and density function G. Jogesh

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

The exact bootstrap method shown on the example of the mean and variance estimation

The exact bootstrap method shown on the example of the mean and variance estimation Comput Stat (2013) 28:1061 1077 DOI 10.1007/s00180-012-0350-0 ORIGINAL PAPER The exact bootstrap method shown on the example of the mean and variance estimation Joanna Kisielinska Received: 21 May 2011

More information

Local Polynomial Modelling and Its Applications

Local Polynomial Modelling and Its Applications Local Polynomial Modelling and Its Applications J. Fan Department of Statistics University of North Carolina Chapel Hill, USA and I. Gijbels Institute of Statistics Catholic University oflouvain Louvain-la-Neuve,

More information

Density Estimation (II)

Density Estimation (II) Density Estimation (II) Yesterday Overview & Issues Histogram Kernel estimators Ideogram Today Further development of optimization Estimating variance and bias Adaptive kernels Multivariate kernel estimation

More information

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

The WENO Method for Non-Equidistant Meshes

The WENO Method for Non-Equidistant Meshes The WENO Method for Non-Equidistant Meshes Philip Rupp September 11, 01, Berlin Contents 1 Introduction 1.1 Settings and Conditions...................... The WENO Schemes 4.1 The Interpolation Problem.....................

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

More information

O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O

O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O O Combining cross-validation and plug-in methods - for kernel density selection O Carlos Tenreiro CMUC and DMUC, University of Coimbra PhD Program UC UP February 18, 2011 1 Overview The nonparametric problem

More information

Kernel Density Estimation

Kernel Density Estimation Kernel Density Estimation Univariate Density Estimation Suppose tat we ave a random sample of data X 1,..., X n from an unknown continuous distribution wit probability density function (pdf) f(x) and cumulative

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Density estimators for the convolution of discrete and continuous random variables

Density estimators for the convolution of discrete and continuous random variables Density estimators for the convolution of discrete and continuous random variables Ursula U Müller Texas A&M University Anton Schick Binghamton University Wolfgang Wefelmeyer Universität zu Köln Abstract

More information

Model-free prediction intervals for regression and autoregression. Dimitris N. Politis University of California, San Diego

Model-free prediction intervals for regression and autoregression. Dimitris N. Politis University of California, San Diego Model-free prediction intervals for regression and autoregression Dimitris N. Politis University of California, San Diego To explain or to predict? Models are indispensable for exploring/utilizing relationships

More information

Nonparametric Indepedence Tests: Space Partitioning and Kernel Approaches

Nonparametric Indepedence Tests: Space Partitioning and Kernel Approaches Nonparametric Indepedence Tests: Space Partitioning and Kernel Approaches Arthur Gretton 1 and László Györfi 2 1. Gatsby Computational Neuroscience Unit London, UK 2. Budapest University of Technology

More information

Adaptive Kernel Estimation of The Hazard Rate Function

Adaptive Kernel Estimation of The Hazard Rate Function Adaptive Kernel Estimation of The Hazard Rate Function Raid Salha Department of Mathematics, Islamic University of Gaza, Palestine, e-mail: rbsalha@mail.iugaza.edu Abstract In this paper, we generalized

More information

A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers

A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers 6th St.Petersburg Workshop on Simulation (2009) 1-3 A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers Ansgar Steland 1 Abstract Sequential kernel smoothers form a class of procedures

More information

1 The Glivenko-Cantelli Theorem

1 The Glivenko-Cantelli Theorem 1 The Glivenko-Cantelli Theorem Let X i, i = 1,..., n be an i.i.d. sequence of random variables with distribution function F on R. The empirical distribution function is the function of x defined by ˆF

More information

Histogram Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models, An Introduction

Histogram Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models, An Introduction Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models, An Introduction Tine Buch-Kromann Construction X 1,..., X n iid r.v. with (unknown) density, f. Aim: Estimate the density

More information

A Linear-Time Kernel Goodness-of-Fit Test

A Linear-Time Kernel Goodness-of-Fit Test A Linear-Time Kernel Goodness-of-Fit Test Wittawat Jitkrittum 1; Wenkai Xu 1 Zoltán Szabó 2 NIPS 2017 Best paper! Kenji Fukumizu 3 Arthur Gretton 1 wittawatj@gmail.com 1 Gatsby Unit, University College

More information

Reproducing Kernel Hilbert Spaces

Reproducing Kernel Hilbert Spaces Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 11, 2009 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert

More information

LECTURE 15 + C+F. = A 11 x 1x1 +2A 12 x 1x2 + A 22 x 2x2 + B 1 x 1 + B 2 x 2. xi y 2 = ~y 2 (x 1 ;x 2 ) x 2 = ~x 2 (y 1 ;y 2 1

LECTURE 15 + C+F. = A 11 x 1x1 +2A 12 x 1x2 + A 22 x 2x2 + B 1 x 1 + B 2 x 2. xi y 2 = ~y 2 (x 1 ;x 2 ) x 2 = ~x 2 (y 1 ;y 2  1 LECTURE 5 Characteristics and the Classication of Second Order Linear PDEs Let us now consider the case of a general second order linear PDE in two variables; (5.) where (5.) 0 P i;j A ij xix j + P i,

More information

Optimal Estimation of a Nonsmooth Functional

Optimal Estimation of a Nonsmooth Functional Optimal Estimation of a Nonsmooth Functional T. Tony Cai Department of Statistics The Wharton School University of Pennsylvania http://stat.wharton.upenn.edu/ tcai Joint work with Mark Low 1 Question Suppose

More information

Robust Backtesting Tests for Value-at-Risk Models

Robust Backtesting Tests for Value-at-Risk Models Robust Backtesting Tests for Value-at-Risk Models Jose Olmo City University London (joint work with Juan Carlos Escanciano, Indiana University) Far East and South Asia Meeting of the Econometric Society

More information

Lecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN

Lecture Notes 5 Convergence and Limit Theorems. Convergence with Probability 1. Convergence in Mean Square. Convergence in Probability, WLLN Lecture Notes 5 Convergence and Limit Theorems Motivation Convergence with Probability Convergence in Mean Square Convergence in Probability, WLLN Convergence in Distribution, CLT EE 278: Convergence and

More information

Asymptotics and Simulation of Heavy-Tailed Processes

Asymptotics and Simulation of Heavy-Tailed Processes Asymptotics and Simulation of Heavy-Tailed Processes Department of Mathematics Stockholm, Sweden Workshop on Heavy-tailed Distributions and Extreme Value Theory ISI Kolkata January 14-17, 2013 Outline

More information

CS 147: Computer Systems Performance Analysis

CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis Summarizing Variability and Determining Distributions CS 147: Computer Systems Performance Analysis Summarizing Variability and Determining Distributions 1

More information

Efficient Density Estimation using Fejér-Type Kernel Functions

Efficient Density Estimation using Fejér-Type Kernel Functions Efficient Density Estimation using Fejér-Type Kernel Functions by Olga Kosta A Thesis submitted to the Faculty of Graduate Studies and Postdoctoral Affairs in partial fulfilment of the requirements for

More information

Multiple Linear Regression for the Supervisor Data

Multiple Linear Regression for the Supervisor Data for the Supervisor Data Rating 40 50 60 70 80 90 40 50 60 70 50 60 70 80 90 40 60 80 40 60 80 Complaints Privileges 30 50 70 40 60 Learn Raises 50 70 50 70 90 Critical 40 50 60 70 80 30 40 50 60 70 80

More information