Density Estimation. We are concerned more here with the non-parametric case (see Roger Barlow's lectures for parametric statistics)


1 Density Estimation

Density estimation deals with the problem of estimating probability density functions (PDFs) based on some data sampled from the PDF. We:
- may use assumed forms of the distribution, parameterized in some way (parametric statistics); or
- may avoid making assumptions about the form of the PDF (non-parametric statistics).

We are concerned more here with the non-parametric case (see Roger Barlow's lectures for parametric statistics).

Frank Porter, SLUO Lectures on Statistics, August 2006

2 Some References (I)

- Richard A. Tapia & James R. Thompson, Nonparametric Density Estimation, Johns Hopkins University Press, Baltimore (1978).
- David W. Scott, Multivariate Density Estimation, John Wiley & Sons, Inc., New York (1992).
- Adrian W. Bowman and Adelchi Azzalini, Applied Smoothing Techniques for Data Analysis, Clarendon Press, Oxford (1997).
- B. W. Silverman, Density Estimation for Statistics and Data Analysis, Monographs on Statistics and Applied Probability, Chapman and Hall (1986); contents.html
- K. S. Cranmer, Kernel Estimation in High Energy Physics, Comp. Phys. Comm. 136, 198 (2001) [hep-ex/ v1]; cache/hep-ex/pdf/0011/ pdf

3 Some References (II)

- M. Pivk & F. R. Le Diberder, sPlot: a statistical tool to unfold data distributions, Nucl. Instr. Meth. A 555, 356 (2005).
- R. Cahn, How sPlots are Best (2005), rev splots best.pdf
- BaBar Statistics Working Group, Recommendations for Display of Projections in Multi-Dimensional Analyses, Statistics/Documents/MDgraphRec.pdf

Additional specific references will be noted in the course of the lectures.

4 Preliminaries

We'll couch the discussion in terms of observations (a dataset) from some experiment.
- Our dataset consists of the values $x_i$, $i = 1, 2, \ldots, n$.
- Our dataset consists of repeated samplings from a (presumed unknown) probability distribution: IID (Independent, Identically Distributed). We'll note generalizations here and there.
- Order is not important; if we are discussing a time series, we could introduce ordered pairs $\{(x_i, t_i),\ i = 1, \ldots, n\}$ and call it two-dimensional. [But beware the correlations then; probably not IID!]
- In general, our quantities can be multi-dimensional; no special notation will be used to distinguish the one-dimensional from the multivariate case. We'll discuss where issues enter with dimensionality.

5 Notation

At our convenience we may use $E$, $\langle\,\cdot\,\rangle$, and an overbar all to mean "expectation":

$$ E(x) \equiv \langle x \rangle \equiv \bar{x} \equiv \int x\, p(x)\, dx, $$

where $p(x)$ is the probability density function (PDF) for $x$ (or, more generally, $p(x)\,dx \to \mu(dx)$ is the probability measure).

Estimators are denoted with a "hat". In these lectures, we'll be concerned with estimators for the density function itself, hence $\hat{p}(x)$ is a random variable giving our estimate for $p(x)$.

We will not be especially rigorous. For example, we won't make a notational distinction between the random variable and an instance.

6 Motivation

Why do we want to estimate densities? Well, that is the whole point...

Harder question: why non-parametric estimates?
- Comparison with models (which may be parametric)
- May be easier/better than parametric modeling for efficiency corrections and background subtraction
- Visualization
- Unfolding
- Comparing samples

7 R, A Toolkit, er, Language, You Might Be Interested In...

The S language was developed with statistical analysis of data in mind. For example:

    > x <- rnorm(100,10,1)
    > hist(x,xlim=range(5,15))

[Figure: "Histogram of x"; horizontal axis: x; vertical axis: Frequency.]

The free, open source version is R, from the R Project. Downloads are available for Linux/MacOS X/Windows. The commercial version is S-Plus.

8 Empirical Probability Density Function

Place a delta function at each data point. The estimator (EPDF, for "Empirical Probability Density Function") is

$$ \hat{p}(x) = \frac{1}{n} \sum_{i=1}^{n} \delta(x - x_i). $$

Note that $x$ could be multi-dimensional here.

This is the sampling density for the bootstrap (more later; also see Ilya Narsky's lectures).
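To make the bootstrap remark concrete, here is a minimal sketch in Python (the language and the helper name `sample_epdf` are choices made for this illustration, not part of the lecture). Because the EPDF puts probability $1/n$ on each observed point, drawing from it amounts to drawing from the data with replacement:

```python
import random

def sample_epdf(data, size, rng):
    """Draw `size` values from the EPDF of `data`.

    The EPDF assigns probability 1/n to each observed point, so sampling
    from it is the same as sampling the data with replacement.
    """
    return [rng.choice(data) for _ in range(size)]

# Illustrative dataset: 100 points from a normal distribution (an assumption).
rng = random.Random(1)  # fixed seed, for reproducibility only
data = [rng.gauss(10.0, 1.0) for _ in range(100)]

# One bootstrap replica: same size as the data, drawn from the EPDF.
replica = sample_epdf(data, len(data), rng)
```

Every value in `replica` is one of the original observations, typically with some repeated and some missing.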

9 The Histogram

Perhaps our most ubiquitous density estimator is the histogram:

$$ h(x) = \sum_{i=1}^{n} B(x - \tilde{x}_i; w), $$

where $\tilde{x}_i$ is the center of the bin in which observation $x_i$ lies, $w$ is the bin width, and

$$ B(x; w) = \begin{cases} 1 & x \in (-w/2,\, w/2) \\ 0 & \text{otherwise} \end{cases} $$

(called the indicator function in probability).

[Figure: $B(x - \tilde{x}_i; w)$ versus $x$: a box of height 1 and width $w$ centered on $\tilde{x}_i$, containing $x_i$.]

This is written for uniform bin widths, but may be generalized to differing widths with appropriate relative normalization factors.

The estimator for the probability density function (PDF) is:

$$ \hat{p}(x) = \frac{1}{nw}\, h(x). $$
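The histogram estimator can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `histogram_density` and the `origin` parameter are invented here, and bins are taken as half-open intervals $[\text{origin} + jw,\ \text{origin} + (j+1)w)$ so that every point belongs to exactly one bin.

```python
import math

def histogram_density(data, w, origin=0.0):
    """Return a function p_hat(x) = h(x) / (n w) for uniform bins of width w."""
    n = len(data)
    counts = {}
    for xi in data:
        j = math.floor((xi - origin) / w)  # index of the bin containing xi
        counts[j] = counts.get(j, 0) + 1

    def p_hat(x):
        j = math.floor((x - origin) / w)   # bin containing the query point
        return counts.get(j, 0) / (n * w)

    return p_hat

# Small illustrative dataset (made up for this sketch).
p_hat = histogram_density([0.1, 0.2, 0.7, 1.5], w=0.5)
```

Changing `origin` or `w` changes the estimate, which is the dependence on bin origin and bin size that the lecture criticizes.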

10 Histogram Example

[Figure: two panels versus m(p pi) - m(p) - m(pi). Left: EPDF. Right: histogram with $w = 10$ MeV (vertical axis: Events/10 MeV).]

[Actual sampling is 100 points from a Δ(1232) Breit-Wigner (Cauchy) on a second-order polynomial background. Background probability is 50%.]

11 Criticisms of Histogram as Density Estimator

- Discontinuous even if the PDF is continuous.
- Dependence on bin size and bin origin.
- Information from the location of a datum within a bin is ignored.

12 Kernel Estimation

Take the histogram, but replace the "bin function" $B$ with something else:

$$ \hat{p}(x) = \frac{1}{n} \sum_{i=1}^{n} k(x - x_i; w), $$

where $k(x; w)$ is the "kernel function", normalized to unity:

$$ \int k(x; w)\, dx = 1. $$

We are usually interested in kernels of the form

$$ k(x - x_i; w) = \frac{1}{w} K\!\left(\frac{x - x_i}{w}\right); $$

indeed, this may be used as the definition of "kernel". The kernel estimator for the PDF is then:

$$ \hat{p}(x) = \frac{1}{nw} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{w}\right). $$

The role of the parameter $w$ as a "smoothing parameter" is clearer in this form.
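As a sketch of the kernel estimator $\hat{p}(x) = \frac{1}{nw}\sum_i K((x - x_i)/w)$, the Python below uses a standard Gaussian for $K$. The Gaussian is a common choice but is an assumption of this example (the lecture leaves $K$ general), and the names `gaussian_kernel` and `kernel_density` are hypothetical.

```python
import math

def gaussian_kernel(u):
    """Standard normal density; integrates to one, as a kernel must."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density(data, w, K=gaussian_kernel):
    """Return p_hat(x) = (1 / (n w)) * sum_i K((x - x_i) / w)."""
    n = len(data)

    def p_hat(x):
        return sum(K((x - xi) / w) for xi in data) / (n * w)

    return p_hat

# Illustrative use: three made-up observations, smoothing parameter w = 0.5.
p_hat = kernel_density([0.0, 1.0, 3.0], w=0.5)
```

Unlike the histogram, this estimate is continuous in `x` and has no bin origin; the smoothing parameter `w` remains a free choice.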

13 Multi-Variate Kernel Estimation

Explicit multivariate case, $d = 2$ dimensions:

$$ \hat{p}(x, y) = \frac{1}{n w_x w_y} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{w_x}\right) K\!\left(\frac{y - y_i}{w_y}\right). $$

This is a "product kernel" form, with the same kernel in each dimension, except for possibly different smoothing parameters. It does not have correlations.

The kernels we have introduced are classified more explicitly as "fixed kernels": the smoothing parameter is independent of $x$.
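The two-dimensional product-kernel form can be sketched as follows, again assuming a Gaussian $K$ (an illustrative choice) and using hypothetical function names:

```python
import math

def gaussian_kernel(u):
    """Standard normal density used as the one-dimensional kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def product_kernel_density(points, wx, wy, K=gaussian_kernel):
    """Return p_hat(x, y) for a product kernel with widths wx and wy.

    `points` is a list of (x_i, y_i) pairs. The kernel factorizes, so it
    carries no correlation between the two dimensions.
    """
    n = len(points)

    def p_hat(x, y):
        total = 0.0
        for xi, yi in points:
            total += K((x - xi) / wx) * K((y - yi) / wy)
        return total / (n * wx * wy)

    return p_hat

# Illustrative use with made-up points and different smoothing per axis.
p_hat = product_kernel_density([(0.0, 0.0), (1.0, 2.0)], wx=0.5, wy=1.0)
```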

14 Ideogram

A simple variant on the kernel idea is to permit the kernel to depend on additional knowledge in the data. Physicists call this an ideogram. The most common is the Gaussian ideogram, in which each data point is entered as a Gaussian of area one and standard deviation appropriate to that datum.

This addresses one way in which the IID assumption might be broken.

[Aside: Be careful to get your likelihood function right if you are incorporating variable resolution information in your fits; see, e.g., Punzi.]
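A Gaussian ideogram differs from the fixed-kernel estimator only in that each datum carries its own width. The sketch below assumes each measurement comes with a known standard deviation and divides the sum by $n$ so that the result integrates to one; that overall normalization, and the name `gaussian_ideogram`, are choices made for this illustration (published ideograms may weight the entries differently).

```python
import math

def gaussian_ideogram(values, sigmas):
    """Return p_hat(x): the average of unit-area Gaussians, one per datum.

    Datum i contributes a Gaussian centered on values[i] with standard
    deviation sigmas[i].
    """
    n = len(values)
    norm = math.sqrt(2.0 * math.pi)

    def p_hat(x):
        total = 0.0
        for v, s in zip(values, sigmas):
            u = (x - v) / s
            total += math.exp(-0.5 * u * u) / (norm * s)
        return total / n

    return p_hat

# Illustrative use: three made-up measurements with different resolutions.
p_hat = gaussian_ideogram([1.0, 1.2, 2.0], [0.1, 0.3, 0.2])
```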

15 Sample Ideograms (I)

[Figure: ideogram of $m_{K^\pm}$ (MeV) measurements, from RPP 2006. Legend: WEIGHTED AVERAGE ±0.011 (Error scaled by 2.5). Measurements shown: DENISOV, GALL 88 (K Pb, K W), LUM, BARKOV, CHENG 75 (K Pb), BACKENSTO. (Confidence Level 0.001).]

Figure note: Values above of weighted average, error, and scale factor are based upon the data in this ideogram only. They are not necessarily the same as our "best" values, obtained from a least-squares constrained fit utilizing measurements of other (related) quantities as additional information.

16 Sample Ideograms (II)

Note the detailed comparison.

[Figure 1. A histogram of magnetic field values (black), compared with a smoothed frequency distribution constructed using a Gaussian ideogram technique (red).]

(From J. S. Halekas et al., Magnetic Properties of Lunar Geologic Terranes: New Statistical Results, Lunar and Planetary Science XXXIII (2002), 1368.pdf)

17 Parametric vs Non-parametric Density Estimation (I)

The distinction is fuzzy.
- A histogram is non-parametric, in the sense that no assumption about the form of the sampling distribution is made.
- Often there is an implicit assumption that the distribution is smooth on a scale smaller than the bin size. For example, we know something about the resolution of our apparatus.
- But the estimator of the parent distribution made with a histogram is parametric: the parameters are the populations (or frequencies) in each bin. The estimators for those parameters are the observed histogram populations.
- Even more parameters than a typical parametric fit!

18 Parametric vs Non-parametric Density Estimation (II)

The essence of the difference may be captured in the notions of "local" and "non-local": if a datum at $x_i$ influences the density estimator at some other point $x$, this is non-local. A non-parametric estimator is one in which the influence of a point at $x_i$ on the estimate at any $x$ with $d(x_i, x) > \epsilon$ vanishes asymptotically.

Notice that for a kernel estimator, the bigger the smoothing parameter $w$, the more non-local the estimator,

$$ \hat{p}(x) = \frac{1}{nw} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{w}\right). $$

As we'll discuss, the optimal choice of smoothing parameter depends on $n$.

19 Optimization

We would like to make an "optimal" density estimate from our data. What does that mean?
- We need a criterion for "optimal".
- The choice of criterion is subjective; it depends on what you want to achieve.

We may compare the estimator for a quantity (here, the value of the density at $x$) with the true value:

$$ \Delta(x) = \hat{f}(x) - f(x). $$

[Figure: estimate $\hat{f}(x)$ and true density $f(x)$ versus $x$, with the difference $\Delta(x)$ indicated.]

20 Mean Squared Error (I)

A common choice in parametric estimation is to minimize the sum of the squares. We may take this idea over here, and form the Mean Squared Error (MSE):

$$ \mathrm{MSE}[\hat{f}(x)] \equiv \left\langle \left[\hat{f}(x) - f(x)\right]^2 \right\rangle = \mathrm{Var}[\hat{f}(x)] + \mathrm{Bias}^2[\hat{f}(x)], $$

where

$$ \mathrm{Var}[\hat{f}(x)] \equiv E\!\left[\left(\hat{f}(x) - E[\hat{f}(x)]\right)^2\right], \qquad \mathrm{Bias}[\hat{f}(x)] \equiv E[\hat{f}(x)] - f(x). $$

21 Mean Squared Error (II)

Since this isn't quite our familiar parameter estimation, let's take a little time to make sure it is understood. Suppose $\hat{p}(x)$ is an estimator for the PDF $p(x)$, based on data $\{x_i;\ i = 1, \ldots, n\}$, IID from $p(x)$. Then

$$ E[\hat{p}(x)] = \int\!\cdots\!\int \hat{p}(x; \{x_i\})\, \mathrm{Prob}(\{x_i\})\, d^n(\{x_i\}) = \int\!\cdots\!\int \hat{p}(x; \{x_i\}) \prod_{i=1}^{n} \left[p(x_i)\, dx_i\right]. $$

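22 Exercise: Proof of formula for the MSE

$$
\begin{aligned}
\mathrm{MSE}[\hat{f}(x)] &= \left\langle \left(\hat{f}(x) - f(x)\right)^2 \right\rangle \\
&= \int\!\cdots\!\int \left[\hat{f}(x; \{x_i\}) - f(x)\right]^2 \prod_{i=1}^{n} \left[p(x_i)\, dx_i\right] \\
&= \int\!\cdots\!\int \left[\hat{f}(x; \{x_i\}) - E(\hat{f}) + E(\hat{f}) - f(x)\right]^2 \prod_{i=1}^{n} \left[p(x_i)\, dx_i\right] \\
&= \int\!\cdots\!\int \left\{ \left[\hat{f}(x; \{x_i\}) - E(\hat{f})\right]^2 + \left[E(\hat{f}) - f(x)\right]^2 + 2\left[\hat{f}(x; \{x_i\}) - E(\hat{f})\right]\left[E(\hat{f}) - f(x)\right] \right\} \prod_{i=1}^{n} \left[p(x_i)\, dx_i\right] \\
&= \mathrm{Var}[\hat{f}(x)] + \mathrm{Bias}^2[\hat{f}(x)] + 0.
\end{aligned}
$$

[In typical treatments of parametric statistics, we assume unbiased estimators, hence the "Bias" term is zero. That isn't a good assumption here.]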
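The decomposition can also be checked numerically. The sketch below is a toy Monte Carlo under assumptions chosen only for illustration: the true density is a standard normal, the estimator is a Gaussian kernel estimator, and the function names are hypothetical. It repeats the experiment many times and compares the sample MSE with the sample variance plus squared bias (using the $1/N$ form of the variance, for which the identity is exact).

```python
import math
import random

def kde_at(x, data, w):
    """Gaussian kernel estimate of the density at the single point x."""
    norm = len(data) * w * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / w) ** 2) for xi in data) / norm

def mse_decomposition(x, w, n, trials, rng):
    """Return (mse, var, bias) of the kernel estimate at x over many datasets."""
    f_true = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    estimates = []
    for _ in range(trials):
        data = [rng.gauss(0.0, 1.0) for _ in range(n)]  # one IID dataset
        estimates.append(kde_at(x, data, w))
    mean_est = sum(estimates) / trials
    var = sum((e - mean_est) ** 2 for e in estimates) / trials
    bias = mean_est - f_true
    mse = sum((e - f_true) ** 2 for e in estimates) / trials
    return mse, var, bias

mse, var, bias = mse_decomposition(0.0, 0.5, 50, 200, random.Random(0))
print(mse, var + bias ** 2)  # the two numbers agree up to rounding
```

At the peak of the density the bias comes out negative in this setup, in line with the remark that the bias term cannot be ignored here.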

23 The Problem With Smoothing (I)

Theorem [Rosenblatt (1956)]: A uniform minimum variance unbiased estimator for $p(x)$ does not exist.

Unbiased:

$$ E[\hat{p}(x)] = p(x). $$

Uniform minimum variance:

$$ \mathrm{Var}[\hat{p}(x) \mid p(x)] \le \mathrm{Var}[\hat{q}(x) \mid p(x)], \quad \forall x, $$

for all $p(x)$, where $\hat{q}(x)$ is any other estimator of $p(x)$.

24 The Problem With Smoothing (II)

For example, suppose we have a kernel estimator:

$$ \hat{p}(x) = \frac{1}{n} \sum_{i=1}^{n} k(x - x_i; w). $$

Its expectation is:

$$ E[\hat{p}(x)] = \frac{1}{n} \sum_{i=1}^{n} \int k(x - x_i; w)\, p(x_i)\, dx_i = \int k(x - y)\, p(y)\, dy. $$

Unless $k(x - y) = \delta(x - y)$, $\hat{p}(x)$ will be biased for some $p(x)$. But $\delta(x - y)$ has infinite variance.
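As a worked special case of this bias (assuming, for illustration only, a Gaussian kernel of width $w$ and a Gaussian sampling density with mean $\mu$ and standard deviation $\sigma$; the lecture makes neither assumption), the convolution can be done in closed form:

$$
E[\hat{p}(x)] = \int \frac{1}{\sqrt{2\pi}\, w}\, e^{-(x - y)^2 / 2w^2}\; \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-(y - \mu)^2 / 2\sigma^2}\, dy
= \frac{1}{\sqrt{2\pi(\sigma^2 + w^2)}}\, e^{-(x - \mu)^2 / 2(\sigma^2 + w^2)}.
$$

The expected estimate is a Gaussian broader than the true one. At the peak, $E[\hat{p}(\mu)] / p(\mu) = \sigma / \sqrt{\sigma^2 + w^2} < 1$ for any $w > 0$, so the estimator is biased low there, and the bias vanishes only in the limit $w \to 0$.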

25 The Problem with Smoothing (III)

So the nice properties we strive for in parameter estimation (and sometimes achieve) are beyond reach.

Intuition: smoothing lowers peaks and fills in valleys.

[Figure: Frequency versus x. Red curve: PDF. Histogram: sampling from the PDF. Black curve: Gaussian kernel estimator for the PDF.]

26 Comment on Number of Bins in Histogram

Note: Sturges' rule, based on optimizing MSE, was used in deciding how many bins, $k$, to make in the histogram:

$$ k = 1 + \log_2 n. $$

The argument behind this rule has been criticized (1995): hyndman/papers/sturges.pdf

Indeed, we see in our example that we would have selected more bins by hand; our histogram is over-smoothed.

There are other rules for optimizing the number of bins. For example, Scott's rule for the bin width is:

$$ w = 3.5\, s\, n^{-1/3}, $$

where $s$ is the sample standard deviation. [More later]
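Both rules are one-liners. In the sketch below the function names are hypothetical, and rounding Sturges' $k$ up to an integer is an assumption of this example (the formula itself does not say how to round):

```python
import math
import statistics

def sturges_bins(n):
    """Number of bins from Sturges' rule, k = 1 + log2(n), rounded up."""
    return math.ceil(1 + math.log2(n))

def scott_width(data):
    """Bin width from Scott's rule, w = 3.5 * s * n**(-1/3).

    `s` is the sample standard deviation (n - 1 in the denominator).
    """
    n = len(data)
    s = statistics.stdev(data)
    return 3.5 * s * n ** (-1.0 / 3.0)

# For n = 100 points (as in the lecture's R example), Sturges' rule gives:
print(sturges_bins(100))  # 1 + log2(100) is about 7.64, so 8 bins
```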

27 Dependence on Smoothing Parameter

Plot showing the effect of the choice of smoothing parameter:

[Figure: Frequency versus x. Red: sampling PDF. Black: default smoothing ($w$). Blue: $w/2$ smoothing. Turquoise: $w/4$ smoothing. Green: $2w$ smoothing.]

28 The Curse of Dimensionality

Roger Barlow gave a nice example of the impact of the "Curse of Dimensionality" in parametric statistics. It is a significant affliction in density estimation as well.
- It is difficult to display and visualize the data as the number of dimensions increases.
- All the volume (of a bounded region) goes to the boundary (exponentially!) as the number of dimensions increases; i.e., the data become sparse.
- There is a tendency for the computation requirement to grow exponentially with the number of dimensions.

[Figure: fractions 1/2, 1/4, 1/8 shown for increasing dimension $d$.]

Even worse than parametric statistics.
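The "volume goes to the boundary" statement can be illustrated with a unit hypercube: a central sub-cube with half the side length holds a fraction $(1/2)^d$ of the volume. The sketch below assumes this particular geometry and uniformly distributed data, purely for illustration; the function names are hypothetical.

```python
def interior_fraction(d, side=0.5):
    """Fraction of a unit hypercube's volume inside a central cube of given side."""
    return side ** d

def expected_interior_count(n, d, side=0.5):
    """Expected number of n uniformly distributed points landing in the interior."""
    return n * interior_fraction(d, side)

for d in (1, 2, 3, 10):
    print(d, interior_fraction(d), expected_interior_count(1000, d))
```

With 1000 uniform points, about 500 fall in the central region for $d = 1$, but on average fewer than one does for $d = 10$.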

29 Summary

We have introduced:
- Basic notions in (non-parametric) density estimation
- Some simple variations on the theme
- A foundation towards optimization
- An idea of where and how things will fail

Next: further sophistication on these ideas, and introduction of other variations in approach and application.


More information

Supervised Learning: Non-parametric Estimation

Supervised Learning: Non-parametric Estimation Supervised Learning: Non-parametric Estimation Edmondo Trentin March 18, 2018 Non-parametric Estimates No assumptions are made on the form of the pdfs 1. There are 3 major instances of non-parametric estimates:

More information

Kernel density estimation

Kernel density estimation Kernel density estimation Patrick Breheny October 18 Patrick Breheny STA 621: Nonparametric Statistics 1/34 Introduction Kernel Density Estimation We ve looked at one method for estimating density: histograms

More information

NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION

NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION R. HASHEMI, S. REZAEI AND L. AMIRI Department of Statistics, Faculty of Science, Razi University, 67149, Kermanshah, Iran. ABSTRACT

More information

BaBar Analysis School Statistics Topics

BaBar Analysis School Statistics Topics BaBar Analysis School Statistics Topics A toolkit The R Project Frequency and/or Bayes? Interval estimation (much to say here) Hypothesis testing (here also) Displaying Poisson errors Blind methodologies

More information

Review and continuation from last week Properties of MLEs

Review and continuation from last week Properties of MLEs Review and continuation from last week Properties of MLEs As we have mentioned, MLEs have a nice intuitive property, and as we have seen, they have a certain equivariance property. We will see later that

More information
