Bootstrap Methods II: Kernel Density Estimates


1 Bootstrap Methods II: Kernel Density Estimates. Mgr. Rudolf B. Blažek, Ph.D., prof. RNDr. Roman Kotecký, DrSc. Department of Computer Systems, Department of Theoretical Informatics, Faculty of Information Technologies, Czech Technical University in Prague. Rudolf Blažek & Roman Kotecký, 2011. Statistics for Informatics MI-SPI, ZS 2011/12, Lecture 24. The European Social Fund Prague & EU: We Invest in Your Future

2 Bootstrap methods II Kernel Estimates Mgr. Rudolf B. Blažek, Ph.D. prof. RNDr. Roman Kotecký, DrSc. Department of Computer Systems Department of Theoretical Informatics Faculty of Information Technologies Czech Technical University in Prague Rudolf Blažek & Roman Kotecký, 2011 Statistics for Informatics MI-SPI, ZS 2011/12, Lecture 24 The European Social Fund Prague & EU: We Invest in Your Future

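3 Classical Confidence Intervals: Confidence Interval for the Mean μ. Approximate distribution from the CLT, exact for Gaussian $X_i$: $Z = \dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$. [Figure: standard normal density with central area $1-\alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area $\alpha/2$ in each tail.]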

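4 Classical Confidence Intervals: Confidence Interval for the Mean μ. Exact distribution for Gaussian $X_i$: $T = \dfrac{\bar{X}_n - \mu}{s/\sqrt{n}} \sim t(n-1)$. [Figure: Student-t density with central area $1-\alpha$ between $-t_{\alpha/2,n-1}$ and $t_{\alpha/2,n-1}$ and area $\alpha/2$ in each tail.]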

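5 Classical Confidence Intervals: Confidence Interval for the Mean μ. $P\left(|\bar{X}_n - \mu| < z_{\alpha/2}\,\sigma/\sqrt{n}\right) \approx 1-\alpha$, since $\bar{X}_n \sim N(\mu, \sigma^2/n)$. [Figure: density of $\bar{X}_n$ centered at μ, with central area $1-\alpha$ between $\mu - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $\mu + z_{\alpha/2}\,\sigma/\sqrt{n}$ and area $\alpha/2$ in each tail.]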

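6 Classical Confidence Intervals: Confidence Interval for the Mean μ. We have obtained $P\left(|\bar{X}_n - \mu| < z_{\alpha/2}\,\sigma/\sqrt{n}\right) \approx 1-\alpha$. Therefore we can construct a confidence interval for μ: $P\left(\mu \in \bar{X}_n \pm z_{\alpha/2}\,\sigma/\sqrt{n}\right) \approx 1-\alpha$. If σ is unknown, we estimate it by s and use the Student t-distribution with n-1 degrees of freedom: $P\left(\mu \in \bar{X}_n \pm t_{\alpha/2,n-1}\,s/\sqrt{n}\right) \approx 1-\alpha$.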
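
The interval on this slide can be sketched in a few lines of Python. This is only an illustration, not code from the lecture: the function name `z_confidence_interval` and the simulated data are assumptions, and when σ is unknown the sketch plugs s into the normal-quantile formula (a large-sample approximation), whereas the slide uses the quantile $t_{\alpha/2,n-1}$, which the Python standard library does not provide.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def z_confidence_interval(sample, alpha=0.05, sigma=None):
    """Return (lower, upper) = x_bar -/+ z_{alpha/2} * sigma / sqrt(n).

    If sigma is None, the sample standard deviation s is plugged in.
    That is a large-sample approximation; the slide uses t_{alpha/2, n-1}.
    """
    n = len(sample)
    x_bar = mean(sample)
    spread = stdev(sample) if sigma is None else sigma
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile of N(0, 1)
    half_width = z * spread / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical data: 200 draws from N(10, 2^2), with sigma treated as known.
rng = random.Random(0)
data = [rng.gauss(10.0, 2.0) for _ in range(200)]
print(z_confidence_interval(data, alpha=0.05, sigma=2.0))
```

With a known σ the half-width depends only on n and α, which is why the same ± term appears in both forms of the probability statement above.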

7-12 Classical Confidence Intervals: Student-t CI for the Mean of a Die. [Figure, built up over six slides: histogram of x (single rolls, μ = 3.5); histogram of xbar (average of 50 random values, μ = 3.5); simulated 95% confidence intervals, of which about 1 in 20 miss μ = 3.5.]
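
The "about 1 in 20 miss" statement can be checked with a small simulation. The sketch below is an assumption-laden illustration rather than the code behind the slides: the function name, the number of experiments and the seed are made up, and the normal quantile is used in place of $t_{\alpha/2,n-1}$ (for n = 50 the two are close, so the miss rate should land near, and slightly above, 5%).

```python
import math
import random
from statistics import NormalDist, mean, stdev


def die_ci_miss_rate(n=50, experiments=2000, alpha=0.05, seed=1):
    """Fraction of intervals x_bar -/+ q * s / sqrt(n) that miss mu = 3.5."""
    rng = random.Random(seed)
    # Normal quantile standing in for t_{alpha/2, n-1} (assumption of this sketch).
    q = NormalDist().inv_cdf(1 - alpha / 2)
    misses = 0
    for _ in range(experiments):
        rolls = [rng.randint(1, 6) for _ in range(n)]  # n rolls of a fair die
        half_width = q * stdev(rolls) / math.sqrt(n)
        if abs(mean(rolls) - 3.5) > half_width:
            misses += 1
    return misses / experiments


print("observed miss rate:", die_ci_miss_rate())
```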

13 Bootstrap Methods (Resampling Techniques). Statistics for Informatics MI-SPI, ZS 2011/12, Lecture 23

14 Literature. Textbook: Jun Shao & Dongsheng Tu, The Jackknife and Bootstrap, Springer Series in Statistics, 1st ed., Jul 21, 1995.

15 Introduction: Classical Approach. Random Sample → Mean & Std. Deviation → Confidence Interval based on Gaussian Approximation. Information loss: n values → 2 values. [Diagram: Mean - k1 s.d., Sample Mean, Mean + k2 s.d.]

16 Introduction Central Limit Theorem Gaussian Approximation Needs finite 2nd moment Needs large n

17 Introduction: Bootstrap Resampling. Resampling is Monte Carlo sampling from the histogram of the data. It estimates the distribution, with no information loss. [Figure: histogram.]
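
A minimal sketch of this idea, assuming the statistic of interest is the sample mean: each bootstrap replication draws n values with replacement from the observed data (that is, from the empirical distribution) and recomputes the statistic. The function name `bootstrap_means`, the exponential test data and the replication count are illustrative choices, not taken from the slides.

```python
import random
from statistics import mean, stdev


def bootstrap_means(sample, replications=2000, seed=0):
    """Monte Carlo from the empirical distribution: resample with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(replications)]


# Hypothetical skewed data set of size 100.
rng = random.Random(42)
data = [rng.expovariate(1.0) for _ in range(100)]
boot = bootstrap_means(data)

print("sample mean:", mean(data))
print("bootstrap estimate of SE(mean):", stdev(boot))
print("classical s / sqrt(n):", stdev(data) / len(data) ** 0.5)
```

The list `boot` approximates the whole sampling distribution of the mean, not just its first two moments, which is the sense in which no information is lost.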

18 Bootstrap Applications. Permutation bootstrap: leads to permutation tests; used to train change-point detection for network intrusions. Bootstrap in random processes: resampling of inter-arrival times; improves test accuracy.

19 Bootstrap-t Confidence Intervals: The Bootstrap Method Algorithm. Let X1, X2, X3, ..., Xn be i.i.d. (independent & identically distributed) random variables with a distribution function F. Assume that we want to estimate a parameter θ of F. $\hat{\theta}$ ... a point estimator of the population parameter θ; $\hat{\sigma}_n^2$ ... an estimator of the variance of $\hat{\theta}$. The bootstrap-t method (Efron, 1982) is based on a studentized pivot $R_n = \dfrac{\hat{\theta} - \theta}{\hat{\sigma}_n}$. If the distribution of $R_n$ is unknown, we will use resampling.

20 Bootstrap-t Confidence Intervals: The Bootstrap Method Example. For example, θ could be the mean μ of the distribution. The point estimator and its variance would then be $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\operatorname{Var}\bar{X}_n = \sigma^2/n$, with $\operatorname{Var}X_i = \sigma^2$. The studentized pivotal quantity is $R_n = \dfrac{\bar{X}_n - \mu}{s/\sqrt{n}}$ and the estimate of $\operatorname{Var}\bar{X}_n$ is $\hat{\sigma}_n^2 = s^2/n$ (where $s^2$ is the sample variance).

21 Bootstrap-t Confidence Intervals: The Bootstrap Method Example. For example, θ could be the mean μ of the distribution. The point estimator and its variance would then be $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\operatorname{Var}\bar{X}_n = \sigma^2/n$, with $\operatorname{Var}X_i = \sigma^2$. The classical confidence interval is based on the CLT: $\bar{X}_n \sim N(\mu, \sigma^2/n)$, so $Z = \dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$. With s as the estimator of σ, $R_n = \dfrac{\bar{X}_n - \mu}{s/\sqrt{n}} \sim \text{Student-}t(n-1)$ (at least approximately).

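22 Bootstrap-t Confidence Intervals: Confidence Interval for the Mean μ. The classical confidence interval for μ is either of $P\left(\mu \in \bar{X}_n \pm z_{\alpha/2}\,\sigma/\sqrt{n}\right) \approx 1-\alpha$ or $P\left(\mu \in \bar{X}_n \pm t_{\alpha/2,n-1}\,s/\sqrt{n}\right) \approx 1-\alpha$. The CI can be rewritten as $\left(\bar{X}_n - k_1\,SE(\bar{X}_n),\ \bar{X}_n + k_2\,SE(\bar{X}_n)\right)$, where $SE(\bar{X}_n) = \sqrt{\operatorname{Var}\bar{X}_n} = \sigma/\sqrt{n}$ is the standard error of $\bar{X}_n$.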

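23 Bootstrap-t Confidence Intervals: Confidence Interval for a Parameter θ. The CI for the mean μ, $\left(\bar{X}_n - k_1\,SE(\bar{X}_n),\ \bar{X}_n + k_2\,SE(\bar{X}_n)\right)$, is based on $P\left(\bar{X}_n - k_1\,SE(\bar{X}_n) \le \mu \le \bar{X}_n + k_2\,SE(\bar{X}_n)\right) = P\left(-k_1\,SE(\bar{X}_n) \le \mu - \bar{X}_n \le k_2\,SE(\bar{X}_n)\right) = P\left(-k_2 \le \dfrac{\bar{X}_n - \mu}{SE(\bar{X}_n)} \le k_1\right) \approx 1-\alpha$.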

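24 Bootstrap-t Confidence Intervals: Confidence Interval for a Parameter θ. The CI for the mean μ, $\left(\bar{X}_n - k_1\,SE(\bar{X}_n),\ \bar{X}_n + k_2\,SE(\bar{X}_n)\right)$, is based on $P\left(-k_2 \le \dfrac{\bar{X}_n - \mu}{SE(\bar{X}_n)} \le k_1\right) \approx 1-\alpha$, where $\dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ and $R_n = \dfrac{\bar{X}_n - \mu}{s/\sqrt{n}} \sim \text{Student-}t(n-1)$ (at least approximately). Here $\bar{X}_n$ is the point estimator of μ, and $SE(\bar{X}_n) = \sqrt{\operatorname{Var}\bar{X}_n} = \sigma/\sqrt{n}$ can be estimated by $s/\sqrt{n}$.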

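25 Bootstrap-t Confidence Intervals: Confidence Interval for the Mean μ. The distribution is known using the CLT: $Z = \dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$. [Figure: standard normal density with central area $1-\alpha$ and area $\alpha/2$ in each tail; lower cut-off $-k_2 = -z_{\alpha/2}$, upper cut-off $k_1 = z_{\alpha/2}$.]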

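26 Bootstrap-t Confidence Intervals: Confidence Interval for the Mean μ. The distribution is known using the CLT: $R_n = \dfrac{\bar{X}_n - \mu}{s/\sqrt{n}} \sim t(n-1)$. [Figure: Student-t density with central area $1-\alpha$ and area $\alpha/2$ in each tail; lower cut-off $-k_2 = -t_{\alpha/2,n-1}$, upper cut-off $k_1 = t_{\alpha/2,n-1}$.] If the distribution of $R_n$ is unknown, we will use resampling.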

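27 Bootstrap-t Confidence Intervals: Confidence Interval for a Parameter θ. The CI for the mean μ, $\left(\bar{X}_n - k_1\,SE(\bar{X}_n),\ \bar{X}_n + k_2\,SE(\bar{X}_n)\right)$, is based on $P\left(-k_2 \le \dfrac{\bar{X}_n - \mu}{SE(\bar{X}_n)} \le k_1\right) \approx 1-\alpha$. The general form of a confidence interval for a parameter θ, $\left(\hat{\theta} - k_1\,SE(\hat{\theta}),\ \hat{\theta} + k_2\,SE(\hat{\theta})\right)$, will similarly be based on $P\left(-k_2 \le \dfrac{\hat{\theta} - \theta}{SE(\hat{\theta})} \le k_1\right) \approx 1-\alpha$.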

28 Bootstrap-t Confidence Intervals: Confidence Interval for a Parameter θ. The CI for a parameter θ, $\left(\hat{\theta} - k_1\,SE(\hat{\theta}),\ \hat{\theta} + k_2\,SE(\hat{\theta})\right)$, is based on $P\left(-k_2 \le \dfrac{\hat{\theta} - \theta}{SE(\hat{\theta})} \le k_1\right) \approx 1-\alpha$, where $\hat{\theta}$ is the point estimator of θ, $SE(\hat{\theta}) = \sqrt{\operatorname{Var}\hat{\theta}}$ is the standard error of $\hat{\theta}$, and k1, k2 are selected so that the coverage probability is 1 - α. Steps: 1. The standard error is estimated from the data. 2. k1, k2 are estimated using resampling of the data.

29 Bootstrap-t Confidence Intervals: The Bootstrap Method Algorithm. The bootstrap-t method (Efron, 1982) is based on a studentized pivot $R_n = \dfrac{\hat{\theta}_n - \theta}{\hat{\sigma}_n}$. If the distribution of $R_n$ is unknown, we will use resampling: X1, X2, X3, ..., Xn is the original i.i.d. sample from the distribution function F. Assume that $\hat{F}$ is an estimator of the distribution function F (parametric or non-parametric). Let $X_1^*, X_2^*, X_3^*, \ldots, X_n^*$ be a new i.i.d. sample from $\hat{F}$.

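30 Bootstrap-t Confidence Intervals: The Bootstrap Method Algorithm. $X_1^*, X_2^*, X_3^*, \ldots, X_n^*$ is a new i.i.d. sample from the original data (i.e. resampling with replacement). The pivot $R_n = \dfrac{\hat{\theta}_n - \theta}{\hat{\sigma}_n}$ is replaced by its resampled version $R_n^* = \dfrac{\hat{\theta}_n^* - \hat{\theta}_n}{\hat{\sigma}_n^*}$. Resampling is repeated, and the $R_n^*$ are sorted by size. The smallest and the largest $(\alpha/2)\cdot 100\%$ of the values are discarded. The resulting cut-off points are used as the quantiles in the CI $\left(\hat{\theta}_n - k_1\,\hat{\sigma}_n,\ \hat{\theta}_n + k_2\,\hat{\sigma}_n\right)$.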
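
The algorithm on this slide can be sketched as follows for the special case where θ is the mean, so the estimator is the sample mean and its standard error is estimated by s/√n. Everything beyond the slide's steps is an assumption of the sketch: the function name `bootstrap_t_ci`, the number of replications, the way the cut-off indices are rounded, and the test data. Following the slide's convention, the lower cut-off of the sorted pivots is -k2 and the upper cut-off is k1.

```python
import math
import random
from statistics import mean, stdev


def bootstrap_t_ci(sample, alpha=0.05, replications=2000, seed=0):
    """Bootstrap-t interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(sample)
    theta_hat = mean(sample)
    sigma_hat = stdev(sample) / math.sqrt(n)  # estimated standard error

    pivots = []
    for _ in range(replications):
        star = rng.choices(sample, k=n)  # resample with replacement
        s_star = stdev(star)
        if s_star == 0.0:
            continue  # degenerate resample: pivot undefined, skip it
        pivots.append((mean(star) - theta_hat) / (s_star / math.sqrt(n)))
    pivots.sort()

    # Discard the smallest and largest (alpha/2) * 100% of the pivots.
    m = len(pivots)
    lower_q = pivots[int(math.floor((alpha / 2) * m))]
    upper_q = pivots[min(m - 1, int(math.ceil((1 - alpha / 2) * m)) - 1)]

    k1, k2 = upper_q, -lower_q  # P(-k2 <= R <= k1) is roughly 1 - alpha
    return theta_hat - k1 * sigma_hat, theta_hat + k2 * sigma_hat


# Hypothetical skewed sample; the interval need not be symmetric around the mean.
rng = random.Random(5)
data = [rng.expovariate(1.0) for _ in range(40)]
print(bootstrap_t_ci(data))
```

Note that the upper quantile of the pivot sets the lower endpoint and vice versa, which follows from solving $-k_2 \le (\hat{\theta}_n - \theta)/\hat{\sigma}_n \le k_1$ for θ.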

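31 Bootstrap-t Confidence Intervals: The Bootstrap Method Example. Let X1, X2, X3, ..., Xn be i.i.d. random variables from a log-normal distribution with parameters μ and σ. That is, ln(Xi) are i.i.d. ~ N(μ, σ²). The log-normal pdf is $f(x) = \dfrac{1}{x\,\sigma\sqrt{2\pi}} \exp\left(-\dfrac{(\ln x - \mu)^2}{2\sigma^2}\right)$, x > 0. Goal: find a confidence interval for the median $\theta = e^{\mu}$. Point estimator: $\hat{\theta} = \tilde{X}_n = e^{\hat{\mu}}$, where $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} \ln X_i$ (so $\tilde{X}_n$ is the geometric mean of the sample), with variance $SE^2(\hat{\theta}) = \operatorname{Var}\hat{\theta} = \sigma_n^2 = \left(e^{\sigma^2/n} - 1\right) e^{2\mu + \sigma^2/n}$.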
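
The variance formula on this slide can be checked numerically. The sketch below assumes that the point estimator is the exponential of the sample mean of ln Xi, which is the estimator whose variance equals $(e^{\sigma^2/n} - 1)e^{2\mu + \sigma^2/n}$; the function names and the parameter values (μ = 0, σ = 1, n = 20) are illustrative only.

```python
import math
import random
from statistics import pvariance


def median_estimator_variance(mu, sigma, n):
    """Formula from the slide: (e^{sigma^2/n} - 1) * e^{2*mu + sigma^2/n}."""
    v = sigma ** 2 / n
    return (math.exp(v) - 1.0) * math.exp(2.0 * mu + v)


def simulate_median_estimates(mu, sigma, n, replications=5000, seed=0):
    """Simulate exp(mean of ln X_i) for log-normal samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        logs = [rng.gauss(mu, sigma) for _ in range(n)]  # ln X_i ~ N(mu, sigma^2)
        estimates.append(math.exp(sum(logs) / n))
    return estimates


estimates = simulate_median_estimates(mu=0.0, sigma=1.0, n=20)
print("simulated variance:", pvariance(estimates))
print("formula:           ", median_estimator_variance(0.0, 1.0, 20))
```

The two printed values should agree to within Monte Carlo error, since the mean of the logs is N(μ, σ²/n) and its exponential is again log-normal.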

32-34 Bootstrap-t Confidence Intervals: The Bootstrap Method. [Figure, built up over three slides: histogram of $R_n = (\hat{\theta}_n - \theta)/\sigma_n$, with its CLT approximation overlaid.]

35-36 Bootstrap-t Confidence Intervals: The Bootstrap Method. [Figure, built up over two slides: histogram of $R_n = (\hat{\theta}_n - \theta)/\hat{\sigma}_n$, with the CLT approximation overlaid.]

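37 Bootstrap-t Confidence Intervals: The Bootstrap Method. 95% CI for $e^{\mu}$: $\left(\hat{\theta} - k_1\,SE(\hat{\theta}),\ \hat{\theta} + k_2\,SE(\hat{\theta})\right)$. [Figure: histogram of the resampled pivot with central area $1-\alpha = 0.95$ and area $\alpha/2$ in each tail. Surviving annotations: "CI: (-1.22, 1.15)" and "4.42"; the numerical values of $\hat{\theta}$, $SE(\hat{\theta})$, k1, k2 and α/2 are missing from the transcript.]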

38 Kernel Estimators Kernel Estimators 38

39 Kernel Estimators: Kernel Estimators Algorithm (kernel density estimate). Let X1, X2, X3, ..., Xn be i.i.d. (independent & identically distributed) random variables with a density function f. A kernel density estimator (of the density f) is $\hat{f}_h(x) = \dfrac{1}{n}\sum_{i=1}^{n} K_h(x - x_i) = \dfrac{1}{nh}\sum_{i=1}^{n} K\left(\dfrac{x - x_i}{h}\right)$, where K is a kernel and h is a smoothing parameter (the bandwidth). A common choice of the kernel is the Gaussian density. Selecting the bandwidth h is a non-trivial task.
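
The estimator above translates almost directly into code. This is a minimal sketch with a Gaussian kernel; the function names, the five data points and the bandwidth h = 0.5 are illustrative assumptions.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, data, h):
    """f_hat_h(x) = 1 / (n * h) * sum_i K((x - x_i) / h)."""
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)


# Hypothetical data and bandwidth.
data = [-1.2, -0.4, 0.1, 0.3, 1.5]
for x in (-1.0, 0.0, 1.0):
    print(x, kde(x, data, h=0.5))
```

Because each term is a density scaled by 1/n, the estimate is itself a density: it is non-negative and integrates to 1 for any h > 0.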

40 Kernel Estimators: Kernel Estimators Algorithm. Selection of the bandwidth h based on L2 optimality: use the h that minimizes the mean integrated squared error $\mathrm{MISE}(h) = E\displaystyle\int_{-\infty}^{\infty} \left(\hat{f}_h(x) - f(x)\right)^2 dx$. Sometimes h is changed adaptively.
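
When the true density f is known, as in a simulation study, MISE(h) can be approximated by averaging the integrated squared error over repeated samples. The sketch below assumes f is the standard normal density, uses a Gaussian kernel, and replaces the integral by a Riemann sum on [-5, 5]; the function names, sample size, grid and candidate bandwidths are all illustrative assumptions. In practice f is unknown, so MISE has to be estimated by other means.

```python
import math
import random


def gaussian_pdf(x, mean=0.0, sd=1.0):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def kde(x, data, h):
    """Gaussian-kernel density estimate at x with bandwidth h."""
    return sum(gaussian_pdf((x - xi) / h) for xi in data) / (len(data) * h)


def estimate_mise(h, n=100, replications=10, seed=0):
    """Monte Carlo approximation of MISE(h) for samples from N(0, 1)."""
    rng = random.Random(seed)
    step = 0.0625
    grid = [-5.0 + step * i for i in range(161)]  # grid covering [-5, 5]
    total = 0.0
    for _ in range(replications):
        data = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Riemann-sum approximation of the integrated squared error.
        ise = sum((kde(x, data, h) - gaussian_pdf(x)) ** 2 for x in grid) * step
        total += ise
    return total / replications


for h in (0.05, 0.4, 3.0):
    print("h =", h, "estimated MISE =", estimate_mise(h))
```

A very small h gives a spiky, high-variance estimate and a very large h oversmooths, so the estimated MISE is expected to be smallest for the intermediate bandwidth.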

41 Kernel Estimators. [Figure: histogram of x and a kernel estimate, N = 3500; the bandwidth value is missing from the transcript.]

42 Kernel Estimators. [Figure: two kernel estimates, N = 3500; the bandwidth values are missing from the transcript.]

43 Kernel Estimators. [Figure: three kernel estimates, N = 3, with bandwidths 0.1, 2, and a third value missing from the transcript.]

Markovské řetězce se spojitým parametrem

Markovské řetězce se spojitým parametrem Markovské řetězce se spojitým parametrem Mgr. Rudolf B. Blažek, Ph.D. prof. RNDr. Roman Kotecký, DrSc. Katedra počítačových systémů Katedra teoretické informatiky Fakulta informačních technologií České

More information

Základy teorie front II

Základy teorie front II Základy teorie front II Aplikace Poissonova procesu v teorii front Mgr. Rudolf B. Blažek, Ph.D. prof. RNDr. Roman Kotecký, DrSc. Katedra počítačových systémů Katedra teoretické informatiky Fakulta informačních

More information

Statistika pro informatiku

Statistika pro informatiku Statistika pro informatiku prof. RNDr. Roman Kotecký DrSc., Dr. Rudolf Blažek, PhD Katedra teoretické informatiky FIT České vysoké učení technické v Praze MI-SPI, ZS 2011/12, Přednáška 5 Evropský sociální

More information

Základy teorie front

Základy teorie front Základy teorie front Mgr. Rudolf B. Blažek, Ph.D. prof. RNDr. Roman Kotecký, DrSc. Katedra počítačových systémů Katedra teoretické informatiky Fakulta informačních technologií České vysoké učení technické

More information

Statistika pro informatiku

Statistika pro informatiku Statistika pro informatiku prof. RNDr. Roman Kotecký DrSc., Dr. Rudolf Blažek, PhD Katedra teoretické informatiky FIT České vysoké učení technické v Praze MI-SPI, ZS 2011/12, Přednáška 2 Evropský sociální

More information

Statistika pro informatiku

Statistika pro informatiku Statistika pro informatiku prof. RNDr. Roman Kotecký DrSc., Dr. Rudolf Blažek, PhD Katedra teoretické informatiky FIT České vysoké učení technické v Praze MI-SPI, ZS 2011/12, Přednáška 1 Evropský sociální

More information

Quantum computing. Jan Černý, FIT, Czech Technical University in Prague. České vysoké učení technické v Praze. Fakulta informačních technologií

Quantum computing. Jan Černý, FIT, Czech Technical University in Prague. České vysoké učení technické v Praze. Fakulta informačních technologií České vysoké učení technické v Praze Fakulta informačních technologií Katedra teoretické informatiky Evropský sociální fond Praha & EU: Investujeme do vaší budoucnosti MI-MVI Methods of Computational Intelligence(2010/2011)

More information

Cole s MergeSort. prof. Ing. Pavel Tvrdík CSc. Fakulta informačních technologií České vysoké učení technické v Praze c Pavel Tvrdík, 2010

Cole s MergeSort. prof. Ing. Pavel Tvrdík CSc. Fakulta informačních technologií České vysoké učení technické v Praze c Pavel Tvrdík, 2010 Cole s MergeSort prof. Ing. Pavel Tvrdík CSc. Katedra počítačových systémů Fakulta informačních technologií České vysoké učení technické v Praze c Pavel Tvrdík, 2010 Pokročilé paralelní algoritmy (PI-PPA)

More information

Bootstrap, Jackknife and other resampling methods

Bootstrap, Jackknife and other resampling methods Bootstrap, Jackknife and other resampling methods Part III: Parametric Bootstrap Rozenn Dahyot Room 128, Department of Statistics Trinity College Dublin, Ireland dahyot@mee.tcd.ie 2005 R. Dahyot (TCD)

More information

Computational intelligence methods

Computational intelligence methods Computational intelligence methods GA, schemas, diversity Pavel Kordík, Martin Šlapák Katedra teoretické informatiky FIT České vysoké učení technické v Praze MI-MVI, ZS 2011/12, Lect. 5 https://edux.fit.cvut.cz/courses/mi-mvi/

More information

Branch-and-Bound Algorithm. Pattern Recognition XI. Michal Haindl. Outline

Branch-and-Bound Algorithm. Pattern Recognition XI. Michal Haindl. Outline Branch-and-Bound Algorithm assumption - can be used if a feature selection criterion satisfies the monotonicity property monotonicity property - for nested feature sets X j related X 1 X 2... X l the criterion

More information

Chapter 2: Resampling Maarten Jansen

Chapter 2: Resampling Maarten Jansen Chapter 2: Resampling Maarten Jansen Randomization tests Randomized experiment random assignment of sample subjects to groups Example: medical experiment with control group n 1 subjects for true medicine,

More information

Computational Intelligence Methods

Computational Intelligence Methods Computational Intelligence Methods Ant Colony Optimization, Partical Swarm Optimization Pavel Kordík, Martin Šlapák Katedra teoretické informatiky FIT České vysoké učení technické v Praze MI-MVI, ZS 2011/12,

More information

Confidence Intervals Unknown σ

Confidence Intervals Unknown σ Confidence Intervals Unknown σ Estimate σ Student s t-distribution Step-by-step instructions Example Confidence Intervals - Known σ Standard normal distribution aaad1hicbvjbtxqxfc6sf1xvoi/gzojcgrohm4neetahyqipxlbzmaer6xtozjz0kmnk5cmt8zxn33vf+ap8qf44h/xdedcbzp0eubr951lz8kqkwobx7+nplvnzl+4ohopffnk1wvxz+duvk/1yhdocs1+zcxgqrq0lpcsvhqgwbljmer10p91ufwnrcq017ueg/ziusa8gzreijo5zj6jxfcxe38tuznxgpblz0kiojm7a/xvz+9vpxs7c9m/aa75qarluwr1vzkle07zqzgenybjmqogn9lbwyjqvgjdd81aftoiuiojwba4fyaubxiwnlxr+uwrfpktlhkiqzcqcw9ar7o3jud0jviwukbougdccm4n4kbnglb5dmyry47osmcrpbjogtpo+oxigdtf1ek+nkibrp/giujpa4e4gtskuxe0tpibaemohx3rchh47x57pkjjhcqhavyozu7ssr0nbtrsi1nl5gm47m5c0al5ciff+9pxd/yrtjlbabeonyhkshf/gwdtmrajtbvdpaztj4cuqxipgrdb3mfstkso1kpqamtf6z7cxxt9//7htq6tiqitckcjcndkmnvxxgrybn69ffbeuzrd7qyp0m66sulpohygqp3jlsfixeijndhxxdhkvnnexptpvfhigzzhauzwmo4jwkng55cnkktxu9dgl1kx6tdnzekmm1q6rosrjoqhwsppyqbpeu4m+ua+kx+trzzvfw59oarotx1pbpkj1fr6fqtsik=

More information

Confidence intervals for kernel density estimation

Confidence intervals for kernel density estimation Stata User Group - 9th UK meeting - 19/20 May 2003 Confidence intervals for kernel density estimation Carlo Fiorio c.fiorio@lse.ac.uk London School of Economics and STICERD Stata User Group - 9th UK meeting

More information

The Nonparametric Bootstrap

The Nonparametric Bootstrap The Nonparametric Bootstrap The nonparametric bootstrap may involve inferences about a parameter, but we use a nonparametric procedure in approximating the parametric distribution using the ECDF. We use

More information

STAT 830 Non-parametric Inference Basics

STAT 830 Non-parametric Inference Basics STAT 830 Non-parametric Inference Basics Richard Lockhart Simon Fraser University STAT 801=830 Fall 2012 Richard Lockhart (Simon Fraser University)STAT 830 Non-parametric Inference Basics STAT 801=830

More information

Some Assorted Formulae. Some confidence intervals: σ n. x ± z α/2. x ± t n 1;α/2 n. ˆp(1 ˆp) ˆp ± z α/2 n. χ 2 n 1;1 α/2. n 1;α/2

Some Assorted Formulae. Some confidence intervals: σ n. x ± z α/2. x ± t n 1;α/2 n. ˆp(1 ˆp) ˆp ± z α/2 n. χ 2 n 1;1 α/2. n 1;α/2 STA 248 H1S MIDTERM TEST February 26, 2008 SURNAME: SOLUTIONS GIVEN NAME: STUDENT NUMBER: INSTRUCTIONS: Time: 1 hour and 50 minutes Aids allowed: calculator Tables of the standard normal, t and chi-square

More information

Analytical Bootstrap Methods for Censored Data

Analytical Bootstrap Methods for Censored Data JOURNAL OF APPLIED MATHEMATICS AND DECISION SCIENCES, 6(2, 129 141 Copyright c 2002, Lawrence Erlbaum Associates, Inc. Analytical Bootstrap Methods for Censored Data ALAN D. HUTSON Division of Biostatistics,

More information

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1 Estimation Theory Overview Properties Bias, Variance, and Mean Square Error Cramér-Rao lower bound Maximum likelihood Consistency Confidence intervals Properties of the mean estimator Properties of the

More information

Lecture 12: Small Sample Intervals Based on a Normal Population Distribution

Lecture 12: Small Sample Intervals Based on a Normal Population Distribution Lecture 12: Small Sample Intervals Based on a Normal Population MSU-STT-351-Sum-17B (P. Vellaisamy: MSU-STT-351-Sum-17B) Probability & Statistics for Engineers 1 / 24 In this lecture, we will discuss (i)

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Pivotal Quantities. Mathematics 47: Lecture 16. Dan Sloughter. Furman University. March 30, 2006

Pivotal Quantities. Mathematics 47: Lecture 16. Dan Sloughter. Furman University. March 30, 2006 Pivotal Quantities Mathematics 47: Lecture 16 Dan Sloughter Furman University March 30, 2006 Dan Sloughter (Furman University) Pivotal Quantities March 30, 2006 1 / 10 Pivotal quantities Definition Suppose

More information

One-Sample Numerical Data

One-Sample Numerical Data One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

Importance sampling in scenario generation

Importance sampling in scenario generation Importance sampling in scenario generation Václav Kozmík Faculty of Mathematics and Physics Charles University in Prague September 14, 2013 Introduction Monte Carlo techniques have received significant

More information

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008 MIT OpenCourseWare http://ocw.mit.edu 2.830J / 6.780J / ESD.63J Control of Processes (SMA 6303) Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Set Theory. Pattern Recognition III. Michal Haindl. Set Operations. Outline

Set Theory. Pattern Recognition III. Michal Haindl. Set Operations. Outline Set Theory A, B sets e.g. A = {ζ 1,...,ζ n } A = { c x y d} S space (universe) A,B S Outline Pattern Recognition III Michal Haindl Faculty of Information Technology, KTI Czech Technical University in Prague

More information

MVE055/MSG Lecture 8

MVE055/MSG Lecture 8 MVE055/MSG810 2017 Lecture 8 Petter Mostad Chalmers September 23, 2017 The Central Limit Theorem (CLT) Assume X 1,..., X n is a random sample from a distribution with expectation µ and variance σ 2. Then,

More information

MI-RUB Testing Lecture 10

MI-RUB Testing Lecture 10 MI-RUB Testing Lecture 10 Pavel Strnad pavel.strnad@fel.cvut.cz Dept. of Computer Science, FEE CTU Prague, Karlovo nám. 13, 121 35 Praha, Czech Republic MI-RUB, WS 2011/12 Evropský sociální fond Praha

More information

MI-RUB Testing II Lecture 11

MI-RUB Testing II Lecture 11 MI-RUB Testing II Lecture 11 Pavel Strnad pavel.strnad@fel.cvut.cz Dept. of Computer Science, FEE CTU Prague, Karlovo nám. 13, 121 35 Praha, Czech Republic MI-RUB, WS 2011/12 Evropský sociální fond Praha

More information

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap Patrick Breheny December 6 Patrick Breheny BST 764: Applied Statistical Modeling 1/21 The empirical distribution function Suppose X F, where F (x) = Pr(X x) is a distribution function, and we wish to estimate

More information

Feature Selection. Pattern Recognition X. Michal Haindl. Feature Selection. Outline

Feature Selection. Pattern Recognition X. Michal Haindl. Feature Selection. Outline Feature election Outline Pattern Recognition X motivation technical recognition problem dimensionality reduction ց class separability increase ր data compression (e.g. required communication channel capacity)

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Hypothesis Testing with the Bootstrap. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Hypothesis Testing with the Bootstrap. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods Hypothesis Testing with the Bootstrap Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods Bootstrap Hypothesis Testing A bootstrap hypothesis test starts with a test statistic

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Comparing Systems Using Sample Data

Comparing Systems Using Sample Data Comparing Systems Using Sample Data Dr. John Mellor-Crummey Department of Computer Science Rice University johnmc@cs.rice.edu COMP 528 Lecture 8 10 February 2005 Goals for Today Understand Population and

More information

Statistics. Statistics

Statistics. Statistics The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,

More information

Lecture 13: Subsampling vs Bootstrap. Dimitris N. Politis, Joseph P. Romano, Michael Wolf

Lecture 13: Subsampling vs Bootstrap. Dimitris N. Politis, Joseph P. Romano, Michael Wolf Lecture 13: 2011 Bootstrap ) R n x n, θ P)) = τ n ˆθn θ P) Example: ˆθn = X n, τ n = n, θ = EX = µ P) ˆθ = min X n, τ n = n, θ P) = sup{x : F x) 0} ) Define: J n P), the distribution of τ n ˆθ n θ P) under

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

More information

Nonparametric Methods II

Nonparametric Methods II Nonparametric Methods II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm 1 PART 3: Statistical Inference by

More information

Distributed Estimation, Information Loss and Exponential Families. Qiang Liu Department of Computer Science Dartmouth College

Distributed Estimation, Information Loss and Exponential Families. Qiang Liu Department of Computer Science Dartmouth College Distributed Estimation, Information Loss and Exponential Families Qiang Liu Department of Computer Science Dartmouth College Statistical Learning / Estimation Learning generative models from data Topic

More information

Post-exam 2 practice questions 18.05, Spring 2014

Post-exam 2 practice questions 18.05, Spring 2014 Post-exam 2 practice questions 18.05, Spring 2014 Note: This is a set of practice problems for the material that came after exam 2. In preparing for the final you should use the previous review materials,

More information

Contents 1. Contents

Contents 1. Contents Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample

More information

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods Permutation Tests Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods The Two-Sample Problem We observe two independent random samples: F z = z 1, z 2,, z n independently of

More information

Confidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods

Confidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods Chapter 4 Confidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods 4.1 Introduction It is now explicable that ridge regression estimator (here we take ordinary ridge estimator (ORE)

More information

Interval Estimation III: Fisher's Information & Bootstrapping

Interval Estimation III: Fisher's Information & Bootstrapping Interval Estimation III: Fisher's Information & Bootstrapping Frequentist Confidence Interval Will consider four approaches to estimating confidence interval Standard Error (+/- 1.96 se) Likelihood Profile

More information

Better Bootstrap Confidence Intervals

Better Bootstrap Confidence Intervals by Bradley Efron University of Washington, Department of Statistics April 12, 2012 An example Suppose we wish to make inference on some parameter θ T (F ) (e.g. θ = E F X ), based on data We might suppose

More information

Resampling and the Bootstrap

Resampling and the Bootstrap Resampling and the Bootstrap Axel Benner Biostatistics, German Cancer Research Center INF 280, D-69120 Heidelberg benner@dkfz.de Resampling and the Bootstrap 2 Topics Estimation and Statistical Testing

More information

4 Resampling Methods: The Bootstrap

4 Resampling Methods: The Bootstrap 4 Resampling Methods: The Bootstrap Situation: Let x 1, x 2,..., x n be a SRS of size n taken from a distribution that is unknown. Let θ be a parameter of interest associated with this distribution and

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable Distributions of Functions of Random Variables 5.1 Functions of One Random Variable 5.2 Transformations of Two Random Variables 5.3 Several Random Variables 5.4 The Moment-Generating Function Technique

More information

Quantitative Economics for the Evaluation of the European Policy. Dipartimento di Economia e Management

Quantitative Economics for the Evaluation of the European Policy. Dipartimento di Economia e Management Quantitative Economics for the Evaluation of the European Policy Dipartimento di Economia e Management Irene Brunetti 1 Davide Fiaschi 2 Angela Parenti 3 9 ottobre 2015 1 ireneb@ec.unipi.it. 2 davide.fiaschi@unipi.it.

More information

Confidence intervals for parameters of normal distribution.

Confidence intervals for parameters of normal distribution. Lecture 5 Confidence intervals for parameters of normal distribution. Let us consider a Matlab example based on the dataset of body temperature measurements of 30 individuals from the article []. The dataset

More information

Lecture 8. October 22, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Lecture 8. October 22, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University. Lecture 8 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University October 22, 2007 1 2 3 4 5 6 1 Define convergent series 2 Define the Law of Large Numbers

More information

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants 18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009

More information

NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION

NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION R. HASHEMI, S. REZAEI AND L. AMIRI Department of Statistics, Faculty of Science, Razi University, 67149, Kermanshah, Iran. ABSTRACT

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables

UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables To be provided to students with STAT2201 or CIVIL-2530 (Probability and Statistics) Exam Main exam date: Tuesday, 20 June 1

More information

Econ 582 Nonparametric Regression

Econ 582 Nonparametric Regression Econ 582 Nonparametric Regression Eric Zivot May 28, 2013 Nonparametric Regression Sofarwehaveonlyconsideredlinearregressionmodels = x 0 β + [ x ]=0 [ x = x] =x 0 β = [ x = x] [ x = x] x = β The assume

More information

A Resampling Method on Pivotal Estimating Functions

A Resampling Method on Pivotal Estimating Functions A Resampling Method on Pivotal Estimating Functions Kun Nie Biostat 277,Winter 2004 March 17, 2004 Outline Introduction A General Resampling Method Examples - Quantile Regression -Rank Regression -Simulation

More information

Bootstrap & Confidence/Prediction intervals

Olivier Roustant, Mines Saint-Étienne, 2017/11. Framework: Consider a model with

Non-parametric Inference and Resampling

Exercises by David Wozabal (Last update: June 010). 1 Basic Facts about Rank and Order Statistics. 1.1 10 students were asked about the amount of time they spend surfing

Spring 2012 Math 541B Exam 1

1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N − m are black; the balls are otherwise indistinguishable. Let X denote

Bootstrap Confidence Intervals

Patrick Breheny, September 18. STA 621: Nonparametric Statistics. Introduction: Bootstrap confidence intervals. So far, we have discussed the idea behind

Inference on distributions and quantiles using a finite-sample Dirichlet process

David M. Kaplan (University of Missouri), Matt Goldman (UC San Diego). Midwest Econometrics

Bootstrap. G. Jogesh Babu, Director of Center for Astrostatistics, Penn State University

G. Jogesh Babu, Penn State University, http://www.stat.psu.edu/ babu. Director of Center for Astrostatistics, http://astrostatistics.psu.edu. Outline: 1 Motivation 2 Simple statistical problem 3 Resampling

Estimating a population mean

Introductory Statistics Lectures. Confidence intervals for means. Department of Mathematics, Pima Community College. Redistribution of this material is prohibited without written

1 Statistical inference for a population mean

1. Inference for a large sample, known variance. Suppose X_1, ..., X_n represents a large random sample of data from a population with unknown mean µ and known

2WB05 Simulation Lecture 7: Output analysis

Marko Boon, http://www.win.tue.nl/courses/2wb05, December 17, 2012. Outline: Output analysis of a simulation; Confidence intervals; Warm-up interval; Common random

Binary Decision Diagrams

Logic Circuits Design Seminars WS2010/2011, Lecture 2. Ing. Petr Fišer, Ph.D., Department of Digital Design, Faculty of Information Technology, Czech Technical University in Prague

MIT Spring 2015

MIT 18.443, Dr. Kempthorne, Spring 2015. Batches of data: single or multiple: x_1, x_2, ..., x_n; y_1, y_2, ..., y_m; w_1, w_2, ..., w_l; etc. Graphical displays. Summary statistics:

CS 147: Computer Systems Performance Analysis

Summarizing Variability and Determining Distributions

12 - Nonparametric Density Estimation

ST 697, Fall 2017, University of Alabama. Density Review. Continuous Random Variables. [Figure: plot of F(x)]

ST 371 (IX): Theories of Sampling Distributions

1 Sample, Population, Parameter and Statistic. The major use of inferential statistics is to use information from a sample to infer characteristics about

Practice Problems Section Problems

Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

Chapter 11. Output Analysis for a Single Model Prof. Dr. Mesut Güneş Ch. 11 Output Analysis for a Single Model

Contents: Types of Simulation; Stochastic Nature of Output Data; Measures of Performance; Output Analysis for Terminating Simulations; Output Analysis for Steady-state

Notation. Pattern Recognition II. Michal Haindl. Outline - PR Basic Concepts. Pattern Recognition Notions

Notation: S pattern space; X feature vector, X = [x_1, ..., x_l]; l = dim{x} number of features; X feature space; K number of classes; ω_i class indicator; Ω = {ω_1, ..., ω_K}; g(x) discriminant function; H decision

Cramér-Type Moderate Deviation Theorems for Two-Sample Studentized (Self-normalized) U-Statistics. Wen-Xin Zhou

Wen-Xin Zhou, Department of Mathematics and Statistics, University of Melbourne. Joint work with Prof. Qi-Man

Accuracy & confidence

Most of course so far: estimating stuff from data. Today: how much do we trust our estimates? Last week: one answer to this question: prove ahead of time that training set estimate

Sensitivity Analysis with Correlated Variables

st Workshop on Nonlinear Analysis of Shell Structures. INTALES GmbH Engineering Solutions. University of Innsbruck, Faculty of Civil Engineering. University

Space Telescope Science Institute statistics mini-course. October Inference I: Estimation, Confidence Intervals, and Tests of Hypotheses

Space Telescope Science Institute statistics mini-course, October 2011. James L Rosenberger. Acknowledgements: Donald Richards, William

IEOR E4703: Monte-Carlo Simulation

Output Analysis for Monte-Carlo. Martin Haugh, Department of Industrial Engineering and Operations Research, Columbia University. Email: martin.b.haugh@gmail.com. Output Analysis

Introduction to Self-normalized Limit Theory

Qi-Man Shao, The Chinese University of Hong Kong. E-mail: qmshao@cuhk.edu.hk. Outline: What is the self-normalization? Why? Classical limit theorems. Self-normalized

Percentage point zα/2

Chapter 8: Statistical Intervals. Why? A point estimate is not reliable under resampling. Interval Estimates: Bounds that represent an interval of plausible values for a parameter. There are three types of

Business Statistics: A Decision-Making Approach 6th Edition. Chapter Goals

Chapter 6: Introduction to Sampling Distributions. Chapter Goals: To use information from the sample

Tutorial on Markov Chain Monte Carlo Simulations and Their Statistical Analysis (in Fortran)

Bernd Berg, Singapore MCMC Meeting, March 2004. Overview: 1. Lecture I/II: Statistics as Needed. 2. Lecture II:

Math 475. Jimin Ding. August 29, 2013. Department of Mathematics, Washington University in St. Louis

Department of Mathematics, Washington University in St. Louis. www.math.wustl.edu/ jmding/math475/index.html. August 29, 2013

A note on multiple imputation for general purpose estimation

Shu Yang, Jae Kwang Kim. SSC meeting, June 16, 2015. Introduction: Basic Setup. Assume

Introduction to Probability and Statistics (Continued)

Prof. Nicholas Zabaras, Center for Informatics and Computational Science, https://cics.nd.edu/, University of Notre Dame, Notre Dame, Indiana, USA. Email:

36. Multisample U-statistics and jointly distributed U-statistics Lehmann 6.1

In this topic, we generalize the idea of U-statistics in two different directions. First, we consider single U-statistics for situations

Using R in Undergraduate and Graduate Probability and Mathematical Statistics Courses*

Amy G. Froelich, Michael D. Larsen, Iowa State University. *The work presented in this talk was partially supported by

Ch. 7. One sample hypothesis tests for µ and σ

Prof. Tesler, Math 18, Winter 2019. Introduction: Data. Consider the SAT math

Outline. Confidence intervals More parametric tests More bootstrap and randomization tests. Cohen Empirical Methods CS650

Parameter Estimation: Collect a sample to estimate the value of a population parameter. Example: estimate mean age

The Analysis of Uncertainty of Climate Change by Means of SDSM Model Case Study: Kermanshah

World Applied Sciences Journal 23 (1): 1392-1398, 213 ISSN 1818-4952 IDOSI Publications, 213 DOI: 1.5829/idosi.wasj.213.23.1.3152

Point and Interval Estimation II Bios 662

Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2006-09-13 17:17. Nonparametric CI

Maximum Likelihood Large Sample Theory

MIT 18.443, Dr. Kempthorne, Spring 2015. Outline: 1 Large Sample Theory of Maximum Likelihood Estimates 2 Asymptotic Results: Overview; Asymptotic Framework; Data Model

L09. PARTICLE FILTERING. NA568 Mobile Robotics: Methods & Algorithms

Particle Filters: Different approach to state estimation. Instead of parametric description of state (and uncertainty), use a set of state

Resampling and the Bootstrap

Axel Benner, Biostatistics, German Cancer Research Center, INF 280, D-69120 Heidelberg, benner@dkfz.de. Topics: Estimation and Statistical Testing

10/8/2014. The Multivariate Gaussian Distribution. Time-Series Plot of Tetrode Recordings. Recordings. Tetrode

10/8/2014. 9.07 INTRODUCTION TO STATISTICS FOR BRAIN AND COGNITIVE SCIENCES, Lecture 4, Emery N. Brown. The Multivariate Gaussian Distribution. Case : Probability Model for Spike Sorting. The data are tetrode

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

Rebecca Barter, March 2, 2015. The Bootstrap: Suppose that we are interested in estimating a parameter θ from some population with members x_1, ...,