Dimension-free PAC-Bayesian bounds for the estimation of the mean of a random vector

Size: px
Start display at page:

Download "Dimension-free PAC-Bayesian bounds for the estimation of the mean of a random vector"

Transcription

1 Dimesio-free PAC-Bayesia bouds for the estimatio of the mea of a radom vector Olivier Catoi CREST CNRS UMR 9194 Uiversité Paris Saclay olivier.catoi@esae.fr Ilaria Giulii Laboratoire de Probabilités et Modèles Aléatoires Uiversité Paris Diderot giulii@math.uiv-paris-diderot.fr Abstract I this paper, we preset a ew estimator of the mea of a radom vector, computed by applyig some threshold fuctio to the orm. No asymptotic dimesio-free almost sub-gaussia bouds are proved uder weak momet assumptios, usig PAC-Bayesia iequalities. 1 Itroductio Estimatig the mea of a radom vector uder weak tail assumptios has attracted a lot of attetio recetly. A umber of properties have spurred the iterest for these ew results, where the empirical mea is replaced by a more robust estimator. Oe aspect is that it is possible to obtai a estimator with a sub-gaussia tail while assumig much weaker assumptios o the data, up to the fact of assumig oly the existece of a fiite covariace matrix. Aother appealig feature is that it is possible to obtai dimesio-free o asymptotic bouds that remai valid i a separable Hilbert space. Some importat refereces are Catoi [01] i the oe dimesioal case ad Misker [015] ad Lugosi ad Medelso [017] i the multidimesioal case. Buildig o the breakthrough of Misker [015], that uses a multidimesioal geeralizatio of the media of meas estimator, Joly et al. [017] ad Lugosi ad Medelso [017] propose successive improvemets of the media of meas approach to get a estimator with a geuie sub-gaussia dimesio-free tail boud, while still requirig oly the existece of the covariace matrix. I the mea time, the M-estimator approach of Catoi [01] has also bee geeralized to multidimesioal settigs through the use of matrix iequalities i Misker [016] ad Misker ad Wei [017]. Here we follow a differet route, based o a multidimesioal extesio of Catoi [01] usig PAC-Bayesia bouds. Our ew estimator is a simple modificatio of the empirical mea, where some threshold is applied to the orm of the sample vectors. Therefore, it is straightforward to compute, ad this is a strog poit of our approach, compared to others. Note also that we make here some compromise o the sharpess of the estimatio error boud, i order to simplify the defiitio ad computatio of the estimator. This compromise cosists i the presece of secod order terms, while the first order terms ca be made as close as desired to a true sub-gaussia boud with exact costats, as stated i Lugosi ad Medelso [017, eq. (1.1]. With a more ivolved estimator, a true sub-gaussia boud without secod order terms is possible ad will be described i a separate publicatio. Thresholdig the orm Cosider X R d, a radom vector, ad (X 1,..., X a sample made of idepedet copies of X. The questio is to estimate E(X from the sample, uder the assumptio that E ( X p <, for some p. 31st Coferece o Neural Iformatio Processig Systems (NIPS 017, Log Beach, CA, USA.

2 Cosider the threshold fuctio ψ(t = mi{t, 1}, t R +, ad for some positive real parameter λ to be chose later, itroduce the thresholded sample Y i = ψ( λ X i λ X i X i. Our estimator of m = E(X will simply be the thresholded empirical mea m = 1 Propositio.1 Itroduce the icreasig fuctios g 1 (t = 1 ( exp(t 1 ad g (t = (exp(t t t 1 t, t R, Y i. that are defied by cotiuity at t = 0 ad are such that g 1 (0 = g (0 = 1. Assume that E ( X < ad that we kow v such that sup θ S d E ( θ, X m v <, where S d = { θ R d, θ = 1 } is the uit sphere of R d. For some positive real parameter µ, put log(δ λ = µ, T = max { E ( X m, v }, av ( a = g (µ 1, b = exp(µg 1 µ av T log(δ With probability at least 1 δ, av log(δ m m + bt + if p 1 C p + if p/ p C p p/, where C p = 1 ( p ( p log(δ p/ sup E ( X p θ, X m, ad (µ av θ S d C p = 1 ( p ( p log(δ p/ E ( X p a log(δ m ( 1 + m. (µ av v Remarks 1. Note that i case E ( X < but E ( X p = for p >, we ca use the boud C 1 + C 1 log(δ (T + m + 8 log(δ µ a 7µ av E( X m ( ( a log(δ 1 log(δ (T + m 1 + m = O. v µ a Note also that if we take µ = 1/4 ad assume that δ exp(, the a 1. ad b 4. If moreover E ( X p+1 <, for some p > 1, we obtai with probability at least 1 δ that.4 v log(δ 4T m m + + C p + C p+1 p/, (p+1/ meaig that the tail distributio of m m has a sub-gaussia behavior, up to secod order terms. Remark that by takig µ small, we ca make a ad b as close as desired to 1, at the expese of the values of C p ad C p.

3 Proof The rest of the paper is devoted to the proof of Propositio.1. A elemetary computatio shows that the threshold fuctio ψ satisfies 0 1 ψ(t t p ( p p if, t R +, (1 t p 1 where o iteger values of the expoet p are allowed. Let Y = ψ( λ X X ad m = E(Y. We λ X ca decompose the estimatio error i directio θ ito θ, m m = θ, m m + 1 θ, Y i m, θ R d. ( Itroduce α = ψ( λ X λ X ad let us deal with the first term first. As 0 1 α λp X p ( p p θ, m m = E [ (α 1 θ, X ] = E [ (α 1 θ, X m ] + E(α 1 θ, m λ p ( p p if E( X p λ p ( p p θ, X m + if E ( X p θ, m, p 1 ( p ( where r = max{0, r} is the egative part of iteger r. Let us ow look at the secod term of the decompositio (. To gai uiformity i θ, we will use a PAC-Bayesia iequality ad the family of ormal distributios ρ θ = N ( θ, β I d, bearig o the parameter θ R d, where I d R d d is the idetity matrix of size d d, ad where β is a positive parameter to be chose later o. We will use the followig PAC-Bayesia iequality without recallig its proof, that is a simple cosequece of Catoi [004, eq. (5..1 page 159]: Lemma. For ay bouded measurable fuctio f : R d R d R, for ay probability measure π M+( 1 R d, for ay δ ]0, 1[, with probability at least 1 δ, for ay probability measure ρ M 1 +(R d, 1 f ( θ, X i dρ(θ [ ( log E exp ( f(θ, X ] dρ(θ + K(ρ, π + log(δ { ( log ρ/π dρ, whe ρ π, where K is the Kullback-Liebler divergece K(ρ, π = +, otherwise. Remarkig that 1 θ, Y i m = 1 θ, Y i m dρ θ (θ, usig π = ρ 0, ad takig ito accout the fact that K(ρ θ, ρ 0 = β θ /, we obtai as a cosequece of the previous lemma that with probability at least 1 δ, for ay θ S d, 1 θ, Y i m 1 ( ( log E exp µλ θ, Y m dρ θ (θ + β µλ µλ + log(δ µλ. I our settig f is ot bouded i θ, but the required extesio is valid as explaied i Catoi [004]. Sice the logarithm is cocave, ( ( log E exp µλ θ, Y m [ ( ( ] dρ θ (θ log E exp µλ θ, Y m dρ θ (θ [ ( = log E exp (µλ θ, Y m + µ λ Y m ], β where we have used the explicit expressio of the Laplace trasform of a Gaussia distributio., 3

4 To go further, remidig as a source of ispiratio the proof of Beett s iequality, let us itroduce the icreasig fuctios g 1 ad g defied i Propositio.1. These fuctios will be used to boud the expoetial fuctio by polyomials. More precisely, we will exploit the fact that whe t b, exp(t 1 + t + g (bt / ad exp(t 1 + g 1 (bt. From this, it results that if t b ad u c, exp(t + u exp(t ( 1 + g 1 (cu exp(t + g 1 (c exp(bu 1 + t + g (bt / + g 1 (c exp(bu. Legitimate values for b ad c will be deduced from the remark that λ Y 1, implyig λ m 1. Namely, i our cotext, we will use b = µ ad c = µ /β. These argumets put together lead to the iequality ( E exp (µλ θ, Y m + µ λ Y m β 1 + g (µ µ λ Replacig i the previous iequalities, we obtai E( θ, Y m + exp(µg 1 ( µ µ λ β β E ( Y m. Lemma.3 With probability at least 1 δ, for ay θ S d, θ, m m = 1 θ, Y i m g (µ µλ E( θ, Y m ( µ µλ + exp(µg 1 β β E( Y m + β + log(δ. µλ Remark that θ, Y m = θ, αx m = ( α θ, X m (1 α θ, m α θ, X m + (1 α θ, m θ, X m + (1 α θ, m. Therefore, usig iequality (1 ad the defiitio of α, E ( θ, Y m E ( θ, Y m E ( θ, X m + θ, m λ p ( p p if E ( X p. p Remark also that Y = g(x, where g is a cotractio (beig the projectio o a ball. Cosequetly E ( Y m = 1 E( Y 1 Y 1 E( X 1 X = E ( X m. I view of these remarks, the previous lemma traslates to ( ( µ Lemma.4 Let a = g µ ad b exp(µg1. β With probability at least 1 δ, for ay θ S d, θ, m m aµλ E( θ, X m + bµλ β E( X m + β + log(δ µλ + if p 1 λ p + if p ( p λ p p E ( X p θ, X m ( p p E ( X p( θ, m + aµλ θ, m. Propositio.1 follows by takig b as metioed there, λ = 1 log(δ, ad β = µ av bt log(δ T log(δ, so that the coditio o b is satisfied. av av 4

5 Refereces O. Catoi. Statistical Learig Theory ad Stochastic Optimizatio, Lectures o Probability Theory ad Statistics, École d Été de Probabilités de Sait-Flour XXXI 001, volume 1851 of Lecture Notes i Mathematics. Spriger, 004. pages O. Catoi. Challegig the empirical mea ad empirical variace: a deviatio study. A. Ist. Heri Poicaré, 48(4: , 01. E. Joly, G. Lugosi, ad R. I. Oliveira. O the estimatio of the mea of a radom vector. Electroic Joural of Statistics, 11: , 017. G. Lugosi ad S. Medelso. Sub-gaussia estimators of the mea of a radom vector. Aals of Statistics, to appear, 017. S. Misker. Geometric media ad robust estimatio i Baach spaces. Beroulli, 4: , 015. S. Misker. Sub-Gaussia estimators of the mea of a radom matrix with heavy-tailed etries. Aals of Statistics, to appear, 016. S. Misker ad X. Wei. Estimatio of the covariace structure of heavy-tailed distributios. I NIPS 017, to appear,

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

A survey on penalized empirical risk minimization Sara A. van de Geer

A survey on penalized empirical risk minimization Sara A. van de Geer A survey o pealized empirical risk miimizatio Sara A. va de Geer We address the questio how to choose the pealty i empirical risk miimizatio. Roughly speakig, this pealty should be a good boud for the

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Chapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities

Chapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities Chapter 5 Iequalities 5.1 The Markov ad Chebyshev iequalities As you have probably see o today s frot page: every perso i the upper teth percetile ears at least 1 times more tha the average salary. I other

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 3 9//203 Large deviatios Theory. Cramér s Theorem Cotet.. Cramér s Theorem. 2. Rate fuctio ad properties. 3. Chage of measure techique.

More information

Exponential Families and Bayesian Inference

Exponential Families and Bayesian Inference Computer Visio Expoetial Families ad Bayesia Iferece Lecture Expoetial Families A expoetial family of distributios is a d-parameter family f(x; havig the followig form: f(x; = h(xe g(t T (x B(, (. where

More information

Asymptotic distribution of products of sums of independent random variables

Asymptotic distribution of products of sums of independent random variables Proc. Idia Acad. Sci. Math. Sci. Vol. 3, No., May 03, pp. 83 9. c Idia Academy of Scieces Asymptotic distributio of products of sums of idepedet radom variables YANLING WANG, SUXIA YAO ad HONGXIA DU ollege

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Law of the sum of Bernoulli random variables

Law of the sum of Bernoulli random variables Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible

More information

On the convergence rates of Gladyshev s Hurst index estimator

On the convergence rates of Gladyshev s Hurst index estimator Noliear Aalysis: Modellig ad Cotrol, 2010, Vol 15, No 4, 445 450 O the covergece rates of Gladyshev s Hurst idex estimator K Kubilius 1, D Melichov 2 1 Istitute of Mathematics ad Iformatics, Vilius Uiversity

More information

Rademacher Complexity

Rademacher Complexity EECS 598: Statistical Learig Theory, Witer 204 Topic 0 Rademacher Complexity Lecturer: Clayto Scott Scribe: Ya Deg, Kevi Moo Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for

More information

Lecture 3: August 31

Lecture 3: August 31 36-705: Itermediate Statistics Fall 018 Lecturer: Siva Balakrisha Lecture 3: August 31 This lecture will be mostly a summary of other useful expoetial tail bouds We will ot prove ay of these i lecture,

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

This section is optional.

This section is optional. 4 Momet Geeratig Fuctios* This sectio is optioal. The momet geeratig fuctio g : R R of a radom variable X is defied as g(t) = E[e tx ]. Propositio 1. We have g () (0) = E[X ] for = 1, 2,... Proof. Therefore

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

Precise Rates in Complete Moment Convergence for Negatively Associated Sequences

Precise Rates in Complete Moment Convergence for Negatively Associated Sequences Commuicatios of the Korea Statistical Society 29, Vol. 16, No. 5, 841 849 Precise Rates i Complete Momet Covergece for Negatively Associated Sequeces Dae-Hee Ryu 1,a a Departmet of Computer Sciece, ChugWoo

More information

The random version of Dvoretzky s theorem in l n

The random version of Dvoretzky s theorem in l n The radom versio of Dvoretzky s theorem i l Gideo Schechtma Abstract We show that with high probability a sectio of the l ball of dimesio k cε log c > 0 a uiversal costat) is ε close to a multiple of the

More information

Lecture 33: Bootstrap

Lecture 33: Bootstrap Lecture 33: ootstrap Motivatio To evaluate ad compare differet estimators, we eed cosistet estimators of variaces or asymptotic variaces of estimators. This is also importat for hypothesis testig ad cofidece

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Regression with quadratic loss

Regression with quadratic loss Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece

More information

Self-normalized deviation inequalities with application to t-statistic

Self-normalized deviation inequalities with application to t-statistic Self-ormalized deviatio iequalities with applicatio to t-statistic Xiequa Fa Ceter for Applied Mathematics, Tiaji Uiversity, 30007 Tiaji, Chia Abstract Let ξ i i 1 be a sequece of idepedet ad symmetric

More information

ON THE DELOCALIZED PHASE OF THE RANDOM PINNING MODEL

ON THE DELOCALIZED PHASE OF THE RANDOM PINNING MODEL O THE DELOCALIZED PHASE OF THE RADOM PIIG MODEL JEA-CHRISTOPHE MOURRAT Abstract. We cosider the model of a directed polymer pied to a lie of i.i.d. radom charges, ad focus o the iterior of the delocalized

More information

arxiv: v1 [math.pr] 4 Dec 2013

arxiv: v1 [math.pr] 4 Dec 2013 Squared-Norm Empirical Process i Baach Space arxiv:32005v [mathpr] 4 Dec 203 Vicet Q Vu Departmet of Statistics The Ohio State Uiversity Columbus, OH vqv@statosuedu Abstract Jig Lei Departmet of Statistics

More information

REGRESSION WITH QUADRATIC LOSS

REGRESSION WITH QUADRATIC LOSS REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d

More information

Erratum to: An empirical central limit theorem for intermittent maps

Erratum to: An empirical central limit theorem for intermittent maps Probab. Theory Relat. Fields (2013) 155:487 491 DOI 10.1007/s00440-011-0393-0 ERRATUM Erratum to: A empirical cetral limit theorem for itermittet maps J. Dedecker Published olie: 25 October 2011 Spriger-Verlag

More information

Empirical Process Theory and Oracle Inequalities

Empirical Process Theory and Oracle Inequalities Stat 928: Statistical Learig Theory Lecture: 10 Empirical Process Theory ad Oracle Iequalities Istructor: Sham Kakade 1 Risk vs Risk See Lecture 0 for a discussio o termiology. 2 The Uio Boud / Boferoi

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learig Theory: Lecture Notes Kamalika Chaudhuri October 4, 0 Cocetratio of Averages Cocetratio of measure is very useful i showig bouds o the errors of machie-learig algorithms. We will begi with a basic

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Notes 19 : Martingale CLT

Notes 19 : Martingale CLT Notes 9 : Martigale CLT Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: [Bil95, Chapter 35], [Roc, Chapter 3]. Sice we have ot ecoutered weak covergece i some time, we first recall

More information

Supplemental Material: Proofs

Supplemental Material: Proofs Proof to Theorem Supplemetal Material: Proofs Proof. Let be the miimal umber of traiig items to esure a uique solutio θ. First cosider the case. It happes if ad oly if θ ad Rak(A) d, which is a special

More information

Glivenko-Cantelli Classes

Glivenko-Cantelli Classes CS28B/Stat24B (Sprig 2008 Statistical Learig Theory Lecture: 4 Gliveko-Catelli Classes Lecturer: Peter Bartlett Scribe: Michelle Besi Itroductio This lecture will cover Gliveko-Catelli (GC classes ad itroduce

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

On the estimation of the mean of a random vector

On the estimation of the mean of a random vector O the estimatio of the mea of a radom vector Emilie Joly Uiversit Paris Ouest Naterre, Frace; emilie.joly@u-paris10.fr Gábor Lugosi ICREA ad Departmet of Ecoomics, Pompeu Fabra Uiversity, Barceloa, Spai;

More information

Lecture 19. sup y 1,..., yn B d n

Lecture 19. sup y 1,..., yn B d n STAT 06A: Polyomials of adom Variables Lecture date: Nov Lecture 19 Grothedieck s Iequality Scribe: Be Hough The scribes are based o a guest lecture by ya O Doell. I this lecture we prove Grothedieck s

More information

1 Review and Overview

1 Review and Overview DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,

More information

Lecture 12: September 27

Lecture 12: September 27 36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

Notes 5 : More on the a.s. convergence of sums

Notes 5 : More on the a.s. convergence of sums Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series

More information

Detailed proofs of Propositions 3.1 and 3.2

Detailed proofs of Propositions 3.1 and 3.2 Detailed proofs of Propositios 3. ad 3. Proof of Propositio 3. NB: itegratio sets are geerally omitted for itegrals defied over a uit hypercube [0, s with ay s d. We first give four lemmas. The proof of

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1

1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1 8. The cetral limit theorems 8.1. The cetral limit theorem for i.i.d. sequeces. ecall that C ( is N -separatig. Theorem 8.1. Let X 1, X,... be i.i.d. radom variables with EX 1 = ad EX 1 = σ (,. Suppose

More information

2.2. Central limit theorem.

2.2. Central limit theorem. 36.. Cetral limit theorem. The most ideal case of the CLT is that the radom variables are iid with fiite variace. Although it is a special case of the more geeral Lideberg-Feller CLT, it is most stadard

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

5 Birkhoff s Ergodic Theorem

5 Birkhoff s Ergodic Theorem 5 Birkhoff s Ergodic Theorem Amog the most useful of the various geeralizatios of KolmogorovâĂŹs strog law of large umbers are the ergodic theorems of Birkhoff ad Kigma, which exted the validity of the

More information

ON POINTWISE BINOMIAL APPROXIMATION

ON POINTWISE BINOMIAL APPROXIMATION Iteratioal Joural of Pure ad Applied Mathematics Volume 71 No. 1 2011, 57-66 ON POINTWISE BINOMIAL APPROXIMATION BY w-functions K. Teerapabolar 1, P. Wogkasem 2 Departmet of Mathematics Faculty of Sciece

More information

Central limit theorem and almost sure central limit theorem for the product of some partial sums

Central limit theorem and almost sure central limit theorem for the product of some partial sums Proc. Idia Acad. Sci. Math. Sci. Vol. 8, No. 2, May 2008, pp. 289 294. Prited i Idia Cetral it theorem ad almost sure cetral it theorem for the product of some partial sums YU MIAO College of Mathematics

More information

The log-behavior of n p(n) and n p(n)/n

The log-behavior of n p(n) and n p(n)/n Ramauja J. 44 017, 81-99 The log-behavior of p ad p/ William Y.C. Che 1 ad Ke Y. Zheg 1 Ceter for Applied Mathematics Tiaji Uiversity Tiaji 0007, P. R. Chia Ceter for Combiatorics, LPMC Nakai Uivercity

More information

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

THE KALMAN FILTER RAUL ROJAS

THE KALMAN FILTER RAUL ROJAS THE KALMAN FILTER RAUL ROJAS Abstract. This paper provides a getle itroductio to the Kalma filter, a umerical method that ca be used for sesor fusio or for calculatio of trajectories. First, we cosider

More information

LECTURE 8: ASYMPTOTICS I

LECTURE 8: ASYMPTOTICS I LECTURE 8: ASYMPTOTICS I We are iterested i the properties of estimators as. Cosider a sequece of radom variables {, X 1}. N. M. Kiefer, Corell Uiversity, Ecoomics 60 1 Defiitio: (Weak covergece) A sequece

More information

1 Convergence in Probability and the Weak Law of Large Numbers

1 Convergence in Probability and the Weak Law of Large Numbers 36-752 Advaced Probability Overview Sprig 2018 8. Covergece Cocepts: i Probability, i L p ad Almost Surely Istructor: Alessadro Rialdo Associated readig: Sec 2.4, 2.5, ad 4.11 of Ash ad Doléas-Dade; Sec

More information

Sieve Estimators: Consistency and Rates of Convergence

Sieve Estimators: Consistency and Rates of Convergence EECS 598: Statistical Learig Theory, Witer 2014 Topic 6 Sieve Estimators: Cosistecy ad Rates of Covergece Lecturer: Clayto Scott Scribe: Julia Katz-Samuels, Brado Oselio, Pi-Yu Che Disclaimer: These otes

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

On Weak and Strong Convergence Theorems for a Finite Family of Nonself I-asymptotically Nonexpansive Mappings

On Weak and Strong Convergence Theorems for a Finite Family of Nonself I-asymptotically Nonexpansive Mappings Mathematica Moravica Vol. 19-2 2015, 49 64 O Weak ad Strog Covergece Theorems for a Fiite Family of Noself I-asymptotically Noexpasive Mappigs Birol Güdüz ad Sezgi Akbulut Abstract. We prove the weak ad

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary

More information

Entropy and Ergodic Theory Lecture 5: Joint typicality and conditional AEP

Entropy and Ergodic Theory Lecture 5: Joint typicality and conditional AEP Etropy ad Ergodic Theory Lecture 5: Joit typicality ad coditioal AEP 1 Notatio: from RVs back to distributios Let (Ω, F, P) be a probability space, ad let X ad Y be A- ad B-valued discrete RVs, respectively.

More information

arxiv: v1 [math.pr] 13 Oct 2011

arxiv: v1 [math.pr] 13 Oct 2011 A tail iequality for quadratic forms of subgaussia radom vectors Daiel Hsu, Sham M. Kakade,, ad Tog Zhag 3 arxiv:0.84v math.pr] 3 Oct 0 Microsoft Research New Eglad Departmet of Statistics, Wharto School,

More information

EE 4TM4: Digital Communications II Probability Theory

EE 4TM4: Digital Communications II Probability Theory 1 EE 4TM4: Digital Commuicatios II Probability Theory I. RANDOM VARIABLES A radom variable is a real-valued fuctio defied o the sample space. Example: Suppose that our experimet cosists of tossig two fair

More information

( θ. sup θ Θ f X (x θ) = L. sup Pr (Λ (X) < c) = α. x : Λ (x) = sup θ H 0. sup θ Θ f X (x θ) = ) < c. NH : θ 1 = θ 2 against AH : θ 1 θ 2

( θ. sup θ Θ f X (x θ) = L. sup Pr (Λ (X) < c) = α. x : Λ (x) = sup θ H 0. sup θ Θ f X (x θ) = ) < c. NH : θ 1 = θ 2 against AH : θ 1 θ 2 82 CHAPTER 4. MAXIMUM IKEIHOOD ESTIMATION Defiitio: et X be a radom sample with joit p.m/d.f. f X x θ. The geeralised likelihood ratio test g.l.r.t. of the NH : θ H 0 agaist the alterative AH : θ H 1,

More information

An almost sure invariance principle for trimmed sums of random vectors

An almost sure invariance principle for trimmed sums of random vectors Proc. Idia Acad. Sci. Math. Sci. Vol. 20, No. 5, November 200, pp. 6 68. Idia Academy of Scieces A almost sure ivariace priciple for trimmed sums of radom vectors KE-ANG FU School of Statistics ad Mathematics,

More information

Element sampling: Part 2

Element sampling: Part 2 Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig

More information

LONG SNAKES IN POWERS OF THE COMPLETE GRAPH WITH AN ODD NUMBER OF VERTICES

LONG SNAKES IN POWERS OF THE COMPLETE GRAPH WITH AN ODD NUMBER OF VERTICES J Lodo Math Soc (2 50, (1994, 465 476 LONG SNAKES IN POWERS OF THE COMPLETE GRAPH WITH AN ODD NUMBER OF VERTICES Jerzy Wojciechowski Abstract I [5] Abbott ad Katchalski ask if there exists a costat c >

More information

Lecture 4. We also define the set of possible values for the random walk as the set of all x R d such that P(S n = x) > 0 for some n.

Lecture 4. We also define the set of possible values for the random walk as the set of all x R d such that P(S n = x) > 0 for some n. Radom Walks ad Browia Motio Tel Aviv Uiversity Sprig 20 Lecture date: Mar 2, 20 Lecture 4 Istructor: Ro Peled Scribe: Lira Rotem This lecture deals primarily with recurrece for geeral radom walks. We preset

More information

1 6 = 1 6 = + Factorials and Euler s Gamma function

1 6 = 1 6 = + Factorials and Euler s Gamma function Royal Holloway Uiversity of Lodo Departmet of Physics Factorials ad Euler s Gamma fuctio Itroductio The is a self-cotaied part of the course dealig, essetially, with the factorial fuctio ad its geeralizatio

More information

Lecture 2. The Lovász Local Lemma

Lecture 2. The Lovász Local Lemma Staford Uiversity Sprig 208 Math 233A: No-costructive methods i combiatorics Istructor: Ja Vodrák Lecture date: Jauary 0, 208 Origial scribe: Apoorva Khare Lecture 2. The Lovász Local Lemma 2. Itroductio

More information

Machine Learning for Data Science (CS 4786)

Machine Learning for Data Science (CS 4786) Machie Learig for Data Sciece CS 4786) Lecture & 3: Pricipal Compoet Aalysis The text i black outlies high level ideas. The text i blue provides simple mathematical details to derive or get to the algorithm

More information

Chapter 7 Isoperimetric problem

Chapter 7 Isoperimetric problem Chapter 7 Isoperimetric problem Recall that the isoperimetric problem (see the itroductio its coectio with ido s proble) is oe of the most classical problem of a shape optimizatio. It ca be formulated

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters

More information

Sequences and Series of Functions

Sequences and Series of Functions Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges

More information

A Note on Sums of Independent Random Variables

A Note on Sums of Independent Random Variables Cotemorary Mathematics Volume 00 XXXX A Note o Sums of Ideedet Radom Variables Pawe l Hitczeko ad Stehe Motgomery-Smith Abstract I this ote a two sided boud o the tail robability of sums of ideedet ad

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013 Large Deviatios for i.i.d. Radom Variables Cotet. Cheroff boud usig expoetial momet geeratig fuctios. Properties of a momet

More information

Machine Learning Theory (CS 6783)

Machine Learning Theory (CS 6783) Machie Learig Theory (CS 6783) Lecture 2 : Learig Frameworks, Examples Settig up learig problems. X : istace space or iput space Examples: Computer Visio: Raw M N image vectorized X = 0, 255 M N, SIFT

More information

Estimation of the essential supremum of a regression function

Estimation of the essential supremum of a regression function Estimatio of the essetial supremum of a regressio fuctio Michael ohler, Adam rzyżak 2, ad Harro Walk 3 Fachbereich Mathematik, Techische Uiversität Darmstadt, Schlossgartestr. 7, 64289 Darmstadt, Germay,

More information

The natural exponential function

The natural exponential function The atural expoetial fuctio Attila Máté Brookly College of the City Uiversity of New York December, 205 Cotets The atural expoetial fuctio for real x. Beroulli s iequality.....................................2

More information

Math Solutions to homework 6

Math Solutions to homework 6 Math 175 - Solutios to homework 6 Cédric De Groote November 16, 2017 Problem 1 (8.11 i the book): Let K be a compact Hermitia operator o a Hilbert space H ad let the kerel of K be {0}. Show that there

More information

Clases 7-8: Métodos de reducción de varianza en Monte Carlo *

Clases 7-8: Métodos de reducción de varianza en Monte Carlo * Clases 7-8: Métodos de reducció de variaza e Mote Carlo * 9 de septiembre de 27 Ídice. Variace reductio 2. Atithetic variates 2 2.. Example: Uiform radom variables................ 3 2.2. Example: Tail

More information

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002 ECE 330:541, Stochastic Sigals ad Systems Lecture Notes o Limit Theorems from robability Fall 00 I practice, there are two ways we ca costruct a ew sequece of radom variables from a old sequece of radom

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

Math 2784 (or 2794W) University of Connecticut

Math 2784 (or 2794W) University of Connecticut ORDERS OF GROWTH PAT SMITH Math 2784 (or 2794W) Uiversity of Coecticut Date: Mar. 2, 22. ORDERS OF GROWTH. Itroductio Gaiig a ituitive feel for the relative growth of fuctios is importat if you really

More information

Series III. Chapter Alternating Series

Series III. Chapter Alternating Series Chapter 9 Series III With the exceptio of the Null Sequece Test, all the tests for series covergece ad divergece that we have cosidered so far have dealt oly with series of oegative terms. Series with

More information

Notes 27 : Brownian motion: path properties

Notes 27 : Brownian motion: path properties Notes 27 : Browia motio: path properties Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces:[Dur10, Sectio 8.1], [MP10, Sectio 1.1, 1.2, 1.3]. Recall: DEF 27.1 (Covariace) Let X = (X

More information

Lecture Stat Maximum Likelihood Estimation

Lecture Stat Maximum Likelihood Estimation Lecture Stat 461-561 Maximum Likelihood Estimatio A.D. Jauary 2008 A.D. () Jauary 2008 1 / 63 Maximum Likelihood Estimatio Ivariace Cosistecy E ciecy Nuisace Parameters A.D. () Jauary 2008 2 / 63 Parametric

More information

Lecture 3 : Random variables and their distributions

Lecture 3 : Random variables and their distributions Lecture 3 : Radom variables ad their distributios 3.1 Radom variables Let (Ω, F) ad (S, S) be two measurable spaces. A map X : Ω S is measurable or a radom variable (deoted r.v.) if X 1 (A) {ω : X(ω) A}

More information

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense,

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense, 3. Z Trasform Referece: Etire Chapter 3 of text. Recall that the Fourier trasform (FT) of a DT sigal x [ ] is ω ( ) [ ] X e = j jω k = xe I order for the FT to exist i the fiite magitude sese, S = x [

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week Lecture: Cocept Check Exercises Starred problems are optioal. Statistical Learig Theory. Suppose A = Y = R ad X is some other set. Furthermore, assume P X Y is a discrete

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

arxiv: v1 [math.st] 17 Apr 2015

arxiv: v1 [math.st] 17 Apr 2015 Robust estimatio of U-statistics arxiv:1504.04580v1 [math.st] 17 Apr 2015 Emilie Joly Gábor Lugosi April 20, 2015 This paper is dedicated to the memory of Evarist Gié. Abstract A importat part of the legacy

More information

Monte Carlo Integration

Monte Carlo Integration Mote Carlo Itegratio I these otes we first review basic umerical itegratio methods (usig Riema approximatio ad the trapezoidal rule) ad their limitatios for evaluatig multidimesioal itegrals. Next we itroduce

More information

Quantile regression with multilayer perceptrons.

Quantile regression with multilayer perceptrons. Quatile regressio with multilayer perceptros. S.-F. Dimby ad J. Rykiewicz Uiversite Paris 1 - SAMM 90 Rue de Tolbiac, 75013 Paris - Frace Abstract. We cosider oliear quatile regressio ivolvig multilayer

More information

5.1 Review of Singular Value Decomposition (SVD)

5.1 Review of Singular Value Decomposition (SVD) MGMT 69000: Topics i High-dimesioal Data Aalysis Falll 06 Lecture 5: Spectral Clusterig: Overview (cotd) ad Aalysis Lecturer: Jiamig Xu Scribe: Adarsh Barik, Taotao He, September 3, 06 Outlie Review of

More information

Supplementary Materials for Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting

Supplementary Materials for Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting Supplemetary Materials for Statistical-Computatioal Phase Trasitios i Plated Models: The High-Dimesioal Settig Yudog Che The Uiversity of Califoria, Berkeley yudog.che@eecs.berkeley.edu Jiamig Xu Uiversity

More information

ON MEAN ERGODIC CONVERGENCE IN THE CALKIN ALGEBRAS

ON MEAN ERGODIC CONVERGENCE IN THE CALKIN ALGEBRAS PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 0002-9939(XX0000-0 ON MEAN ERGODIC CONVERGENCE IN THE CALKIN ALGEBRAS MARCH T. BOEDIHARDJO AND WILLIAM B. JOHNSON 2

More information

STAT Homework 1 - Solutions

STAT Homework 1 - Solutions STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better

More information

Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound

Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound Lecture 7 Ageda for the lecture Gaussia chael with average power costraits Capacity of additive Gaussia oise chael ad the sphere packig boud 7. Additive Gaussia oise chael Up to this poit, we have bee

More information