Dimension-free PAC-Bayesian bounds for the estimation of the mean of a random vector
Olivier Catoni
CREST, CNRS UMR 9194, Université Paris Saclay
olivier.catoni@ensae.fr

Ilaria Giulini
Laboratoire de Probabilités et Modèles Aléatoires, Université Paris Diderot
giulini@math.univ-paris-diderot.fr

Abstract

In this paper, we present a new estimator of the mean of a random vector, computed by applying some threshold function to the norm. Non asymptotic dimension-free almost sub-Gaussian bounds are proved under weak moment assumptions, using PAC-Bayesian inequalities.

1 Introduction

Estimating the mean of a random vector under weak tail assumptions has attracted a lot of attention recently. A number of properties have spurred the interest in these new results, where the empirical mean is replaced by a more robust estimator. One aspect is that it is possible to obtain an estimator with a sub-Gaussian tail under much weaker assumptions on the data, down to assuming only the existence of a finite covariance matrix. Another appealing feature is that it is possible to obtain dimension-free non asymptotic bounds that remain valid in a separable Hilbert space.

Some important references are Catoni [2012] in the one dimensional case and Minsker [2015] and Lugosi and Mendelson [2017] in the multidimensional case. Building on the breakthrough of Minsker [2015], which uses a multidimensional generalization of the median of means estimator, Joly et al. [2017] and Lugosi and Mendelson [2017] propose successive improvements of the median of means approach to get an estimator with a genuine sub-Gaussian dimension-free tail bound, while still requiring only the existence of the covariance matrix. In the meantime, the M-estimator approach of Catoni [2012] has also been generalized to multidimensional settings through the use of matrix inequalities in Minsker [2016] and Minsker and Wei [2017]. Here we follow a different route, based on a multidimensional extension of Catoni [2012] using PAC-Bayesian bounds.
Our new estimator is a simple modification of the empirical mean, where some threshold is applied to the norm of the sample vectors. Therefore, it is straightforward to compute, and this is a strong point of our approach compared to others. Note also that we make here some compromise on the sharpness of the estimation error bound, in order to simplify the definition and computation of the estimator. This compromise consists in the presence of second order terms, while the first order terms can be made as close as desired to a true sub-Gaussian bound with exact constants, as stated in Lugosi and Mendelson [2017, eq. (1.1)]. With a more involved estimator, a true sub-Gaussian bound without second order terms is possible and will be described in a separate publication.

2 Thresholding the norm

Consider $X \in \mathbb{R}^d$, a random vector, and $(X_1, \dots, X_n)$ a sample made of $n$ independent copies of $X$. The question is to estimate $E(X)$ from the sample, under the assumption that $E(\|X\|^p) < \infty$ for some $p \geq 2$.

31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.
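Consider the threshold function $\psi(t) = \min\{t, 1\}$, $t \in \mathbb{R}_+$, and, for some positive real parameter $\lambda$ to be chosen later, introduce the thresholded sample

$$Y_i = \frac{\psi(\lambda \|X_i\|)}{\lambda \|X_i\|} X_i.$$

Our estimator of $m = E(X)$ will simply be the thresholded empirical mean

$$\widehat{m} = \frac{1}{n} \sum_{i=1}^n Y_i.$$

Proposition 2.1 Introduce the increasing functions

$$g_1(t) = \frac{1}{t}\bigl(\exp(t) - 1\bigr) \quad \text{and} \quad g_2(t) = \frac{2}{t^2}\bigl(\exp(t) - 1 - t\bigr), \qquad t \in \mathbb{R},$$

that are defined by continuity at $t = 0$ and are such that $g_1(0) = g_2(0) = 1$. Assume that $E(\|X\|^2) < \infty$ and that we know $v$ such that

$$\sup_{\theta \in S_d} E\bigl(\langle \theta, X - m \rangle^2\bigr) \leq v < \infty,$$

where $S_d = \{\theta \in \mathbb{R}^d, \|\theta\| = 1\}$ is the unit sphere of $\mathbb{R}^d$. For some positive real parameter $\mu$, put

$$\lambda = \frac{1}{\mu} \sqrt{\frac{2 \log(\delta^{-1})}{a v n}}, \qquad T = \max\bigl\{E(\|X - m\|^2), v\bigr\},$$

$$a = g_2(2\mu), \qquad b = \exp(2\mu)\, g_1\!\left(\mu^2 \sqrt{\frac{2 a v}{T \log(\delta^{-1})}}\right).$$

With probability at least $1 - \delta$,

$$\|\widehat{m} - m\| \leq \sqrt{\frac{2 a v \log(\delta^{-1})}{n}} + \sqrt{\frac{b T}{n}} + \inf_{p \geq 1} \frac{C_p}{n^{p/2}} + \inf_{p \geq 2} \frac{C'_p}{n^{p/2}},$$

where

$$C_p = \frac{1}{p+1} \left(\frac{p}{p+1}\right)^p \left(\frac{2 \log(\delta^{-1})}{\mu^2 a v}\right)^{p/2} \sup_{\theta \in S_d} E\bigl(\|X\|^p \langle \theta, X - m \rangle_-\bigr),$$

and

$$C'_p = \frac{1}{p+1} \left(\frac{p}{p+1}\right)^p \left(\frac{2 \log(\delta^{-1})}{\mu^2 a v}\right)^{p/2} E\bigl(\|X\|^p\bigr)\, \|m\| \left(1 + \sqrt{\frac{a \log(\delta^{-1})}{2 n v}}\, \|m\|\right).$$

Here $r_- = \max\{0, -r\}$ denotes the negative part of the real number $r$.

Remarks

Note that in case $E(\|X\|^2) < \infty$ but $E(\|X\|^p) = \infty$ for $p > 2$, we can use the bound

$$\frac{C_1}{\sqrt{n}} + \frac{C'_2}{n} \leq \frac{1}{\mu} \sqrt{\frac{\log(\delta^{-1}) (T + \|m\|^2)}{8 a n}} + \frac{8 \log(\delta^{-1})}{27 \mu^2 a v n} E(\|X\|^2)\, \|m\| \left(1 + \sqrt{\frac{a \log(\delta^{-1})}{2 n v}}\, \|m\|\right) = O\!\left(\frac{1}{\mu} \sqrt{\frac{\log(\delta^{-1}) (T + \|m\|^2)}{a n}}\right).$$

Note also that if we take $\mu = 1/4$ and assume that $\delta \leq \exp(-2)$, then $a \leq 1.2$ and $b \leq 4$. If moreover $E(\|X\|^{p+1}) < \infty$ for some $p > 1$, we obtain with probability at least $1 - \delta$ that

$$\|\widehat{m} - m\| \leq \sqrt{\frac{2.4\, v \log(\delta^{-1})}{n}} + \sqrt{\frac{4 T}{n}} + \frac{C_p}{n^{p/2}} + \frac{C'_{p+1}}{n^{(p+1)/2}},$$

meaning that the tail distribution of $\|\widehat{m} - m\|$ has a sub-Gaussian behavior, up to second order terms. Remark that by taking $\mu$ small, we can make $a$ and $b$ as close as desired to 1, at the expense of the values of $C_p$ and $C'_p$.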
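The estimator of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `g2`, `proposition_lambda` and `thresholded_mean` are hypothetical, the sample is made up, and the threshold level $\lambda = \mu^{-1} \sqrt{2 \log(\delta^{-1}) / (a v n)}$ with $a = g_2(2\mu)$ is our reading of the parameter choice in the proposition.

```python
import math


def g2(t):
    """g_2(t) = 2 (exp(t) - 1 - t) / t^2, with g_2(0) = 1 by continuity."""
    if abs(t) < 1e-8:
        return 1.0
    return 2.0 * (math.expm1(t) - t) / (t * t)


def proposition_lambda(v, n, delta, mu=0.25):
    """Assumed threshold level: (1 / mu) * sqrt(2 log(1 / delta) / (a v n))."""
    a = g2(2.0 * mu)
    return math.sqrt(2.0 * math.log(1.0 / delta) / (a * v * n)) / mu


def thresholded_mean(sample, lam):
    """Average of Y_i = min(1, 1 / (lam * ||X_i||)) * X_i over the sample."""
    dim = len(sample[0])
    total = [0.0] * dim
    for x in sample:
        norm = math.sqrt(sum(c * c for c in x))
        # psi(lam * norm) / (lam * norm) equals 1 inside the ball of radius 1 / lam
        scale = 1.0 if lam * norm <= 1.0 else 1.0 / (lam * norm)
        for j in range(dim):
            total[j] += scale * x[j]
    return [s / len(sample) for s in total]


# Hypothetical data: one gross outlier among small observations.
sample = [[0.1, -0.2], [0.0, 0.3], [-0.1, 0.1], [1000.0, 0.0]]
lam = proposition_lambda(v=1.0, n=len(sample), delta=0.05)
estimate = thresholded_mean(sample, lam)
```

Every thresholded vector has norm at most $1/\lambda$, so a single outlier can move the estimate by at most $1/(\lambda n)$, whereas it moves the plain empirical mean by its full norm divided by $n$.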
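Proof

The rest of the paper is devoted to the proof of Proposition 2.1. An elementary computation shows that the threshold function $\psi$ satisfies

$$0 \leq 1 - \frac{\psi(t)}{t} \leq \inf_{p \geq 1} \frac{t^p}{p+1} \left(\frac{p}{p+1}\right)^p, \qquad t \in \mathbb{R}_+, \tag{1}$$

where non integer values of the exponent $p$ are allowed. Let

$$Y = \frac{\psi(\lambda \|X\|)}{\lambda \|X\|} X \quad \text{and} \quad \widetilde{m} = E(Y).$$

We can decompose the estimation error in direction $\theta$ into

$$\langle \theta, \widehat{m} - m \rangle = \langle \theta, \widetilde{m} - m \rangle + \frac{1}{n} \sum_{i=1}^n \langle \theta, Y_i - \widetilde{m} \rangle, \qquad \theta \in \mathbb{R}^d. \tag{2}$$

Introduce $\alpha = \dfrac{\psi(\lambda \|X\|)}{\lambda \|X\|}$ and let us deal with the first term first. As

$$0 \leq 1 - \alpha \leq \frac{\lambda^p \|X\|^p}{p+1} \left(\frac{p}{p+1}\right)^p,$$

we get

$$\langle \theta, \widetilde{m} - m \rangle = E\bigl[(\alpha - 1) \langle \theta, X \rangle\bigr] = E\bigl[(\alpha - 1) \langle \theta, X - m \rangle\bigr] + E(\alpha - 1) \langle \theta, m \rangle$$

$$\leq \inf_{p \geq 1} \frac{\lambda^p}{p+1} \left(\frac{p}{p+1}\right)^p E\bigl(\|X\|^p \langle \theta, X - m \rangle_-\bigr) + \inf_{p \geq 2} \frac{\lambda^p}{p+1} \left(\frac{p}{p+1}\right)^p E\bigl(\|X\|^p\bigr) \langle \theta, m \rangle_-,$$

where $r_- = \max\{0, -r\}$ is the negative part of the real number $r$.

Let us now look at the second term of the decomposition (2). To gain uniformity in $\theta$, we will use a PAC-Bayesian inequality and the family of normal distributions $\rho_\theta = \mathcal{N}(\theta, \beta^{-1} I_d)$, bearing on the parameter $\theta \in \mathbb{R}^d$, where $I_d \in \mathbb{R}^{d \times d}$ is the identity matrix of size $d \times d$, and where $\beta$ is a positive parameter to be chosen later on. We will use the following PAC-Bayesian inequality without recalling its proof, which is a simple consequence of Catoni [2004, eq. (5.2.1) page 159]:

Lemma 2.2 For any bounded measurable function $f : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$, for any probability measure $\pi \in \mathcal{M}_+^1(\mathbb{R}^d)$, for any $\delta \in\, ]0, 1[$, with probability at least $1 - \delta$, for any probability measure $\rho \in \mathcal{M}_+^1(\mathbb{R}^d)$,

$$\frac{1}{n} \sum_{i=1}^n \int f(\theta, X_i) \, d\rho(\theta) \leq \int \log\Bigl[E\bigl(\exp(f(\theta, X))\bigr)\Bigr] d\rho(\theta) + \frac{K(\rho, \pi) + \log(\delta^{-1})}{n},$$

where $K$ is the Kullback-Leibler divergence

$$K(\rho, \pi) = \begin{cases} \displaystyle\int \log(\rho / \pi) \, d\rho, & \text{when } \rho \ll \pi, \\ +\infty, & \text{otherwise.} \end{cases}$$

Remarking that

$$\frac{1}{n} \sum_{i=1}^n \langle \theta, Y_i - \widetilde{m} \rangle = \frac{1}{n} \sum_{i=1}^n \int \langle \theta', Y_i - \widetilde{m} \rangle \, d\rho_\theta(\theta'),$$

using $\pi = \rho_0$, and taking into account the fact that $K(\rho_\theta, \rho_0) = \beta \|\theta\|^2 / 2$, we obtain as a consequence of the previous lemma that with probability at least $1 - \delta$, for any $\theta \in S_d$,

$$\frac{1}{n} \sum_{i=1}^n \langle \theta, Y_i - \widetilde{m} \rangle \leq \frac{1}{\mu \lambda} \int \log\Bigl[E\bigl(\exp(\mu \lambda \langle \theta', Y - \widetilde{m} \rangle)\bigr)\Bigr] d\rho_\theta(\theta') + \frac{\beta}{2 n \mu \lambda} + \frac{\log(\delta^{-1})}{n \mu \lambda}.$$

In our setting $f$ is not bounded in $\theta$, but the required extension is valid as explained in Catoni [2004].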
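Inequality (1) is easy to check numerically. The sketch below assumes the reading $0 \leq 1 - \psi(t)/t \leq \frac{t^p}{p+1} \bigl(\frac{p}{p+1}\bigr)^p$ for each fixed exponent $p$; the function names are hypothetical and the grid is arbitrary. It also illustrates that the constant cannot be improved, since both sides coincide at $t = (p+1)/p$.

```python
import math


def threshold_defect(t):
    """1 - psi(t) / t for psi(t) = min(t, 1) and t > 0."""
    return 1.0 - min(t, 1.0) / t


def polynomial_bound(t, p):
    """t^p / (p + 1) * (p / (p + 1))^p, the right-hand side for one exponent p."""
    return t ** p / (p + 1.0) * (p / (p + 1.0)) ** p


def worst_gap(p, grid_size=4000, t_max=20.0):
    """Smallest value of (bound - defect) over a regular grid of t in (0, t_max]."""
    return min(
        polynomial_bound(t, p) - threshold_defect(t)
        for t in (t_max * k / grid_size for k in range(1, grid_size + 1))
    )


# Non integer exponents are allowed, so the grid of p mixes both kinds.
gaps = {p: worst_gap(p) for p in (1.0, 1.5, 2.0, 3.7)}

# Tightness: at t = (p + 1) / p both sides equal 1 / (p + 1); here p = 2, t = 1.5.
tight_point_gap = polynomial_bound(1.5, 2.0) - threshold_defect(1.5)
```

For $p = 1$ and $p = 2$ the constant $\frac{1}{p+1} \bigl(\frac{p}{p+1}\bigr)^p$ equals $1/4$ and $4/27$ respectively.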
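Since the logarithm is concave,

$$\int \log\Bigl[E\bigl(\exp(\mu \lambda \langle \theta', Y - \widetilde{m} \rangle)\bigr)\Bigr] d\rho_\theta(\theta') \leq \log\left[E\left(\int \exp\bigl(\mu \lambda \langle \theta', Y - \widetilde{m} \rangle\bigr) \, d\rho_\theta(\theta')\right)\right] = \log\left[E \exp\left(\mu \lambda \langle \theta, Y - \widetilde{m} \rangle + \frac{\mu^2 \lambda^2}{2 \beta} \|Y - \widetilde{m}\|^2\right)\right],$$

where we have used the explicit expression of the Laplace transform of a Gaussian distribution.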
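The Gaussian Laplace transform used in this step can be sanity-checked in dimension one, where it reads $\int \exp(s x) \, d\mathcal{N}(\theta, \beta^{-1})(x) = \exp\bigl(s \theta + s^2 / (2 \beta)\bigr)$; the $d$-dimensional formula follows coordinate by coordinate. The sketch below compares a trapezoidal quadrature with the closed form. The function names and the numerical values of $s$, $\theta$ and $\beta$ are arbitrary illustrations.

```python
import math


def gaussian_laplace_numeric(s, theta, beta, half_width=12.0, steps=20000):
    """Trapezoidal approximation of E[exp(s * Z)] for Z ~ N(theta, 1 / beta)."""
    sigma = 1.0 / math.sqrt(beta)
    lo = theta - half_width * sigma
    hi = theta + half_width * sigma
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        density = math.sqrt(beta / (2.0 * math.pi)) * math.exp(-0.5 * beta * (x - theta) ** 2)
        total += weight * math.exp(s * x) * density
    return total * h


def gaussian_laplace_closed_form(s, theta, beta):
    """exp(s * theta + s^2 / (2 * beta)), the formula used in the proof with s = mu * lambda * <., Y - m~>."""
    return math.exp(s * theta + s * s / (2.0 * beta))


numeric = gaussian_laplace_numeric(0.7, 0.3, 2.0)
closed = gaussian_laplace_closed_form(0.7, 0.3, 2.0)
relative_error = abs(numeric / closed - 1.0)
```

The quadratic term $s^2 / (2\beta)$ is the price paid for smoothing over $\theta'$: it is what turns into the $\mu^2 \lambda^2 \|Y - \widetilde{m}\|^2 / (2\beta)$ term in the display.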
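To go further, taking the proof of Bennett's inequality as a source of inspiration, let us introduce the increasing functions $g_1$ and $g_2$ defined in Proposition 2.1. These functions will be used to bound the exponential function by polynomials. More precisely, we will exploit the fact that when $t \leq b$,

$$\exp(t) \leq 1 + t + g_2(b) \frac{t^2}{2} \quad \text{and} \quad \exp(t) \leq 1 + g_1(b)\, t.$$

From this, it results that if $t \leq b$ and $0 \leq u \leq c$,

$$\exp(t + u) \leq \exp(t) \bigl(1 + g_1(c) u\bigr) \leq \exp(t) + g_1(c) \exp(b)\, u \leq 1 + t + g_2(b) \frac{t^2}{2} + g_1(c) \exp(b)\, u.$$

Legitimate values for $b$ and $c$ will be deduced from the remark that $\lambda \|Y\| \leq 1$, implying $\lambda \|\widetilde{m}\| \leq 1$. Namely, in our context, we will use $b = 2\mu$ and $c = 2\mu^2 / \beta$. These arguments put together, along with the fact that $E(\langle \theta, Y - \widetilde{m} \rangle) = 0$, lead to the inequality

$$E \exp\left(\mu \lambda \langle \theta, Y - \widetilde{m} \rangle + \frac{\mu^2 \lambda^2}{2 \beta} \|Y - \widetilde{m}\|^2\right) \leq 1 + g_2(2\mu) \frac{\mu^2 \lambda^2}{2} E\bigl(\langle \theta, Y - \widetilde{m} \rangle^2\bigr) + \exp(2\mu)\, g_1\!\left(\frac{2\mu^2}{\beta}\right) \frac{\mu^2 \lambda^2}{2 \beta} E\bigl(\|Y - \widetilde{m}\|^2\bigr).$$

Replacing in the previous inequalities and using $\log(1 + x) \leq x$, we obtain

Lemma 2.3 With probability at least $1 - \delta$, for any $\theta \in S_d$,

$$\langle \theta, \widehat{m} - \widetilde{m} \rangle = \frac{1}{n} \sum_{i=1}^n \langle \theta, Y_i - \widetilde{m} \rangle \leq g_2(2\mu) \frac{\mu \lambda}{2} E\bigl(\langle \theta, Y - \widetilde{m} \rangle^2\bigr) + \exp(2\mu)\, g_1\!\left(\frac{2\mu^2}{\beta}\right) \frac{\mu \lambda}{2 \beta} E\bigl(\|Y - \widetilde{m}\|^2\bigr) + \frac{\beta + 2 \log(\delta^{-1})}{2 n \mu \lambda}.$$

Remark that

$$\langle \theta, Y - m \rangle^2 = \langle \theta, \alpha X - m \rangle^2 = \bigl(\alpha \langle \theta, X - m \rangle - (1 - \alpha) \langle \theta, m \rangle\bigr)^2 \leq \alpha \langle \theta, X - m \rangle^2 + (1 - \alpha) \langle \theta, m \rangle^2 \leq \langle \theta, X - m \rangle^2 + (1 - \alpha) \langle \theta, m \rangle^2.$$

Therefore, using inequality (1) and the definition of $\alpha$,

$$E\bigl(\langle \theta, Y - \widetilde{m} \rangle^2\bigr) \leq E\bigl(\langle \theta, Y - m \rangle^2\bigr) \leq E\bigl(\langle \theta, X - m \rangle^2\bigr) + \langle \theta, m \rangle^2 \inf_{p \geq 2} \frac{\lambda^p}{p+1} \left(\frac{p}{p+1}\right)^p E\bigl(\|X\|^p\bigr).$$

Remark also that $Y = g(X)$, where $g$ is a contraction (being the projection on a ball). Consequently

$$E\bigl(\|Y - \widetilde{m}\|^2\bigr) = \frac{1}{2} E\bigl(\|Y_1 - Y_2\|^2\bigr) \leq \frac{1}{2} E\bigl(\|X_1 - X_2\|^2\bigr) = E\bigl(\|X - m\|^2\bigr).$$

In view of these remarks, the previous lemma translates to

Lemma 2.4 Let $a = g_2(2\mu)$ and $b \geq \exp(2\mu)\, g_1\!\left(\dfrac{2\mu^2}{\beta}\right)$. With probability at least $1 - \delta$, for any $\theta \in S_d$,

$$\langle \theta, \widehat{m} - m \rangle \leq \frac{a \mu \lambda}{2} E\bigl(\langle \theta, X - m \rangle^2\bigr) + \frac{b \mu \lambda}{2 \beta} E\bigl(\|X - m\|^2\bigr) + \frac{\beta + 2 \log(\delta^{-1})}{2 n \mu \lambda}$$

$$+ \inf_{p \geq 1} \frac{\lambda^p}{p+1} \left(\frac{p}{p+1}\right)^p E\bigl(\|X\|^p \langle \theta, X - m \rangle_-\bigr) + \inf_{p \geq 2} \frac{\lambda^p}{p+1} \left(\frac{p}{p+1}\right)^p E\bigl(\|X\|^p\bigr) \left(\langle \theta, m \rangle_- + \frac{a \mu \lambda}{2} \langle \theta, m \rangle^2\right).$$

Proposition 2.1 follows by taking $b$ as mentioned there,

$$\lambda = \frac{1}{\mu} \sqrt{\frac{2 \log(\delta^{-1})}{n a v}}, \quad \text{and} \quad \beta = \sqrt{\frac{2 b T \log(\delta^{-1})}{a v}} \geq \sqrt{\frac{2 T \log(\delta^{-1})}{a v}},$$

so that the condition on $b$ is satisfied.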
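The last step of the proof balances the three first order terms of the final lemma. As a check of that arithmetic, the sketch below assumes the choices $\mu \lambda = \sqrt{2 \log(\delta^{-1}) / (n a v)}$ and $\beta = \sqrt{2 b T \log(\delta^{-1}) / (a v)}$, replaces $E(\langle \theta, X - m \rangle^2)$ by $v$ and $E(\|X - m\|^2)$ by $T$, and compares the sum with $\sqrt{2 a v \log(\delta^{-1}) / n} + \sqrt{b T / n}$. The function names and numerical values are hypothetical, and $b$ is treated as a given constant.

```python
import math


def first_order_terms(mu, a, b, v, T, n, delta):
    """Sum of the variance, trace and complexity terms for the assumed lambda and beta."""
    log_inv_delta = math.log(1.0 / delta)
    lam = math.sqrt(2.0 * log_inv_delta / (n * a * v)) / mu
    beta = math.sqrt(2.0 * b * T * log_inv_delta / (a * v))
    variance_term = a * mu * lam * v / 2.0
    trace_term = b * mu * lam * T / (2.0 * beta)
    complexity_term = (beta + 2.0 * log_inv_delta) / (2.0 * n * mu * lam)
    return variance_term + trace_term + complexity_term


def closed_form_bound(a, b, v, T, n, delta):
    """sqrt(2 a v log(1/delta) / n) + sqrt(b T / n)."""
    log_inv_delta = math.log(1.0 / delta)
    return math.sqrt(2.0 * a * v * log_inv_delta / n) + math.sqrt(b * T / n)


params = dict(a=1.19, b=1.7, v=1.0, T=3.0, n=500, delta=0.01)
summed = first_order_terms(mu=0.25, **params)
closed = closed_form_bound(**params)
```

The value of $\mu$ cancels in the first order terms because only the product $\mu \lambda$ appears there; $\mu$ matters through $a$, $b$ and the higher order constants.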
References

O. Catoni. Statistical Learning Theory and Stochastic Optimization, Lectures on Probability Theory and Statistics, École d'Été de Probabilités de Saint-Flour XXXI 2001, volume 1851 of Lecture Notes in Mathematics. Springer, 2004.

O. Catoni. Challenging the empirical mean and empirical variance: a deviation study. Ann. Inst. Henri Poincaré, 48(4), 2012.

E. Joly, G. Lugosi, and R. I. Oliveira. On the estimation of the mean of a random vector. Electronic Journal of Statistics, 11, 2017.

G. Lugosi and S. Mendelson. Sub-Gaussian estimators of the mean of a random vector. Annals of Statistics, to appear, 2017.

S. Minsker. Geometric median and robust estimation in Banach spaces. Bernoulli, 21(4), 2015.

S. Minsker. Sub-Gaussian estimators of the mean of a random matrix with heavy-tailed entries. Annals of Statistics, to appear, 2016.

S. Minsker and X. Wei. Estimation of the covariance structure of heavy-tailed distributions. In NIPS 2017, to appear.
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More informationarxiv: v1 [math.st] 17 Apr 2015
Robust estimatio of U-statistics arxiv:1504.04580v1 [math.st] 17 Apr 2015 Emilie Joly Gábor Lugosi April 20, 2015 This paper is dedicated to the memory of Evarist Gié. Abstract A importat part of the legacy
More informationMonte Carlo Integration
Mote Carlo Itegratio I these otes we first review basic umerical itegratio methods (usig Riema approximatio ad the trapezoidal rule) ad their limitatios for evaluatig multidimesioal itegrals. Next we itroduce
More informationQuantile regression with multilayer perceptrons.
Quatile regressio with multilayer perceptros. S.-F. Dimby ad J. Rykiewicz Uiversite Paris 1 - SAMM 90 Rue de Tolbiac, 75013 Paris - Frace Abstract. We cosider oliear quatile regressio ivolvig multilayer
More information5.1 Review of Singular Value Decomposition (SVD)
MGMT 69000: Topics i High-dimesioal Data Aalysis Falll 06 Lecture 5: Spectral Clusterig: Overview (cotd) ad Aalysis Lecturer: Jiamig Xu Scribe: Adarsh Barik, Taotao He, September 3, 06 Outlie Review of
More informationSupplementary Materials for Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting
Supplemetary Materials for Statistical-Computatioal Phase Trasitios i Plated Models: The High-Dimesioal Settig Yudog Che The Uiversity of Califoria, Berkeley yudog.che@eecs.berkeley.edu Jiamig Xu Uiversity
More informationON MEAN ERGODIC CONVERGENCE IN THE CALKIN ALGEBRAS
PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 0002-9939(XX0000-0 ON MEAN ERGODIC CONVERGENCE IN THE CALKIN ALGEBRAS MARCH T. BOEDIHARDJO AND WILLIAM B. JOHNSON 2
More informationSTAT Homework 1 - Solutions
STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better
More informationLecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound
Lecture 7 Ageda for the lecture Gaussia chael with average power costraits Capacity of additive Gaussia oise chael ad the sphere packig boud 7. Additive Gaussia oise chael Up to this poit, we have bee
More information