Nonparametric regression with martingale increment errors
1 S. Gaïffas (LSTA, Paris 6), joint work with S. Delattre (LPMA, Paris 7). Work in progress.
2 Motivations

Some facts:
- The theoretical study of statistical algorithms usually requires stationarity and ergodicity, and concentration inequalities for finite-sample results. These are standard tools when the data is assumed i.i.d.: Bernstein's or Talagrand's inequality are quite popular in statistics.
- Beyond independence, it is also standard to use a mixing assumption, such as β-mixing: it allows one to recover independence by coupling (the well-known Berbee's lemma), so that, roughly, the independence tools can be used again (on blocks). This is the approach adopted in many papers.

Problem: stationarity and mixing are hard to verify on data. Moreover, under a mixing assumption, the statistical procedure often depends on the mixing coefficients, but these cannot be estimated!

The aim of this work is to study well-known statistical procedures (kernel estimation, Lepski's method) without stationarity and ergodicity assumptions.

Idea: we replace such assumptions by an assumption on the structure of the model: we consider a regression model where the noise is a martingale increment.
3 A model: regression with martingale increment errors

$(X_k)_{k \ge 0}$ and $(Y_k)_{k \ge 1}$ are $(\mathcal F_k)_{k \ge 0}$-adapted sequences of real random variables such that
$$Y_k = f(X_{k-1}) + \varepsilon_k, \qquad (1)$$
where $(\varepsilon_k)_{k \ge 0}$ is an $(\mathcal F_k)$-martingale increment, i.e. $\mathbb E(|\varepsilon_k| \mid \mathcal F_{k-1}) < \infty$ and $\mathbb E(\varepsilon_k \mid \mathcal F_{k-1}) = 0$, and where $f : \mathbb R \to \mathbb R$ is the unknown function of interest.

Assumption: there is an $(\mathcal F_k)$-adapted sequence $(\sigma_k)_{k \ge 0}$ of positive random variables such that, for all $k \ge 0$,
$$\mathbb E\big[\exp\big(\mu\, \varepsilon_k^2 / \sigma_{k-1}^2\big) \mid \mathcal F_{k-1}\big] \le \gamma, \qquad (2)$$
where $\mu, \gamma > 0$. We observe $(\sigma_k)_{k \ge 0}$.

Goal: estimate $f$ at a point $x \in \mathbb R$ based on the observations $(Y_1, \ldots, Y_N)$ and $(X_0, \ldots, X_{N-1})$, where $N \ge 1$ is a stopping time.
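The model is easy to simulate. The following pure-Python sketch uses an autoregressive design with an ARCH-flavoured adapted volatility; the choice of $f$, the volatility dynamics and all constants are illustrative, not taken from the talk:

```python
import math
import random

def f(x):
    """Illustrative unknown regression function (an arbitrary choice)."""
    return math.sin(x)

def simulate_model(n, seed=0):
    """Simulate Y_k = f(X_{k-1}) + eps_k with eps_k = sigma_{k-1} * zeta_k,
    zeta_k i.i.d. N(0, 1), and an adapted (ARCH-flavoured) volatility
    sigma_{k-1} = sqrt(0.1 + 0.5 * X_{k-1}^2).  Given the past,
    eps_k / sigma_{k-1} is N(0, 1), so condition (2) holds for any mu < 1/2,
    while eps_k itself need not be subgaussian."""
    rng = random.Random(seed)
    X = [0.0]                 # X_0, ..., X_n (autoregression: X_k = Y_k)
    sigma, Y = [], []
    for _ in range(n):
        s = math.sqrt(0.1 + 0.5 * X[-1] ** 2)   # sigma_{k-1}: F_{k-1}-measurable
        y = f(X[-1]) + s * rng.gauss(0.0, 1.0)  # Y_k
        sigma.append(s)
        Y.append(y)
        X.append(y)
    return X, Y, sigma
```

This is the autoregression particular case of the next slide, with $v = \sigma$ observed.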
4 A model: regression with martingale increment errors. Particular cases

Very particular cases: the usual i.i.d. regression and autoregression models.

Regression model: we observe $(Y_k, X_{k-1})_{k=1}^N$ satisfying
$$Y_k = f(X_{k-1}) + v(X_{k-1})\, \zeta_k,$$
where $(\zeta_k)$ is i.i.d., centered, subgaussian and independent of $\mathcal F_k = \sigma(X_0, \ldots, X_k)$.

Autoregression model: we observe $(X_k)_{k=0}^N$ satisfying
$$X_k = f(X_{k-1}) + v(X_{k-1})\, \zeta_k,$$
where $(\zeta_k)$ is i.i.d., centered, independent of $X_0$ and subgaussian.
5 A model: regression with martingale increment errors. Remarks

Remark 1 (stopping time). We observe $(X_{k-1}, Y_k)_{k=1}^N$, where $N$ is a stopping time:
- non-asymptotic results
- online setting: the statistician decides to stop the sampling according to some rule (clinical trials, ...)

Remark 2 (variance). The variance process $(\sigma_k)$ is observed. We can use a two-step Lepski procedure to estimate the conditional variance. While $\varepsilon_k / \sigma_{k-1}$ is conditionally subgaussian, $\varepsilon_k$ is not in general! Think of a GARCH model $(\varepsilon_k, \sigma_k)$ for instance: even if $(\zeta_k)$ is Gaussian i.i.d., $\varepsilon_k = \sigma_{k-1} \zeta_k$ can have heavy tails.
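The GARCH remark is easy to check by simulation: with ARCH(1) volatility and i.i.d. Gaussian innovations, $\varepsilon_k / \sigma_{k-1}$ is exactly standard Gaussian, while the marginal of $\varepsilon_k$ is much heavier-tailed. A pure-Python sketch, with illustrative parameters:

```python
import math
import random

def kurtosis(xs):
    """Sample kurtosis (Gaussian value is 3)."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 4 for x in xs) / (n * v * v)

def arch_tails(n=50000, a0=0.1, a1=0.9, seed=42):
    """eps_k = sigma_{k-1} * zeta_k with sigma_{k-1}^2 = a0 + a1 * eps_{k-1}^2
    (an ARCH(1) recursion).  The Gaussian draws zeta_k are returned too, for
    comparison: eps_k / sigma_{k-1} = zeta_k is exactly N(0, 1), yet the
    adapted volatility makes eps_k itself heavy-tailed."""
    rng = random.Random(seed)
    eps, zeta, e_prev = [], [], 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        s = math.sqrt(a0 + a1 * e_prev ** 2)
        e_prev = s * z
        eps.append(e_prev)
        zeta.append(z)
    return kurtosis(eps), kurtosis(zeta)
```

With $a_1 = 0.9$ the marginal fourth moment of $\varepsilon$ is infinite (since $3 a_1^2 > 1$), so the sample kurtosis of $\varepsilon$ blows up while that of $\zeta$ stays near the Gaussian value 3.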
6 The Lepski method: some preliminary definitions

Important object: for $h > 0$, take
$$L(h) = \sum_{k=1}^{N} \frac{1}{\sigma_{k-1}^2}\, \mathbf 1_{|X_{k-1} - x| \le h}$$
= the occupation time of $(X_k)_{k \ge 0}$ at $x$, renormalized by $(\sigma_k)$.

If $L(h) > 0$, define the kernel estimator
$$\hat f(h) = \frac{1}{L(h)} \sum_{k=1}^{N} \frac{1}{\sigma_{k-1}^2}\, \mathbf 1_{|X_{k-1} - x| \le h}\, Y_k.$$

Consider the set of bandwidths
$$\mathcal H := \{h_j : L(h_j) > 0\}, \quad \text{where } h_j = h_0 q^j,$$
for some parameters $h_0 > 0$ and $q \in (0, 1)$ [other choices are possible].
7 The Lepski method: definition

Define, for some $b > 0$:
$$\psi(h) := 1 + b \log(h_0 / h).$$
For $u > 0$, define on $\{L(h_0)^{-1/2} \le u\}$:
$$H_u := \min\Big\{h \in \mathcal H : \Big(\frac{\psi(h)}{L(h)}\Big)^{1/2} \le u\Big\},$$
and let $u_0 > 0$. On $\{L(h_0)^{-1/2} \le u_0\}$, we select $\hat H$ according to the following standard Lepski rule (Lepski (1992), ...):
$$\hat H := \max\Big\{h \in \mathcal H : h \ge H_{u_0} \text{ and } \forall h' \in [H_{u_0}, h] \cap \mathcal H,\ |\hat f(h) - \hat f(h')| \le \nu \Big(\frac{\psi(h')}{L(h')}\Big)^{1/2}\Big\},$$
where $\nu > 0$.
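The estimator and the selection rule above can be sketched in a few lines of pure Python. The rectangular kernel is the one used in the talk; the values of $h_0$, $q$, $b$, $\nu$, $u_0$ are illustrative tuning choices, with no claim that they match the theory's constants:

```python
import math

def lepski_fit(X, Y, sigma, x, h0=1.0, q=0.5, b=1.0, nu=2.0, u0=1.0, jmax=25):
    """Kernel estimator + Lepski bandwidth selection (a sketch).
    Inputs: X = [X_0..X_{N-1}], Y = [Y_1..Y_N], sigma = [sigma_0..sigma_{N-1}]."""
    def L(h):     # renormalized occupation time L(h)
        return sum(1.0 / sigma[i] ** 2 for i in range(len(X)) if abs(X[i] - x) <= h)

    def fhat(h):  # kernel estimator fhat(h), rectangular kernel
        return sum(Y[i] / sigma[i] ** 2 for i in range(len(X))
                   if abs(X[i] - x) <= h) / L(h)

    def crit(h):  # sqrt(psi(h) / L(h)) with psi(h) = 1 + b * log(h0 / h)
        return math.sqrt((1.0 + b * math.log(h0 / h)) / L(h))

    grid = [h0 * q ** j for j in range(jmax + 1) if L(h0 * q ** j) > 0]
    admissible = [h for h in grid if crit(h) <= u0]
    if not admissible:
        return None
    Hu0 = min(admissible)
    # Hhat = largest h >= H_{u0} such that fhat(h) stays within
    # nu * sqrt(psi(h') / L(h')) of fhat(h') for every h' in [H_{u0}, h]
    Hhat = Hu0
    for h in sorted(hh for hh in grid if hh >= Hu0):
        if all(abs(fhat(h) - fhat(hp)) <= nu * crit(hp)
               for hp in grid if Hu0 <= hp <= h):
            Hhat = h
    return Hhat, fhat(Hhat)
```

On noiseless constant data the bias term vanishes, so the rule picks the largest bandwidth of the grid, as it should.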
8 An adaptive upper bound: preliminaries

Remark. $u_0 > 0$ is such that $L(h_0)^{-1/2} \le u_0$. This is very mild: if there is some data close to $x$, and if $h_0$ is large enough, then $L(h_0)$ should be larger than some constant. [We don't have ergodicity, so we don't know if $L(h)$ is close to $\mathbb E L(h) \approx n\, \mathbb P_X([x - h, x + h])$...]

We want to prove an oracle result: we prove that $\hat f(\hat H)$ has the same rate of convergence as $\hat f(H^*)$, where $H^*$ is the oracle bandwidth.

Consider $W$ (bias condition) such that
$$\sup_{h' \in [H_{u_0}, h] \cap \mathcal H} \Bigg| \frac{\sum_{i=1}^N f(X_{i-1})\, \sigma_{i-1}^{-2}\, \mathbf 1_{|X_{i-1} - x| \le h'}}{\sum_{i=1}^N \sigma_{i-1}^{-2}\, \mathbf 1_{|X_{i-1} - x| \le h'}} - f(x) \Bigg| \le W(h)$$
(OK if $f$ is Hölder, for instance...). Nothing is required on $W$, but we need to bound it from below and above:
$$\overline W(h) := \big(\varepsilon_0 (h / h_0)^{\alpha_0} \vee W(h)\big) \wedge u_0,$$
where $\varepsilon_0, \alpha_0 > 0$.
9 An adaptive upper bound

On the event $\{L(h_0)^{-1/2} \le \overline W(h_0)\}$, define the oracle bandwidth
$$H^* := \min\Big\{h \in \mathcal H : \Big(\frac{\psi(h)}{L(h)}\Big)^{1/2} \le \overline W(h)\Big\}.$$
Denote $\mathbb P^*(A) = \mathbb P(A \cap \Omega^*)$, where
$$\Omega^* := \big\{L(h_0)^{-1/2} \le \overline W(h_0),\ \overline W(H^*) \le u_0\big\}.$$
$\Omega^*$ is a minimal technical requirement for the proof of the adaptive upper bound.

Theorem (Adaptive upper bound). Grant (2) ($\varepsilon_k / \sigma_{k-1}$ is subgaussian conditionally on $\mathcal F_{k-1}$) and let $\hat f(\hat H)$ be the procedure given by the Lepski rule. Then we have, for any $t > 0$:
$$\mathbb P^*\big[\overline W(H^*)^{-1}\, |\hat f(\hat H) - f(x)| > t\big] \le c\, (1 + \log(1 + t))\, t^{-b \lambda \nu^2 / (33 \alpha_0)},$$
where $c$ is a constant depending on $\lambda, \mu, \gamma, q, b, u_0, \varepsilon_0, \alpha_0, \nu$.
10 An adaptive upper bound: remarks

- We don't need stationarity or ergodicity for the proof. BUT, without further assumptions, we cannot describe the behaviour of $H^*$ (and of $\overline W(H^*)$): it does not necessarily go to 0, for instance when $(X_k)$ is transient. This is why we needed to introduce $\overline W$.
- The cornerstone of the proof is a new result concerning the stability of self-normalized martingales.
- The only true assumptions are the martingale difference structure of $(\varepsilon_k)$ and the conditional subgaussianity of $\varepsilon_k / \sigma_{k-1}$ given $\mathcal F_{k-1}$. Indeed, we also require $L(h_0)^{-1/2} \le \overline W(h_0)$ and $\overline W(H^*) \le u_0$, but this is no longer a restriction when $f$ is Hölder, for instance, since in this case $W(h) = L h^s$.
- We use kernel estimation with $K(x) = \mathbf 1_{[-1,1]}(x) / 2$ for technical simplicity, so we obtain adaptation only for Hölder exponents $s \le 1$. One can consider the Lepski method applied to local polynomials instead.
11 A tool: stability for self-normalized martingales

Let $(M_n)_{n \ge 0}$ be a locally square integrable $(\mathcal G_n)$-martingale. The predictable quadratic variation of $M_n$ is
$$\langle M \rangle_n := \sum_{k=1}^n \mathbb E(\Delta M_k^2 \mid \mathcal G_{k-1}), \quad \text{where } \Delta M_n := M_n - M_{n-1}.$$
A standard concentration inequality is Freedman's inequality: if $(M_n)_{n \ge 0}$ is such that $|\Delta M_n| \le c$ a.s., then for any $x, y > 0$:
$$\mathbb P\big[M_n \ge x,\ \langle M \rangle_n \le y\big] \le \exp\Big(-\frac{x^2}{2(y + cx)}\Big).$$
One can use Bernstein's condition instead of $|\Delta M_k| \le c$:
$$\sum_{k=1}^n \mathbb E\big[|\Delta M_k|^p \mid \mathcal G_{k-1}\big] \le \frac{p!}{2}\, c^{p-2}\, \langle M \rangle_n, \quad p \ge 2;$$
see Pinelis (1994), de la Peña (1999).
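Freedman's inequality can be sanity-checked by Monte Carlo on the simplest bounded-increment martingale, a symmetric ±1 random walk, for which $c = 1$ and $\langle M \rangle_n = n$ deterministically. The sample size, the level $x$ and the number of trials are arbitrary choices:

```python
import math
import random

def freedman_check(n=100, x=20.0, trials=20000, seed=7):
    """Monte Carlo check of Freedman's inequality: M_n = sum of n i.i.d.
    +/-1 signs, so |increments| <= c = 1 and <M>_n = n.  Compare the
    empirical frequency of {M_n >= x} with the Freedman bound at y = n,
    namely exp(-x^2 / (2 (n + c x)))."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        M = sum(1 if rng.random() < 0.5 else -1 for _ in range(n))
        if M >= x:
            hits += 1
    empirical = hits / trials
    bound = math.exp(-x ** 2 / (2.0 * (n + x)))
    return empirical, bound
```

Here the bound is not tight (it is roughly $0.19$ while the true probability is near $0.03$), but it holds without any distributional assumption beyond the bounded increments.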
12 A tool: stability for self-normalized martingales

Problem: a Freedman-type inequality (or Bernstein's inequality) is not enough to prove the adaptive upper bound: in Freedman's inequality, we must work on $\{\langle M \rangle_n \le y\}$, which requires an ad-hoc assumption on $(X_k)$, such as some mixing property, precisely what we wanted to avoid.

First idea: give a deviation bound for $M_T / \langle M \rangle_T^{1/2}$. But it is well known that, in general, $M_T / \langle M \rangle_T^{1/2}$ is not even tight! [Simple counter-example: consider $M_n = B_n$, where $(B_t)$ is a standard Brownian motion, and define $T_c = \inf\{n \ge 1 : B_n / \sqrt n \ge c\}$ for $c > 0$. For any $c > 0$, $T_c$ is finite a.s., so one has $M_{T_c} / \langle M \rangle_{T_c}^{1/2} = M_{T_c} / \sqrt{T_c} \ge c$, for any $c > 0$!]

A simple solution: replace $M_T / \langle M \rangle_T^{1/2}$ by $\sqrt a\, M_T / (a + \langle M \rangle_T)$. We prove that, for any $a > 0$, $\sqrt a\, M_T / (a + \langle M \rangle_T)$ is subgaussian when $\Delta M_n$ is subgaussian, hence the name "stability".
13 A tool: stability for self-normalized martingales

Theorem (stability). Assume that $\Delta M_n = s_{n-1} \zeta_n$, where $(\zeta_n)$ is a sequence of $(\mathcal G_n)$-martingale increments such that, for some $\mu > 0$,
$$\mathbb E\big[\exp(\mu \zeta_n^2) \mid \mathcal G_{n-1}\big] \le \gamma \quad \text{for any } n \ge 1,$$
and where $(s_n)_{n \in \mathbb N}$ is a $(\mathcal G_n)$-adapted sequence of non-negative random variables. Define
$$V_n := \sum_{k=1}^n s_{k-1}^2.$$
Then, for any $\lambda \in \big[0, \frac{\mu}{2(1+\gamma)}\big)$, any $a > 0$ and any stopping time $T$, we have
$$\mathbb E\Big[\exp\Big(\lambda\, \frac{a M_T^2}{(a + V_T)^2}\Big)\Big] \le 1 + c_\lambda, \qquad (3)$$
with $c_\lambda := \exp\big(\frac{\lambda \Gamma_\lambda}{2(1 - 2 \lambda \Gamma_\lambda)}\big)\big(\exp(\lambda \Gamma_\lambda) - 1\big)$ and $\Gamma_\lambda := \frac{1 + 2\gamma}{2(\mu - \lambda)}$.
14 A tool: stability for self-normalized martingales

- A similar result holds for sub-exponential martingale increments.
- This theorem shows that when $\zeta_k$ is subgaussian (resp. sub-exponential) conditionally on $\mathcal G_{k-1}$, then $\sqrt a\, M_T / (a + V_T)$ is also subgaussian (resp. sub-exponential), hence the name "stability".
- There is no concentration-of-measure phenomenon here: we only prove that the tails of $\sqrt a\, M_T / (a + V_T)$ are of the same order as those of $\zeta_n$ (indeed, $M_n$ can be equal to $\zeta_n$: take $s_{n-1} = 1$ and $s_k = 0$ for $k \ne n - 1$).
- It is tempting to take $a = V_T$ to minimize the exponential moment, but this is not possible! (Again, $M_T / \langle M \rangle_T^{1/2}$ is not even tight in general.)
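Both phenomena (the raw self-normalized ratio forced above any level $c$ at a well-chosen stopping time, and the stabilized ratio staying of order one) can be checked by simulation. In this sketch the Brownian motion of the counter-example is sampled at integer times, which gives a standard Gaussian random walk; the level $c$, the constant $a$ and the horizon are illustrative choices:

```python
import math
import random

def stopped_ratios(c, seed, a=1.0, nmax=200000):
    """Gaussian random walk M_n (= B_n at integer times), so <M>_n = V_n = n.
    Stop at T_c = inf{n >= 1 : M_n / sqrt(n) >= c}.  Return the raw
    self-normalized ratio M_T / sqrt(V_T), which is >= c by construction,
    and the stabilized ratio sqrt(a) * M_T / (a + V_T), which the stability
    theorem keeps subgaussian uniformly over stopping times."""
    rng = random.Random(seed)
    M = 0.0
    for n in range(1, nmax + 1):
        M += rng.gauss(0.0, 1.0)
        if M / math.sqrt(n) >= c:
            return M / math.sqrt(n), math.sqrt(a) * M / (a + n)
    return None  # T_c not reached within nmax steps (it is finite a.s.)
```

By construction the first returned ratio is at least $c$, whatever $c$ is, while the AM-GM inequality $\sqrt{a n} \le (a + n)/2$ forces the stabilized ratio to be at most half the raw one.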
15 A link with the usual minimax theory: some preliminaries

Aim: under a dependence assumption on $(X_k)$, find a deterministic equivalent to our random rate of convergence. This makes our theory consistent with the usual minimax theory of deterministic rates.

To measure the dependence of $(X_k)_{k \ge 0}$, we can use β-mixing (see Kolmogorov and Rozanov (1960), and see Doukhan (1994) for topics on dependence). Introduce $\mathcal X_u^v := \sigma(X_k : u \le k \le v)$ for $u \le v$. We say that a strictly stationary sequence $(X_k)_{k \in \mathbb Z}$ is β-mixing if
$$\beta_q := \frac{1}{2} \sup \sum_{i=1}^I \sum_{j=1}^J \big| \mathbb P[U_i \cap V_j] - \mathbb P[U_i]\, \mathbb P[V_j] \big| \to 0 \quad \text{as } q \to +\infty,$$
where the sup is taken over all finite partitions $(U_i)_{i=1}^I$ and $(V_j)_{j=1}^J$ that are respectively $\mathcal X_{-\infty}^0$- and $\mathcal X_q^{+\infty}$-measurable.

β-mixing has been used a lot (in statistics) mainly because of a coupling result by Berbee, see Berbee (1979), that allows one to recover independence on blocks, on which one can use Bernstein's or Talagrand's inequality.
16 A link with the usual minimax theory: some preliminaries

Assumptions:
- The sequence $(\sigma_k)_{k \ge 0}$ is equal to a known constant $\sigma$.
- $(X_k)_{k \ge 0}$ is a strictly stationary process.
- We observe $(X_{k-1}, Y_k)_{k=1}^n$ (stopping time $N \equiv n$).
- $f$ has Hölder-type smoothness in a neighbourhood of $x$.

Let us fix two constants $\kappa, \tau > 0$.

Assumption (Smoothness of $f$). There are $\eta > 0$, $0 < s \le 1$ and a slowly varying function $l_w$ such that
$$\sup_{y : |y - x| \le h} |f(y) - f(x)| \le w(h), \quad \text{where } w(h) := h^s\, l_w(h),$$
for any $h \le \eta$, and such that $w$ is increasing on $[0, \eta]$, with $w(h) \ge \tau h^2$ and $w(h) \le \kappa$ for any $h \in [0, \eta]$.

This is slightly more general than Hölder smoothness, which corresponds to $l_w \equiv r$ with $r > 0$.
17 A link with the usual minimax theory: some preliminaries

Under these assumptions, we can consider
$$H_w := \min\Big\{h > 0 : \Big(\frac{\psi(h)}{L(h)}\Big)^{1/2} \le w(h)\Big\}$$
= the optimal bandwidth associated with the modulus of continuity $w$.

We can use the adaptive upper bound with $W(h) = w(h)$; the random rate is then $w(H_w)$ for any $f$ with modulus of continuity $w$.

Idea: under a β-mixing assumption, we can say how $L(h)$ concentrates around its expectation $\mathbb E L(h)$ (Bernstein's inequality), so a natural deterministic equivalent of $H_w$ is
$$h_w := \min\Big\{h > 0 : \Big(\frac{\psi(h)}{\mathbb E L(h)}\Big)^{1/2} \le w(h)\Big\}.$$
18 A link with the usual minimax theory: some preliminaries

It is easy to give the behaviour of $h_w$ under the following assumption on $\mathbb P_X$ (using properties of regularly varying functions):

Assumption (Local behaviour of $\mathbb P_X$). There are $\eta > 0$ and $\gamma \ge -1$ such that for any $h \in [0, \eta]$ we have
$$\mathbb P_X([x - h, x + h]) = h^{\gamma + 1}\, l_X(h),$$
where $l_X$ is slowly varying.

This is an extension of the case where $\mathbb P_X$ has a continuous density $f_X$ w.r.t. the Lebesgue measure with $f_X(x) > 0$. It is met, for instance, when $f_X(y) = c\, |y - x|^\gamma$ for $y$ close to $x$.

Lemma. For $n$ large enough, we can write
$$h_w = n^{-1/(2s + \gamma + 1)}\, l_1(1/n) \quad \text{and} \quad w(h_w) = n^{-s/(2s + \gamma + 1)}\, l_2(1/n),$$
where $l_1$ and $l_2$ are slowly varying functions.
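The lemma can be checked numerically in the simplest special case: take $\sigma \equiv 1$, $w(h) = h^s$ and $\mathbb E L(h) = n\, h^{\gamma + 1}$ (all slowly varying factors set to 1). The sketch below solves the defining inequality of $h_w$ by bisection on a log scale; the parameter values are illustrative:

```python
import math

def h_w_numeric(n, s=0.5, gamma=0.0, b=1.0, h0=1.0):
    """Solve numerically for h_w = min{h : sqrt(psi(h) / E L(h)) <= w(h)},
    in the illustrative special case sigma = 1, w(h) = h^s and
    E L(h) = n * h^(gamma + 1) (slowly varying factors set to 1)."""
    def small_enough(h):
        psi = 1.0 + b * math.log(h0 / h)
        # the criterion sqrt(psi(h) / (n h^{gamma+1})) is decreasing in h,
        # while w(h) = h^s is increasing, so bisection applies
        return math.sqrt(psi / (n * h ** (gamma + 1.0))) <= h ** s
    lo, hi = 1e-12, h0
    for _ in range(200):
        mid = math.sqrt(lo * hi)   # bisection on a log scale
        if small_enough(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

For $s = 1/2$ and $\gamma = 0$ the lemma predicts $h_w \approx n^{-1/2}$ up to a slowly varying (here logarithmic) factor: the ratio $h_w / n^{-1/2}$ returned by the bisection grows only like $\sqrt{\log n}$.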
19 A link with the usual minimax theory

Proposition (deterministic equivalent). Grant the previous assumptions on $f$ and $\mathbb P_X$. If $(X_k)$ is geometrically β-mixing, namely if there are $b, c > 0$ such that for any $q \ge 1$:
$$\beta_q \le \exp\big(-(q/b)^{1/c}\big),$$
then, for $n$ large enough, we have
$$\mathbb P\Big[\frac{w(h_w)}{4} \le w(H_w) \le 4\, w(h_w)\Big] \ge 1 - \exp\big(-C_1 n^{\delta} l_2(1/n)\big), \quad \text{where } \delta = \frac{2s}{(2s + \gamma + 1)(c + 1)},$$
where $C_1 > 0$ is a constant and $l_2$ is a slowly varying function, depending on $b, c, \gamma, s, \sigma$ and $l_X, l_w$.

A similar result holds for arithmetic mixing, when $\beta_q \le (b/q)^{1/c}$ with $c < 2s / (\gamma + 1)$.
20 A link with the usual minimax theory

In the following simple nonparametric regression setting:
- $f$ is $s$-Hölder ($w(h) = L h^s$, so $l_w \equiv L$),
- $\mathbb P_X$ has a density $f_X$ which is finite and bounded away from zero at $x$ ($\gamma = 0$),
- $(X_k)_{k \ge 0}$ is geometrically β-mixing, or arithmetically β-mixing with $\beta_q \le (b/q)^{1/c}$ for $c < 2s$,

then $w(h_w)$ has the same order as $[(\log n)/n]^{s/(2s+1)}$ with large probability. Note that $[(\log n)/n]^{s/(2s+1)}$ is the pointwise adaptive minimax rate in this case. Hence the main theorem is consistent with the usual minimax theory of deterministic rates in ergodic situations.
21 Conclusion

The message of this work is twofold:
- The kernel estimator and Lepski's method are very robust to the statistical properties of the model: no need for ergodicity to be almost optimal.
- For the theoretical assessment of an estimator, one can develop a theory involving random rates that depend on the occupation time. The rate would be almost observable if the smoothness of $f$ were known... (confidence bands?). Ergodicity should only be used in a second step of this theory, to derive the asymptotic behaviour of the random rate.

Perspectives:
- Open question: is $\overline W(H^*)$ optimal?
- Applications in econometrics.
- Learning theory beyond the usual i.i.d. regression setting.
22 Bibliography

- Berbee, H. C. P. (1979). Random walks with stationary increments and renewal theory. Vol. 112 of Mathematical Centre Tracts. Mathematisch Centrum, Amsterdam.
- de la Peña, V. H. (1999). A general class of exponential inequalities for martingales and ratios. Ann. Probab.
- Doukhan, P. (1994). Mixing: Properties and examples. Vol. 85 of Lecture Notes in Statistics. Springer-Verlag, New York.
- Kolmogorov, A. N. and Rozanov, J. A. (1960). On a strong mixing condition for stationary Gaussian processes. Teor. Verojatnost. i Primenen.
- Lepski, O. V. (1992). On problems of adaptive estimation in white Gaussian noise. Advances in Soviet Mathematics.
- Pinelis, I. (1994). Optimum bounds for the distributions of martingales in Banach spaces. Ann. Probab.
More information5th-order differentiation
ARBITRARY-ORDER REAL-TIME EXACT ROBUST DIFFERENTIATION A. Levant Applied Mathematics Dept., Tel-Aviv University, Israel E-mail: levant@post.tau.ac.il Homepage: http://www.tau.ac.il/~levant/ 5th-order differentiation
More informationLecture 7 Introduction to Statistical Decision Theory
Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7
More informationA generalization of Strassen s functional LIL
A generalization of Strassen s functional LIL Uwe Einmahl Departement Wiskunde Vrije Universiteit Brussel Pleinlaan 2 B-1050 Brussel, Belgium E-mail: ueinmahl@vub.ac.be Abstract Let X 1, X 2,... be a sequence
More informationStatistical Measures of Uncertainty in Inverse Problems
Statistical Measures of Uncertainty in Inverse Problems Workshop on Uncertainty in Inverse Problems Institute for Mathematics and Its Applications Minneapolis, MN 19-26 April 2002 P.B. Stark Department
More informationTime Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY
Time Series Analysis James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY & Contents PREFACE xiii 1 1.1. 1.2. Difference Equations First-Order Difference Equations 1 /?th-order Difference
More informationTime Series and Forecasting Lecture 4 NonLinear Time Series
Time Series and Forecasting Lecture 4 NonLinear Time Series Bruce E. Hansen Summer School in Economics and Econometrics University of Crete July 23-27, 2012 Bruce Hansen (University of Wisconsin) Foundations
More informationGeneralization theory
Generalization theory Daniel Hsu Columbia TRIPODS Bootcamp 1 Motivation 2 Support vector machines X = R d, Y = { 1, +1}. Return solution ŵ R d to following optimization problem: λ min w R d 2 w 2 2 + 1
More informationStochastic Proximal Gradient Algorithm
Stochastic Institut Mines-Télécom / Telecom ParisTech / Laboratoire Traitement et Communication de l Information Joint work with: Y. Atchade, Ann Arbor, USA, G. Fort LTCI/Télécom Paristech and the kind
More informationµ X (A) = P ( X 1 (A) )
1 STOCHASTIC PROCESSES This appendix provides a very basic introduction to the language of probability theory and stochastic processes. We assume the reader is familiar with the general measure and integration
More information1 Class Organization. 2 Introduction
Time Series Analysis, Lecture 1, 2018 1 1 Class Organization Course Description Prerequisite Homework and Grading Readings and Lecture Notes Course Website: http://www.nanlifinance.org/teaching.html wechat
More informationRegular Variation and Extreme Events for Stochastic Processes
1 Regular Variation and Extreme Events for Stochastic Processes FILIP LINDSKOG Royal Institute of Technology, Stockholm 2005 based on joint work with Henrik Hult www.math.kth.se/ lindskog 2 Extremes for
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationSTAT 200C: High-dimensional Statistics
STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 59 Classical case: n d. Asymptotic assumption: d is fixed and n. Basic tools: LLN and CLT. High-dimensional setting: n d, e.g. n/d
More informationNONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION
NONPARAMETRIC DENSITY ESTIMATION WITH RESPECT TO THE LINEX LOSS FUNCTION R. HASHEMI, S. REZAEI AND L. AMIRI Department of Statistics, Faculty of Science, Razi University, 67149, Kermanshah, Iran. ABSTRACT
More informationExponential tail inequalities for eigenvalues of random matrices
Exponential tail inequalities for eigenvalues of random matrices M. Ledoux Institut de Mathématiques de Toulouse, France exponential tail inequalities classical theme in probability and statistics quantify
More informationOn detection of unit roots generalizing the classic Dickey-Fuller approach
On detection of unit roots generalizing the classic Dickey-Fuller approach A. Steland Ruhr-Universität Bochum Fakultät für Mathematik Building NA 3/71 D-4478 Bochum, Germany February 18, 25 1 Abstract
More informationDISCRETE STOCHASTIC PROCESSES Draft of 2nd Edition
DISCRETE STOCHASTIC PROCESSES Draft of 2nd Edition R. G. Gallager January 31, 2011 i ii Preface These notes are a draft of a major rewrite of a text [9] of the same name. The notes and the text are outgrowths
More informationConvergence at first and second order of some approximations of stochastic integrals
Convergence at first and second order of some approximations of stochastic integrals Bérard Bergery Blandine, Vallois Pierre IECN, Nancy-Université, CNRS, INRIA, Boulevard des Aiguillettes B.P. 239 F-5456
More informationSupplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017
Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION By Degui Li, Peter C. B. Phillips, and Jiti Gao September 017 COWLES FOUNDATION DISCUSSION PAPER NO.
More informationFast-slow systems with chaotic noise
Fast-slow systems with chaotic noise David Kelly Ian Melbourne Courant Institute New York University New York NY www.dtbkelly.com May 1, 216 Statistical properties of dynamical systems, ESI Vienna. David
More informationI forgot to mention last time: in the Ito formula for two standard processes, putting
I forgot to mention last time: in the Ito formula for two standard processes, putting dx t = a t dt + b t db t dy t = α t dt + β t db t, and taking f(x, y = xy, one has f x = y, f y = x, and f xx = f yy
More informationStatistical Properties of Numerical Derivatives
Statistical Properties of Numerical Derivatives Han Hong, Aprajit Mahajan, and Denis Nekipelov Stanford University and UC Berkeley November 2010 1 / 63 Motivation Introduction Many models have objective
More informationLARGE DEVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILED DEPENDENT RANDOM VECTORS*
LARGE EVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILE EPENENT RANOM VECTORS* Adam Jakubowski Alexander V. Nagaev Alexander Zaigraev Nicholas Copernicus University Faculty of Mathematics and Computer Science
More informationNotes 1 : Measure-theoretic foundations I
Notes 1 : Measure-theoretic foundations I Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Wil91, Section 1.0-1.8, 2.1-2.3, 3.1-3.11], [Fel68, Sections 7.2, 8.1, 9.6], [Dur10,
More informationMod-φ convergence I: examples and probabilistic estimates
Mod-φ convergence I: examples and probabilistic estimates Valentin Féray (joint work with Pierre-Loïc Méliot and Ashkan Nikeghbali) Institut für Mathematik, Universität Zürich Summer school in Villa Volpi,
More informationExtreme Value Analysis and Spatial Extremes
Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models
More informationAn Introduction to Probability Theory and Its Applications
An Introduction to Probability Theory and Its Applications WILLIAM FELLER (1906-1970) Eugene Higgins Professor of Mathematics Princeton University VOLUME II SECOND EDITION JOHN WILEY & SONS Contents I
More informationSTA205 Probability: Week 8 R. Wolpert
INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and
More informationKernels to detect abrupt changes in time series
1 UMR 8524 CNRS - Université Lille 1 2 Modal INRIA team-project 3 SSB group Paris joint work with S. Arlot, Z. Harchaoui, G. Rigaill, and G. Marot Computational and statistical trade-offs in learning IHES
More information