Chapter 5

Convexity

Very rough draft

5.1 Approximation of convex functions

Convexity simplifies arguments from Chapter 3, for two reasons: local minima of convex functions are global minima, and pointwise convergence of a sequence of convex functions implies uniform convergence on compacta. The first Lemma in this Section contains the standard result (Rockafellar 1970, Theorem 10.8) that pointwise convergence of a sequence of (deterministic) convex functions {H_n} to a limit function on an open set G implies uniform convergence on each compact subset of G.

For a subset T of R^k and δ > 0, write T_δ for the set {y : d(y, T) ≤ δ}. For a bounded subset T of R^k, call a finite subset T_0 ⊆ T a δ-net for T if d(t, T_0) ≤ δ for every t in T. Equivalently, the union of closed balls ∪_{t ∈ T_0} B(t, δ) covers T; also equivalently, B(t, δ) ∩ T_0 ≠ ∅ for each t in T.

<1> Lemma. (Based on Pollard 1991.) Let K be a compact, convex subset of R^k. Let H(·) and A(·) be real-valued functions defined on K_{4δ}, with H convex and A satisfying the uniform continuity requirement

    |A(s) − A(t)| ≤ η    whenever |s − t| ≤ 2δ and s, t ∈ K_{4δ}.

Let T_0 be a δ-net for K_{3δ}. Then

    sup_{s ∈ K} |H(s) − A(s)| ≤ 4η + 3 max_{t ∈ T_0} |H(t) − A(t)|.

Proof. Write Δ for the maximum of |H(t) − A(t)| over the finite set T_0. The asserted inequality will follow from a pair of bounds:

<2>    H(s) ≤ A(s) + η + Δ        for all s in K_{2δ}
<3>    H(s) ≥ A(s) − 4η − 3Δ     for all s in K.

version: 9 Sept 2010; printed: 6 October 2010. Asymptopia © David Pollard

Proof of upper bound <2>. Let s be a point of K_{2δ}. First I prove that s lies in the convex hull of the finite set

    T(s) := {t ∈ T_0 : |s − t| ≤ 2δ} = T_0 ∩ B(s, 2δ).

Argue by contradiction. If s were not in the convex hull then, by the separating hyperplane theorem, there would exist some unit vector u for which

    u·s > u·t    for all t in T(s).

The closed ball B(s′, δ) around the point s′ = s + δu lies inside B(s, 2δ) ⊆ K_{4δ} and it contains no points of T_0: any t in T_0 ∩ B(s′, δ) would satisfy |t − s| ≤ 2δ and u·t ≥ u·s′ − δ = u·s, putting it in T(s) but violating the strict inequality. Because s′ belongs to K_{3δ}, the empty intersection contradicts the δ-net property of T_0.

It follows that s must be a convex combination of points of T(s). That is, s = Σ_{t ∈ T(s)} α_t t with α_t ≥ 0 and Σ_t α_t = 1. Then

    H(s) ≤ Σ_t α_t H(t)                by convexity of H
         ≤ Σ_t α_t (A(t) + Δ)          by definition of Δ
         ≤ Σ_t α_t (A(s) + η + Δ)      because |s − t| ≤ 2δ for t in T(s)
         = A(s) + η + Δ.

Proof of lower bound <3>. Let s be a point of K. Find a point t of T_0 with |s − t| ≤ δ. Define s′ = t − (s − t). Then s′ ∈ K_{2δ} and t = (s + s′)/2. By convexity of H,

    H(t) ≤ (H(s) + H(s′))/2.

From <2>, H(s′) ≤ A(s′) + η + Δ, and, by definition of Δ, H(t) ≥ A(t) − Δ. Combine the last three inequalities to get

    2A(t) − 2Δ ≤ H(s) + A(s′) + η + Δ.

Both A(s′) and A(t) lie within η of A(s). Inequality <3> follows.
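The δ-net bookkeeping is easy to exercise numerically. The following sketch is my own illustration, not from the text: it takes K = [−1, 1], builds an equally spaced δ-net for K_{3δ}, picks an arbitrary convex H and a nearby smooth A with η obtained from a crude derivative bound, and checks the inequality of Lemma <1>.

```python
import numpy as np

delta = 0.05
K = np.linspace(-1.0, 1.0, 2001)                  # the compact convex set K = [-1, 1]
T0 = np.arange(-1.15, 1.15 + delta / 2, delta)    # equally spaced delta-net for K_{3*delta}

def H(s):
    return np.abs(s) + 0.5 * s**2                 # convex

def A(s):
    return np.abs(s) + 0.5 * s**2 + 0.2 * np.cos(4 * s)   # smooth comparison function

# Modulus bound: |A'| <= 1 + 1.2 + 0.8 = 3 on K_{4*delta}, so |A(s) - A(t)| <= eta
# whenever |s - t| <= 2*delta there.
eta = 3 * 2 * delta

Delta = np.max(np.abs(H(T0) - A(T0)))             # max over the finite net
lhs = np.max(np.abs(H(K) - A(K)))                 # sup over K (on a fine grid)
print(lhs <= 4 * eta + 3 * Delta)
```

The check is crude (grids instead of exact suprema), but it exercises every quantity appearing in the Lemma.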

5.2 Minimizers of convex functions

Suppose M is a strictly convex function on R^k satisfying the growth condition

<4>    M(t)/|t| → ∞    as |t| → ∞.

For each z in R^k, the function M(t, z) := M(t) − z·t achieves its minimum value M*(z) = inf_{t ∈ R^k} (M(t) − z·t) at a unique point, ψ_M(z). I claim that the map z ↦ ψ_M(z) is continuous. I would also like to show that if M(t, z) ≤ M*(z) + ε then t must be close to ψ_M(z) in some suitably uniform sense. What else would be needed to establish a more general form of Lemma <5>?

5.3 A limit theorem for M-estimators

<5> Lemma. Suppose {H_n(t) : t ∈ R^k} is a stochastic process with convex sample paths. Assume

(i) H_n(t̂_n) ≤ inf_{t ∈ R^k} H_n(t) + o_p(1) for some random t̂_n;

(ii) there exists a sequence of random vectors {Z_n} of order O_p(1) for which H_n(t) + Z_n′t → ½|t|² in probability, for each fixed t in R^k.

Then t̂_n = Z_n + o_p(1).

Proof. Define A_n(t) = ½|t|² − Z_n′t = ½|Z_n − t|² − ½|Z_n|², so that H_n(t) − A_n(t) → 0 in probability, for each fixed t. Note that A_n is minimized at Z_n, which will imply that H_n is minimized near Z_n, with high probability.

Suppose ε > 0 and η > 0 are given, small quantities. By (ii), there exists an R > 0 for which P{|Z_n| > R} < ε eventually. Choose a δ < 1/4 small enough that

    |A_n(s) − A_n(t)| ≤ η    when |s − t| ≤ 2δ and |Z_n| ≤ R and s, t ∈ B(0, R + 1).
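For a concrete instance of the claims about ψ_M (my own one-dimensional example, not from the text): take M(t) = t⁴, which is strictly convex and satisfies <4>. Minimizing M(t) − zt sets the derivative 4t³ − z to zero, so ψ_M(z) = sign(z)(|z|/4)^{1/3}, visibly a continuous function of z. A grid search confirms the formula:

```python
import numpy as np

def psi_M(z):
    # closed-form minimizer of t**4 - z*t, from solving 4*t**3 = z
    return np.sign(z) * (np.abs(z) / 4.0) ** (1.0 / 3.0)

t = np.linspace(-3.0, 3.0, 600001)            # grid with spacing 1e-5
zs = [-2.0, -0.5, 0.0, 0.5, 2.0]
err = max(abs(t[np.argmin(t**4 - z * t)] - psi_M(z)) for z in zs)
print(err < 1e-4)   # grid argmin agrees with the closed form
```

Note the Hölder-1/3 behavior of ψ_M at z = 0: continuity can hold without Lipschitz continuity, which is part of why a uniform version of the claim needs extra work.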

Let T_0 be a δ-net for B(0, R + 1). Then there exists a set Ω_n with probability eventually greater than 1 − 2ε, on which

    |Z_n| ≤ R,
    Δ_n := max_{t ∈ T_0} |H_n(t) − A_n(t)| ≤ η,
    H_n(t̂_n) ≤ H_n(Z_n) + η.

On Ω_n, Lemma <1> implies that |H_n(t) − A_n(t)| ≤ 4η + 3Δ_n ≤ 7η for all |t| ≤ R. In particular, for all unit vectors v (with η small enough that 6√η ≤ 1),

    H_n(Z_n) − 7η ≤ A_n(Z_n) = A_n(Z_n + 6√η v) − 18η ≤ H_n(Z_n + 6√η v) − 11η.

Convexity of r ↦ H_n(Z_n + rv) then implies that H_n(Z_n) + 4η ≤ H_n(Z_n + rv) for all r ≥ 6√η and all unit vectors v, which forces t̂_n to lie within the ball of radius 6√η around Z_n.

I would like to strengthen the Lemma by replacing (ii) with: there exists a sequence of random vectors {Z_n} of order O_p(1) for which H_n(t) + Z_n′t → M(t) in probability, for each fixed t in R^k, where M is a strictly convex function satisfying the growth condition <4>. I would like to conclude something like

    t̂_n = ψ_M(Z_n) + o_p(1).

If we had Z_n ⇝ Z then we would get t̂_n ⇝ ψ_M(Z), a very clean limit theorem.

<6> Theorem. Let G_n(θ) = Σ_{i ≤ n} g_{n,i}(θ) be a random process with convex sample paths defined on R^k. Suppose there exist points θ_n in R^k and positive definite matrices J_n for which:

(i) θ̂_n is an estimator for which G_n(θ̂_n) ≤ inf_{θ ∈ R^k} G_n(θ) + o_p(1);

(ii) the function θ ↦ PG_n(θ) is minimized at θ_n;

(iii) each g_{n,i} has a linear approximation near θ_n,

    g_{n,i}(θ) = g_{n,i}(θ_n) − (θ − θ_n)′D_{n,i} + r_{n,i}(θ),

where the D_{n,i} are random vectors with zero expected value;

(iv) PG_n(θ_n + J_n^{−1/2}t) − PG_n(θ_n) = ½|t|² + o(1) for each t;

(v) var( Σ_{i ≤ n} r_{n,i}(θ_n + J_n^{−1/2}t) ) = o(1) for each t;

(vi) Z_n := J_n^{−1/2} Σ_{i ≤ n} D_{n,i} = O_p(1).

Then J_n^{1/2}(θ̂_n − θ_n) = Z_n + o_p(1).

Try to replace (iv) by PG_n(θ_n + J_n^{−1/2}t) − PG_n(θ_n) = M(t) + o(1) for each t, with M as in Section 5.2. Maybe the Theorem will then be strong enough to cover the lasso (Example <17>).

Proof. The standardized random vector t̂_n = J_n^{1/2}(θ̂_n − θ_n) comes within o_p(1) of minimizing the process

    H_n(t) := G_n(θ_n + J_n^{−1/2}t) − G_n(θ_n) = −t′Z_n + Σ_{i ≤ n} r_{n,i}(θ_n + J_n^{−1/2}t).

For a fixed t, assumption (v) ensures that the sum contributed by the {r_{n,i}} lies within o_p(1) of its expectation,

    P Σ_{i ≤ n} r_{n,i}(θ_n + J_n^{−1/2}t) = PG_n(θ_n + J_n^{−1/2}t) − PG_n(θ_n)    because PD_{n,i} = 0
                                          = ½|t|² + o(1)                            by (iv).

Thus H_n(t) + t′Z_n = ½|t|² + o_p(1) for each fixed t. An appeal to Lemma <5> completes the proof.

Remarks. The theorem is almost true if θ ranges over only a subset Θ of R^k, instead of over the whole of R^k.
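To see the shape of the theorem in the simplest possible case (my own toy illustration, not from the text): for least squares, g_{n,i}(θ) = ½(y_i − x_i′θ)², the expansion in (iii) is exact, with D_{n,i} = x_iε_i (up to the sign convention in (iii)) and a purely quadratic remainder, and J_n = Σ_i x_ix_i′. The conclusion J_n^{1/2}(θ̂_n − θ_n) = Z_n then holds with no o_p(1) error at all:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 3
X = rng.normal(size=(n, k))
theta0 = np.array([1.0, -2.0, 0.5])
eps = rng.normal(size=n)
y = X @ theta0 + eps

J = X.T @ X                                   # plays the role of J_n
w, V = np.linalg.eigh(J)
J_half = V @ np.diag(np.sqrt(w)) @ V.T        # symmetric square root J_n^{1/2}
Z = np.linalg.solve(J_half, X.T @ eps)        # Z_n = J_n^{-1/2} * sum_i D_{n,i}

theta_hat = np.linalg.solve(J, X.T @ y)       # exact minimizer of G_n
lhs = J_half @ (theta_hat - theta0)
print(np.allclose(lhs, Z))                    # exact identity for least squares
```

All the work in the theorem goes into showing that the remainder terms r_{n,i} do no more, asymptotically, than they do in this exactly quadratic case.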

5.4 Examples

<7> Example. Suppose X_1, X_2, ... are independent observations from a probability distribution P on the real line with median θ_0. Suppose also that there is a neighborhood of θ_0 in which P has a continuous, strictly positive density f (with respect to Lebesgue measure). Let θ̂_n be the sample median. Show that 2f(θ_0)√n(θ̂_n − θ_0) ⇝ N(0, 1).

<8> Example. Generalize Example <7> to cover the spatial median, argmin_t Σ_{i ≤ n} |t − X_i|, in higher dimensions.

<9> Example. Theorem 1 of Pollard (1991) for L1 regression.

<10> Example. Modify the following result from Dou, Pollard, and Zhou (2009), but with the improved argument from the talk at the Wellner conference:

Let {Q_λ : λ ∈ R} be an exponential family of probability measures with densities dQ_λ/dQ_0 = f_λ(y) = exp(λy − ψ(λ)). Remember that e^{ψ(λ)} = Q_0 e^{λy} and that the distribution Q_λ has mean ψ^{(1)}(λ) and variance ψ^{(2)}(λ). We assume:

(ψ3) There exists an increasing real function G on R^+ such that

    |ψ^{(3)}(λ + h)| ≤ ψ^{(2)}(λ) G(|h|)    for all λ and h.

Without loss of generality we assume G(0) ≥ 1.

(ψ2) For each ε > 0 there exists a finite constant C_ε for which ψ^{(2)}(λ) ≤ C_ε exp(ελ²) for all λ ∈ R. Equivalently, ψ^{(2)}(λ) ≤ exp(o(λ²)) as |λ| → ∞.

As shown in Section ??, these assumptions on the ψ function imply that

<11>    h²(Q_λ, Q_{λ+δ}) ≤ δ² ψ^{(2)}(λ) (1 + δ²) G(|δ|)    for all λ, δ ∈ R.

Remark. We may assume that ψ^{(2)}(λ) > 0 for every real λ. Otherwise we would have 0 = ψ^{(2)}(λ_0) = var_{λ_0}(y) = ν f_{λ_0}(y)(y − ψ^{(1)}(λ_0))² for some λ_0, which would make y = ψ^{(1)}(λ_0) for ν almost all y and Q_λ = Q_{λ_0} for every λ.

The theory in this section combines ideas from Portnoy (1988) and from Hjort and Pollard (1993). We write our results in a notation that makes the applications in Sections ?? and ?? more straightforward. The notational cost is that the parameters are indexed by {0, 1, ..., N}. To avoid an excess of parentheses we write N_+ for N + 1. In the applications N changes with the sample size n and Q is replaced by measures of the form Q_{n,a,b,N}.

Suppose ξ_1, ..., ξ_n are (nonrandom) vectors in R^{N_+}. Suppose Q = ⊗_{i ≤ n} Q_{λ_i} with λ_i = ξ_i′γ for a fixed γ = (γ_0, γ_1, ..., γ_N) in R^{N_+}. Under Q, the coordinate maps y_1, ..., y_n are independent random variables with y_i ∼ Q_{λ_i}. The log-likelihood for fitting the model is

    L_n(g) = Σ_{i ≤ n} (ξ_i′g)y_i − ψ(ξ_i′g)    for g ∈ R^{N_+},

which is maximized (over R^{N_+}) at the MLE ĝ (= ĝ_n).

Remark. As a small amount of extra bookkeeping in the following argument would show, we do not need ĝ to exactly maximize L_n. It would suffice to have L_n(ĝ) suitably close to sup_g L_n(g). In particular, we need not be concerned with questions regarding existence or uniqueness of the argmax.

Define

(i) J_n = Σ_{i ≤ n} ξ_iξ_i′ψ^{(2)}(λ_i), an N_+ × N_+ matrix;

(ii) w_i := J_n^{−1/2}ξ_i, an element of R^{N_+};

(iii) W_n = Σ_{i ≤ n} w_i ( y_i − ψ^{(1)}(λ_i) ), an element of R^{N_+}.

Notice that QW_n = 0 and

    var_Q(W_n) = Σ_{i ≤ n} w_iw_i′ψ^{(2)}(λ_i) = I_{N_+}    and    Q|W_n|² = trace( var_Q(W_n) ) = N_+.

<12> Lemma. Suppose 0 < ε_1 ≤ 1/2 and 0 < ε_2 < 1 and

    max_{i ≤ n} |w_i| ≤ ε_1ε_2 / (2G(1)N_+),

with G as in Assumption (ψ3). Then ĝ = γ + J_n^{−1/2}(W_n + r_n) with |r_n| ≤ ε_1 on the set {|W_n| ≤ √(N_+/ε_2)}, which has Q-probability greater than 1 − ε_2.
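The standardization in (i)–(iii) can be checked directly on a small synthetic instance (my own sketch; the Poisson-style choice ψ^{(2)}(λ) = e^λ and the random ξ_i are arbitrary): by construction, Σ_i w_iw_i′ψ^{(2)}(λ_i) collapses to the identity matrix, so Q|W_n|² = N_+.

```python
import numpy as np

rng = np.random.default_rng(2)
n, Nplus = 50, 4
xi = rng.normal(size=(n, Nplus))              # the (nonrandom) vectors xi_i, as rows
gamma = rng.normal(size=Nplus) / 4
psi2 = np.exp(xi @ gamma)                     # psi''(lambda_i) for a Poisson-style family

J = (xi * psi2[:, None]).T @ xi               # J_n = sum_i xi_i xi_i' psi''(lambda_i)
w_eig, V = np.linalg.eigh(J)
J_neg_half = V @ np.diag(w_eig ** -0.5) @ V.T # symmetric J_n^{-1/2}
w = xi @ J_neg_half                           # row i is w_i = J_n^{-1/2} xi_i

varW = (w * psi2[:, None]).T @ w              # var_Q(W_n) = sum_i w_i w_i' psi''(lambda_i)
print(np.allclose(varW, np.eye(Nplus)), np.isclose(np.trace(varW), Nplus))
```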

Proof. The equality Q|W_n|² = N_+ and Tchebychev give Q{|W_n| > √(N_+/ε_2)} ≤ ε_2.

Reparametrize by defining t = J_n^{1/2}(g − γ). The concave function

    L̃_n(t) := L_n(γ + J_n^{−1/2}t) − L_n(γ) = Σ_{i ≤ n} ( y_i w_i′t + ψ(λ_i) − ψ(λ_i + w_i′t) )

is maximized at t̂_n = J_n^{1/2}(ĝ − γ). It has gradient

    ∇L̃_n(t) = Σ_{i ≤ n} w_i ( y_i − ψ^{(1)}(λ_i + w_i′t) ).

For a fixed unit vector u in R^{N_+} and a fixed t in R^{N_+}, consider the real-valued function of the real variable s,

    H(s) := u′∇L̃_n(st) = Σ_{i ≤ n} u′w_i ( y_i − ψ^{(1)}(λ_i + s w_i′t) ),

which has derivatives

    H′(s) = −Σ_{i ≤ n} (u′w_i)(w_i′t) ψ^{(2)}(λ_i + s w_i′t)
    H″(s) = −Σ_{i ≤ n} (u′w_i)(w_i′t)² ψ^{(3)}(λ_i + s w_i′t).

Notice that H(0) = u′W_n and H′(0) = −u′ ( Σ_{i ≤ n} w_iw_i′ψ^{(2)}(λ_i) ) t = −u′t. Write M_n for max_{i ≤ n} |w_i|. By virtue of Assumption (ψ3),

    |H″(s)| ≤ Σ_{i ≤ n} |u′w_i| (w_i′t)² ψ^{(2)}(λ_i) G( |s w_i′t| )
            ≤ M_n G( M_n |s| |t| ) t′ ( Σ_{i ≤ n} w_iw_i′ψ^{(2)}(λ_i) ) t = M_n G( M_n |s| |t| ) |t|².

By Taylor expansion, for some 0 < s* < 1,

    |H(1) − H(0) − H′(0)| ≤ ½ |H″(s*)| ≤ ½ M_n G( M_n |t| ) |t|².

That is,

<13>    |u′( ∇L̃_n(t) − W_n + t )| ≤ ½ M_n G( M_n |t| ) |t|².

Approximation <13> will control the behavior of L(s) := L̃_n(W_n + su), a concave function of the real argument s, for each unit vector u. By concavity, the derivative

    L′(s) = u′∇L̃_n(W_n + su) = −s + R(s)

is a decreasing function of s with

    |R(s)| ≤ ½ M_n G( M_n |W_n + su| ) |W_n + su|².

On the set {|W_n| ≤ √(N_+/ε_2)} we have

    |W_n ± ε_1 u| ≤ √(N_+/ε_2) + ε_1,

implying

    M_n |W_n ± ε_1 u| ≤ [ε_1ε_2 / (2G(1)N_+)] ( √(N_+/ε_2) + ε_1 ) < 1,

so that

    |R(±ε_1)| ≤ ½ M_n G(1) |W_n ± ε_1 u|² ≤ [ε_1ε_2 / (4N_+)] ( √(N_+/ε_2) + ε_1 )² = (ε_1/4)( 1 + ε_1√(ε_2/N_+) )² < (5/8) ε_1.

Deduce that

    L′(ε_1) = −ε_1 + R(ε_1) ≤ −(3/8) ε_1 < 0,
    L′(−ε_1) = ε_1 + R(−ε_1) ≥ (3/8) ε_1 > 0.

The concave function s ↦ L̃_n(W_n + su) must therefore achieve its maximum at some s in the interval [−ε_1, ε_1], for each unit vector u. It follows that |t̂_n − W_n| ≤ ε_1.

<14> Corollary. Suppose ξ_i = Dη_i for some nonsingular matrix D, so that J_n = nDA_nD′ where A_n := n^{−1} Σ_{i ≤ n} η_iη_i′ψ^{(2)}(λ_i). Suppose also that B_n is another nonsingular matrix for which

<15>    |A_n − B_n| ≤ ( 2|B_n^{−1}| )^{−1}

and that

<16>    max_{i ≤ n} |η_i|² ≤ ε²n / ( 32 N_+² G(1)² |B_n^{−1}| )

for some 0 < ε < 1. Then for each set of vectors κ_0, ..., κ_N in R^{N_+} there is a set Y_{κ,ε} with QY_{κ,ε}^c < 2ε on which

    Σ_{0 ≤ j ≤ N} |κ_j′(ĝ − γ)|² ≤ ( 6|B_n^{−1}| / (nε) ) Σ_{0 ≤ j ≤ N} |D^{−1}κ_j|².

Remark. For our applications of the Corollary in Sections ?? and ??, we need D = diag(D_0, D_1, ..., D_N) and κ_j = e_j, the unit vector with a 1 in its jth position, for j ≤ m, and κ_j = 0 for j > m. In our companion paper we will need the more general κ_j's.

Proof. First we establish a bound on the spectral distance between A_n^{−1} and B_n^{−1}. Define H = B_n^{−1}A_n − I. Then |H| ≤ |B_n^{−1}| |A_n − B_n| ≤ 1/2, which justifies the expansion

    |A_n^{−1} − B_n^{−1}| = |( (I + H)^{−1} − I ) B_n^{−1}| ≤ Σ_{j ≥ 1} |H|^j |B_n^{−1}| ≤ |B_n^{−1}|.

As a consequence, |A_n^{−1}| ≤ 2|B_n^{−1}|.

Choose ε_1 = 1/2 and ε_2 = ε in Lemma <12>. The bound on max_{i ≤ n} |η_i| gives the bound on max_{i ≤ n} |w_i| needed by the Lemma:

    n|w_i|² = η_i′D′(J_n/n)^{−1}Dη_i = η_i′A_n^{−1}η_i ≤ |A_n^{−1}| |η_i|² ≤ 2|B_n^{−1}| |η_i|².

Define K_j := J_n^{−1/2}κ_j, so that |κ_j′(ĝ − γ)|² ≤ 2(K_j′W_n)² + 2(K_j′r_n)². By Cauchy–Schwarz,

    Σ_j (K_j′r_n)² ≤ Σ_j |K_j|² |r_n|² = U_κ |r_n|²

where

    U_κ := Σ_j κ_j′J_n^{−1}κ_j = Σ_j n^{−1} (D^{−1}κ_j)′ A_n^{−1} (D^{−1}κ_j) ≤ 2 n^{−1} |B_n^{−1}| Σ_j |D^{−1}κ_j|².

For the contribution V_κ := Σ_j (K_j′W_n)² the Cauchy–Schwarz bound is too crude. Instead, notice that QV_κ = U_κ, which ensures that the complement of the set

    Y_{κ,ε} := {|W_n| ≤ √(N_+/ε)} ∩ {V_κ ≤ U_κ/ε}

has Q-probability less than 2ε. On the set Y_{κ,ε},

    Σ_{0 ≤ j ≤ N} |κ_j′(ĝ − γ)|² ≤ 2V_κ + 2U_κ |r_n|² ≤ 3U_κ/ε.

The asserted bound follows.

<17> Example. Try to apply Theorem <6> to cover Theorem 2 of Knight and Fu (2000). Comment that more recent results, with p = p_n, are much closer to the way the lasso is used.
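The opening step of the proof is a general perturbation bound, worth isolating: if |B^{−1}| |A − B| ≤ 1/2 in spectral norm, then |A^{−1} − B^{−1}| ≤ |B^{−1}| and |A^{−1}| ≤ 2|B^{−1}|. A quick numerical sanity check (my own sketch, with arbitrary matrices standing in for A_n and B_n):

```python
import numpy as np

def spec(M):
    return np.linalg.norm(M, 2)                    # spectral (operator) norm

rng = np.random.default_rng(3)
d = 4
B = rng.normal(size=(d, d))
B = B @ B.T + 0.5 * np.eye(d)                      # positive definite stand-in for B_n
E = rng.normal(size=(d, d))
E = 0.4 * E / (spec(E) * spec(np.linalg.inv(B)))   # scaled so |B^-1| |A - B| <= 0.4 < 1/2
A = B + E

Binv, Ainv = np.linalg.inv(B), np.linalg.inv(A)
print(spec(Ainv - Binv) <= spec(Binv), spec(Ainv) <= 2 * spec(Binv))
```

The geometric-series argument in the proof guarantees both inequalities whenever |H| ≤ 1/2, which the scaling of E enforces here.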

Notes

Convexity: Huber (1964), Brown (1985), and Appendix A of Maritz (1981). Andersen and Gill (1982), Haberman (1989), Niemiro (1992), Bickel, Klaassen, Ritov, and Wellner (1993, pp 38, 473, 519), Ritov (1987), Jurečková (1977).

References

Andersen, P. K. and R. D. Gill (1982). Cox's regression model for counting processes: a large sample study. Annals of Statistics 10.

Bickel, P. J., C. A. J. Klaassen, Y. Ritov, and J. A. Wellner (1993). Efficient and Adaptive Estimation for Semiparametric Models. Baltimore: Johns Hopkins University Press.

Brown, B. M. (1985). Multiparameter linearization theorems. Journal of the Royal Statistical Society, Series B 47.

Dou, W., D. Pollard, and H. H. Zhou (2009). Functional regression for general exponential families. Technical report, Yale University Statistics Department.

Haberman, S. J. (1989). Concavity and estimation. Annals of Statistics 17.

Hjort, N. L. and D. Pollard (1993). Asymptotics for minimisers of convex processes. Technical report, Yale University.

Huber, P. J. (1964). Robust estimation of a location parameter. Annals of Mathematical Statistics 35.

Jurečková, J. (1977). Asymptotic relations of M-estimates and R-estimates in linear regression model. Annals of Statistics 5.

Knight, K. and W. Fu (2000). Asymptotics for lasso-type estimators. Annals of Statistics 28(5).

Maritz, J. S. (1981). Distribution-Free Statistical Methods. Chapman and Hall.

Niemiro, W. (1992). Asymptotics for M-estimators defined by convex minimization. Annals of Statistics 20.

Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory 7.

Portnoy, S. (1988). Asymptotic behavior of likelihood methods for exponential families when the number of parameters tends to infinity. Annals of Statistics 16(1).

Ritov, Y. (1987). Tightness of monotone random fields. Journal of the Royal Statistical Society, Series B 49.

Rockafellar, R. T. (1970). Convex Analysis. Princeton, New Jersey: Princeton University Press.


MATH 722, COMPLEX ANALYSIS, SPRING 2009 PART 5 MATH 722, COMPLEX ANALYSIS, SPRING 2009 PART 5.. The Arzela-Ascoli Theorem.. The Riemann mapping theorem Let X be a metric space, and let F be a family of continuous complex-valued functions on X. We have

More information

. Then V l on K l, and so. e e 1.

. Then V l on K l, and so. e e 1. Sanov s Theorem Let E be a Polish space, and define L n : E n M E to be the empirical measure given by L n x = n n m= δ x m for x = x,..., x n E n. Given a µ M E, denote by µ n the distribution of L n

More information

The Skorokhod problem in a time-dependent interval

The Skorokhod problem in a time-dependent interval The Skorokhod problem in a time-dependent interval Krzysztof Burdzy, Weining Kang and Kavita Ramanan University of Washington and Carnegie Mellon University Abstract: We consider the Skorokhod problem

More information

1 Lyapunov theory of stability

1 Lyapunov theory of stability M.Kawski, APM 581 Diff Equns Intro to Lyapunov theory. November 15, 29 1 1 Lyapunov theory of stability Introduction. Lyapunov s second (or direct) method provides tools for studying (asymptotic) stability

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Chapter 7 Maximum Likelihood Estimation 7. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function

More information

Convex Geometry. Carsten Schütt

Convex Geometry. Carsten Schütt Convex Geometry Carsten Schütt November 25, 2006 2 Contents 0.1 Convex sets... 4 0.2 Separation.... 9 0.3 Extreme points..... 15 0.4 Blaschke selection principle... 18 0.5 Polytopes and polyhedra.... 23

More information

Ergodic Theorems. 1 Strong laws of large numbers

Ergodic Theorems. 1 Strong laws of large numbers Ergodic Theorems 1 Strong laws of large numbers.................. 1 2 Heuristics and proof for the Ergodic theorem......... 3 3 Stationary processes (Theorem ).............. 5 4 The strong law of large

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

In particular, if A is a square matrix and λ is one of its eigenvalues, then we can find a non-zero column vector X with

In particular, if A is a square matrix and λ is one of its eigenvalues, then we can find a non-zero column vector X with Appendix: Matrix Estimates and the Perron-Frobenius Theorem. This Appendix will first present some well known estimates. For any m n matrix A = [a ij ] over the real or complex numbers, it will be convenient

More information

Convergence of generalized entropy minimizers in sequences of convex problems

Convergence of generalized entropy minimizers in sequences of convex problems Proceedings IEEE ISIT 206, Barcelona, Spain, 2609 263 Convergence of generalized entropy minimizers in sequences of convex problems Imre Csiszár A Rényi Institute of Mathematics Hungarian Academy of Sciences

More information

Sharp Sobolev Strichartz estimates for the free Schrödinger propagator

Sharp Sobolev Strichartz estimates for the free Schrödinger propagator Sharp Sobolev Strichartz estimates for the free Schrödinger propagator Neal Bez, Chris Jeavons and Nikolaos Pattakos Abstract. We consider gaussian extremisability of sharp linear Sobolev Strichartz estimates

More information

The expansion of random regular graphs

The expansion of random regular graphs The expansion of random regular graphs David Ellis Introduction Our aim is now to show that for any d 3, almost all d-regular graphs on {1, 2,..., n} have edge-expansion ratio at least c d d (if nd is

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results

Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics

More information

Problem Set. Problem Set #1. Math 5322, Fall March 4, 2002 ANSWERS

Problem Set. Problem Set #1. Math 5322, Fall March 4, 2002 ANSWERS Problem Set Problem Set #1 Math 5322, Fall 2001 March 4, 2002 ANSWRS i All of the problems are from Chapter 3 of the text. Problem 1. [Problem 2, page 88] If ν is a signed measure, is ν-null iff ν () 0.

More information

Doléans measures. Appendix C. C.1 Introduction

Doléans measures. Appendix C. C.1 Introduction Appendix C Doléans measures C.1 Introduction Once again all random processes will live on a fixed probability space (Ω, F, P equipped with a filtration {F t : 0 t 1}. We should probably assume the filtration

More information

Information in a Two-Stage Adaptive Optimal Design

Information in a Two-Stage Adaptive Optimal Design Information in a Two-Stage Adaptive Optimal Design Department of Statistics, University of Missouri Designed Experiments: Recent Advances in Methods and Applications DEMA 2011 Isaac Newton Institute for

More information

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2) 14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence

More information

Asymptotic normality of the L k -error of the Grenander estimator

Asymptotic normality of the L k -error of the Grenander estimator Asymptotic normality of the L k -error of the Grenander estimator Vladimir N. Kulikov Hendrik P. Lopuhaä 1.7.24 Abstract: We investigate the limit behavior of the L k -distance between a decreasing density

More information

The 123 Theorem and its extensions

The 123 Theorem and its extensions The 123 Theorem and its extensions Noga Alon and Raphael Yuster Department of Mathematics Raymond and Beverly Sackler Faculty of Exact Sciences Tel Aviv University, Tel Aviv, Israel Abstract It is shown

More information

Online Bounds for Bayesian Algorithms

Online Bounds for Bayesian Algorithms Online Bounds for Bayesian Algorithms Sham M. Kakade Computer and Information Science Department University of Pennsylvania Andrew Y. Ng Computer Science Department Stanford University Abstract We present

More information

Math 172 Problem Set 8 Solutions

Math 172 Problem Set 8 Solutions Math 72 Problem Set 8 Solutions Problem. (i We have (Fχ [ a,a] (ξ = χ [ a,a] e ixξ dx = a a e ixξ dx = iξ (e iax e iax = 2 sin aξ. ξ (ii We have (Fχ [, e ax (ξ = e ax e ixξ dx = e x(a+iξ dx = a + iξ where

More information

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:

More information

Functional Analysis HW #3

Functional Analysis HW #3 Functional Analysis HW #3 Sangchul Lee October 26, 2015 1 Solutions Exercise 2.1. Let D = { f C([0, 1]) : f C([0, 1])} and define f d = f + f. Show that D is a Banach algebra and that the Gelfand transform

More information

Calculus of Variations. Final Examination

Calculus of Variations. Final Examination Université Paris-Saclay M AMS and Optimization January 18th, 018 Calculus of Variations Final Examination Duration : 3h ; all kind of paper documents (notes, books...) are authorized. The total score of

More information

The Codimension of the Zeros of a Stable Process in Random Scenery

The Codimension of the Zeros of a Stable Process in Random Scenery The Codimension of the Zeros of a Stable Process in Random Scenery Davar Khoshnevisan The University of Utah, Department of Mathematics Salt Lake City, UT 84105 0090, U.S.A. davar@math.utah.edu http://www.math.utah.edu/~davar

More information

We denote the space of distributions on Ω by D ( Ω) 2.

We denote the space of distributions on Ω by D ( Ω) 2. Sep. 1 0, 008 Distributions Distributions are generalized functions. Some familiarity with the theory of distributions helps understanding of various function spaces which play important roles in the study

More information

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3 Index Page 1 Topology 2 1.1 Definition of a topology 2 1.2 Basis (Base) of a topology 2 1.3 The subspace topology & the product topology on X Y 3 1.4 Basic topology concepts: limit points, closed sets,

More information

SUPPLEMENTARY MATERIAL FOR: ADAPTIVE ROBUST VARIABLE SELECTION

SUPPLEMENTARY MATERIAL FOR: ADAPTIVE ROBUST VARIABLE SELECTION SUPPLEMENTARY MATERIAL FOR: ADAPTIVE ROBUST VARIABLE SELECTION By Jianqing Fan, Yingying Fan and Emre Barut Princeton University, University of Southern California and IBM T.J. Watson Research Center APPENDIX

More information

Probability and Measure

Probability and Measure Probability and Measure Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Convergence of Random Variables 1. Convergence Concepts 1.1. Convergence of Real

More information

Notions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy

Notions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy Banach Spaces These notes provide an introduction to Banach spaces, which are complete normed vector spaces. For the purposes of these notes, all vector spaces are assumed to be over the real numbers.

More information

A User s Guide to Measure Theoretic Probability Errata and comments

A User s Guide to Measure Theoretic Probability Errata and comments A User s Guide to Measure Theoretic Probability Errata and comments Chapter 2. page 25, line -3: Upper limit on sum should be 2 4 n page 34, line -10: case of a probability measure page 35, line 20 23:

More information

Corrections. ir j r mod n 2 ). (i r j r ) mod n should be d. p. 83, l. 2 [p. 80, l. 5]: d. p. 92, l. 1 [p. 87, l. 2]: Λ Z d should be Λ Z d.

Corrections. ir j r mod n 2 ). (i r j r ) mod n should be d. p. 83, l. 2 [p. 80, l. 5]: d. p. 92, l. 1 [p. 87, l. 2]: Λ Z d should be Λ Z d. Corrections We list here typos and errors that have been found in the published version of the book. We welcome any additional corrections you may find! We indicate page and line numbers for the published

More information

On Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence:

On Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence: A Omitted Proofs from Section 3 Proof of Lemma 3 Let m x) = a i On Acceleration with Noise-Corrupted Gradients fxi ), u x i D ψ u, x 0 ) denote the function under the minimum in the lower bound By Proposition

More information

MATH 131A: REAL ANALYSIS (BIG IDEAS)

MATH 131A: REAL ANALYSIS (BIG IDEAS) MATH 131A: REAL ANALYSIS (BIG IDEAS) Theorem 1 (The Triangle Inequality). For all x, y R we have x + y x + y. Proposition 2 (The Archimedean property). For each x R there exists an n N such that n > x.

More information

Submitted to the Brazilian Journal of Probability and Statistics

Submitted to the Brazilian Journal of Probability and Statistics Submitted to the Brazilian Journal of Probability and Statistics Multivariate normal approximation of the maximum likelihood estimator via the delta method Andreas Anastasiou a and Robert E. Gaunt b a

More information

GRAPH PARTITIONING USING SINGLE COMMODITY FLOWS [KRV 06] 1. Preliminaries

GRAPH PARTITIONING USING SINGLE COMMODITY FLOWS [KRV 06] 1. Preliminaries GRAPH PARTITIONING USING SINGLE COMMODITY FLOWS [KRV 06] notes by Petar MAYMOUNKOV Theme The algorithmic problem of finding a sparsest cut is related to the combinatorial problem of building expander graphs

More information

Least Squares Based Self-Tuning Control Systems: Supplementary Notes

Least Squares Based Self-Tuning Control Systems: Supplementary Notes Least Squares Based Self-Tuning Control Systems: Supplementary Notes S. Garatti Dip. di Elettronica ed Informazione Politecnico di Milano, piazza L. da Vinci 32, 2133, Milan, Italy. Email: simone.garatti@polimi.it

More information

Global Maxwellians over All Space and Their Relation to Conserved Quantites of Classical Kinetic Equations

Global Maxwellians over All Space and Their Relation to Conserved Quantites of Classical Kinetic Equations Global Maxwellians over All Space and Their Relation to Conserved Quantites of Classical Kinetic Equations C. David Levermore Department of Mathematics and Institute for Physical Science and Technology

More information

Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model

Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model 1. Introduction Varying-coefficient partially linear model (Zhang, Lee, and Song, 2002; Xia, Zhang, and Tong, 2004;

More information

The Heine-Borel and Arzela-Ascoli Theorems

The Heine-Borel and Arzela-Ascoli Theorems The Heine-Borel and Arzela-Ascoli Theorems David Jekel October 29, 2016 This paper explains two important results about compactness, the Heine- Borel theorem and the Arzela-Ascoli theorem. We prove them

More information

Problem Set 5: Solutions Math 201A: Fall 2016

Problem Set 5: Solutions Math 201A: Fall 2016 Problem Set 5: s Math 21A: Fall 216 Problem 1. Define f : [1, ) [1, ) by f(x) = x + 1/x. Show that f(x) f(y) < x y for all x, y [1, ) with x y, but f has no fixed point. Why doesn t this example contradict

More information

Introductory Analysis I Fall 2014 Homework #9 Due: Wednesday, November 19

Introductory Analysis I Fall 2014 Homework #9 Due: Wednesday, November 19 Introductory Analysis I Fall 204 Homework #9 Due: Wednesday, November 9 Here is an easy one, to serve as warmup Assume M is a compact metric space and N is a metric space Assume that f n : M N for each

More information

Concentration behavior of the penalized least squares estimator

Concentration behavior of the penalized least squares estimator Concentration behavior of the penalized least squares estimator Penalized least squares behavior arxiv:1511.08698v2 [math.st] 19 Oct 2016 Alan Muro and Sara van de Geer {muro,geer}@stat.math.ethz.ch Seminar

More information

ANALYSIS QUALIFYING EXAM FALL 2017: SOLUTIONS. 1 cos(nx) lim. n 2 x 2. g n (x) = 1 cos(nx) n 2 x 2. x 2.

ANALYSIS QUALIFYING EXAM FALL 2017: SOLUTIONS. 1 cos(nx) lim. n 2 x 2. g n (x) = 1 cos(nx) n 2 x 2. x 2. ANALYSIS QUALIFYING EXAM FALL 27: SOLUTIONS Problem. Determine, with justification, the it cos(nx) n 2 x 2 dx. Solution. For an integer n >, define g n : (, ) R by Also define g : (, ) R by g(x) = g n

More information

PROBLEMS. (b) (Polarization Identity) Show that in any inner product space

PROBLEMS. (b) (Polarization Identity) Show that in any inner product space 1 Professor Carl Cowen Math 54600 Fall 09 PROBLEMS 1. (Geometry in Inner Product Spaces) (a) (Parallelogram Law) Show that in any inner product space x + y 2 + x y 2 = 2( x 2 + y 2 ). (b) (Polarization

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information