NONLINEAR LEAST-SQUARES ESTIMATION
DAVID POLLARD AND PETER RADCHENKO

ABSTRACT. The paper uses empirical process techniques to study the asymptotics of the least-squares estimator for the fitting of a nonlinear regression function. By combining and extending ideas of Wu and Van de Geer, it establishes new consistency and central limit theorems that hold under only second moment assumptions on the errors. An application to a delicate example of Wu's illustrates the use of the new theorems, leading to a normal approximation to the least-squares estimator with unusual logarithmic rescalings.

1. INTRODUCTION

Consider the model where we observe $y_i$ for $i = 1, \dots, n$, with
\[
(1)\qquad y_i = f_i(\theta) + u_i \quad\text{where } \theta \in \Theta.
\]
The unobserved $f_i$ can be random or deterministic functions. The unobserved errors $u_i$ are random with zero means and finite variances. The index set $\Theta$ might be infinite dimensional. Later in the paper it will prove convenient to also consider triangular arrays of observations.

Think of $f(\theta) = (f_1(\theta), \dots, f_n(\theta))$ and $u = (u_1, \dots, u_n)$ as points in $\mathbb{R}^n$. The model specifies a surface $\mathcal{M}_\Theta = \{f(\theta) : \theta \in \Theta\}$ in $\mathbb{R}^n$.

Date: Original version April 1995; revised March 2002; revised 2 December.
Mathematics Subject Classification. Primary 62E20. Secondary: 60F05, 62G08, 62G20.
Key words and phrases. Nonlinear least squares, empirical processes, subgaussian, consistency, central limit theorem.
The vector of observations $y = (y_1, \dots, y_n)$ is a random point in $\mathbb{R}^n$. The least-squares estimator (LSE) $\hat\theta_n$ is defined to minimize the distance of $y$ to $\mathcal{M}_\Theta$,
\[
\hat\theta_n = \operatorname*{argmin}_{\theta \in \Theta} \|y - f(\theta)\|^2,
\]
where $\|\cdot\|$ denotes the usual Euclidean norm on $\mathbb{R}^n$.

Many authors have considered the behavior of $\hat\theta_n$ as $n \to \infty$ when the $y_i$ are generated by the model for a fixed $\theta_0$ in $\Theta$. When the $f_i$ are deterministic, it is natural to express assertions about convergence of $\hat\theta_n$ in terms of the $n$-dimensional Euclidean distance
\[
\kappa_n(\theta_1, \theta_2) := \|f(\theta_1) - f(\theta_2)\|.
\]
For example, Jennrich (1969) took $\Theta$ to be a compact subset of $\mathbb{R}^p$, the errors $\{u_i\}$ to be iid with zero mean and finite variance, and the $f_i$ to be continuous functions of $\theta$. He proved strong consistency of the least-squares estimator under the assumption that $n^{-1}\kappa_n(\theta_1, \theta_2)^2$ converges uniformly to a continuous function that is zero if and only if $\theta_1 = \theta_2$. He also gave conditions for asymptotic normality.

Under similar assumptions, Wu (1981, Theorem 1) proved that existence of a consistent estimator for $\theta_0$ implies that
\[
(2)\qquad \kappa_n(\theta) := \kappa_n(\theta, \theta_0) \to \infty \quad\text{at each } \theta \ne \theta_0.
\]
If $\Theta$ is finite, the divergence (2) is also a sufficient condition for the existence of a consistent estimator (Wu 1981, Theorem 2). His main consistency result (his Theorem 3) may be reexpressed as a general convergence assertion.

Theorem 1. Suppose the $\{f_i\}$ are deterministic functions indexed by a subset $\Theta$ of $\mathbb{R}^p$. Suppose also that $\sup_i \operatorname{var}(u_i) < \infty$ and $\kappa_n(\theta) \to \infty$ at each $\theta \ne \theta_0$. Let $S$ be a bounded subset of $\Theta \setminus \{\theta_0\}$ and let $R_n := \inf_{\theta \in S} \kappa_n(\theta)$. Suppose there exist constants $\{L_i\}$ such that
(i) $\sup_{\theta \in S} |f_i(\theta) - f_i(\theta_0)| \le L_i$ for each $i$;
(ii) $|f_i(\theta_1) - f_i(\theta_2)| \le L_i |\theta_1 - \theta_2|$ for all $\theta_1, \theta_2 \in S$;
(iii) $\sum_{i \le n} L_i^2 = O(R_n^\alpha)$ for some $\alpha < 4$.
Then $P\{\hat\theta_n \notin S \text{ eventually}\} = 1$.
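To make the definitions concrete, here is a small numerical sketch (ours, not from the paper): it computes the least-squares estimator for Wu's model $f_i(\theta) = \lambda i^{-\mu}$ by brute-force minimization of $\|y - f(\theta)\|^2$ over a finite parameter grid. The sample size, noise level, and grid resolution are arbitrary illustrative choices.

```python
import numpy as np

# Brute-force LSE for the model y_i = lambda0 * i^(-mu0) + u_i.
# Everything here (n, sigma, the grid) is an illustrative choice.
rng = np.random.default_rng(0)
n = 2000
i = np.arange(1, n + 1)

lam0, mu0 = 1.0, 0.3                    # true parameter theta_0 = (lambda0, mu0)
y = lam0 * i**(-mu0) + 0.05 * rng.standard_normal(n)

lams = np.linspace(0.5, 1.5, 41)        # grid contains lambda0 = 1.0
mus = np.linspace(0.0, 0.5, 51)         # grid contains mu0 = 0.3

# Residual sum of squares ||y - f(theta)||^2 for every grid pair, by broadcasting.
f = lams[:, None, None] * i**(-mus[None, :, None])    # shape (41, 51, n)
rss = ((y - f) ** 2).sum(axis=-1)

k, m = np.unravel_index(rss.argmin(), rss.shape)
lam_hat, mu_hat = float(lams[k]), float(mus[m])
print(lam_hat, mu_hat)
```

With $\mu_0 < 1/2$ the surface is well separated and the grid minimizer lands near $\theta_0$; at $\mu_0 = 1/2$, the delicate case studied in Section 4, convergence in $\mu$ is only logarithmic.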
Remark. Assumption (i) implies $\sum_{i \le n} L_i^2 \ge \kappa_n(\theta)^2$ for each $\theta$ in $S$, which, together with (iii), forces $R_n \to \infty$. If $\Theta$ is compact and if for each $\theta \ne \theta_0$ there is a neighborhood $S = S_\theta$ satisfying the conditions of the Theorem, then $\hat\theta_n \to \theta_0$ almost surely.

Wu's paper was the starting point for several authors. For example, both Lai (1994) and Skouras (2000) generalized Wu's consistency results by taking the functions $f_i(\theta) = f_i(\theta, \omega)$ as random processes indexed by $\theta$. They took the $\{u_i\}$ as a martingale difference sequence, with $\{f_i\}$ a predictable sequence of functions with respect to a filtration $\{\mathcal{F}_i\}$.

Another line of development is typified by the work of Van de Geer (1990) and Van de Geer and Wegkamp (1996). They took $f_i(\theta) = f(x_i, \theta)$, where $\mathcal{F} = \{f_\theta : \theta \in \Theta\}$ is a set of deterministic functions and the $(x_i, u_i)$ are iid pairs. In fact they identified $\Theta$ with the index set $\mathcal{F}$. Under a stronger assumption about the errors, they established rates of convergence of $\kappa_n(\hat\theta_n)$ in terms of $L^2$ entropy conditions on $\mathcal{F}$, using empirical process methods that were developed after Wu's work. The stronger assumption was that the errors are uniformly subgaussian. In general, we say that a random variable $W$ has a subgaussian distribution if there exists some finite $\tau$ such that
\[
P\exp(tW) \le \exp\bigl(\tfrac{1}{2}\tau^2 t^2\bigr) \quad\text{for all } t \in \mathbb{R}.
\]
We write $\tau(W)$ for the smallest such $\tau$. Van de Geer assumed that $\sup_i \tau(u_i) < \infty$.

Remark. Notice that we must have $PW = 0$ when $W$ is subgaussian, because the linear term in the expansion of $P\exp(tW)$ must vanish. When $PW = 0$, subgaussianity is equivalent to existence of a finite constant $\beta$ for which $P\{|W| \ge x\} \le 2\exp(-x^2/\beta^2)$ for all $x \ge 0$.

In our paper we try to bring together the two lines of development. Our main motivation for working on nonlinear least squares was an example presented by Wu (1981, page 507). He noted that his consistency theorem has difficulties with a simple model,
\[
(3)\qquad f_i(\theta) = \lambda i^{-\mu} \quad\text{for } \theta = (\lambda, \mu) \in \Theta, \text{ a compact subset of } \mathbb{R} \times \mathbb{R}^+.
\]
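The subgaussian definition is easy to check numerically in simple cases. A minimal example of ours: a Rademacher variable $W$ with $P\{W = \pm 1\} = 1/2$ has $P\exp(tW) = \cosh t \le \exp(t^2/2)$, so $\tau(W) \le 1$.

```python
import numpy as np

# Check the subgaussian mgf bound E exp(tW) <= exp(tau^2 t^2 / 2) with tau = 1
# for a Rademacher variable W, whose mgf is cosh(t).  The t-grid is arbitrary.
t = np.linspace(-20.0, 20.0, 4001)
mgf = np.cosh(t)
bound = np.exp(0.5 * t**2)
ok = bool(np.all(mgf <= bound))
print(ok)
```

The inequality actually holds term by term in the Taylor expansions, since $(2k)! \ge 2^k k!$; the grid evaluation is just a sanity pass.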
For example, condition (2) does not hold for $\theta_0 = (0, 0)$ at any $\theta$ with $\mu > 1/2$. When $\theta_0 = (\lambda_0, 1/2)$, Wu's method fails in a more subtle way, but the method of Van de Geer (1990) would work if the errors satisfied the subgaussian assumption. In Section 4, under only second moment assumptions on the errors, we establish weak consistency and a central limit theorem.

The main idea behind all the proofs (ours, as well as those of Wu and Van de Geer) is quite simple. The LSE also minimizes the random function
\[
(4)\qquad G_n(\theta) := \|y - f(\theta)\|^2 - \|u\|^2 = \kappa_n(\theta)^2 - 2Z_n(\theta),
\quad\text{where } Z_n(\theta) := u \cdot \bigl(f(\theta) - f(\theta_0)\bigr).
\]
In particular, $G_n(\hat\theta_n) \le G_n(\theta_0) = 0$, that is, $\tfrac{1}{2}\kappa_n(\hat\theta_n)^2 \le Z_n(\hat\theta_n)$. For every subset $S$ of $\Theta$,
\[
(5)\qquad P\{\hat\theta_n \in S\} \le P\{\exists\,\theta \in S : Z_n(\theta) \ge \tfrac{1}{2}\kappa_n(\theta)^2\}
\le 4\, P\sup_{\theta \in S} Z_n(\theta)^2 \,\Big/ \inf_{\theta \in S} \kappa_n(\theta)^4.
\]
The final bound calls for a maximal inequality for $Z_n$.

Our methods for controlling $Z_n$ are similar in spirit to those of Van de Geer. Under her subgaussian assumption, for every class of real functions $\{g_i(\theta) : \theta \in \Theta\}$, the process
\[
(6)\qquad X(\theta) = \sum_{i \le n} u_i\, g_i(\theta)
\]
has subgaussian increments. Indeed, if $\tau(u_i) \le \tau$ for all $i$ then
\[
\tau\bigl(X(\theta_1) - X(\theta_2)\bigr)^2 \le \sum_{i \le n} \tau(u_i)^2 \bigl(g_i(\theta_1) - g_i(\theta_2)\bigr)^2 \le \tau^2 \|g(\theta_1) - g(\theta_2)\|^2.
\]
That is, the tails of $X(\theta_1) - X(\theta_2)$ are controlled by the $n$-dimensional Euclidean distance between the vectors $g(\theta_1)$ and $g(\theta_2)$. This property allowed her to invoke a chaining bound (similar to our Theorem 2) for the tail probabilities of $\sup_{\theta \in S} Z_n(\theta)$ for various annuli $S = \{\theta : R \le \kappa_n(\theta) < 2R\}$.

Under the weaker second moment assumption on the errors, we apply symmetrization arguments to transform to a problem involving a new process $Z_n^\circ(\theta)$ with conditionally subgaussian increments. We avoid Van de Geer's subgaussianity assumption at the cost of extra Lipschitz conditions on the $f_i(\theta)$, analogous to Assumption (ii) of Theorem 1,
which lets us invoke chaining bounds for conditional second moments of $\sup_{\theta \in S} Z_n^\circ(\theta)$ for various $S$.

In Section 3 we prove a new consistency theorem (Theorem 3) and a new central limit theorem (Theorem 4), generalizing Wu's Theorem 5 for nonlinear LSEs. More precisely, our consistency theorem corresponds to an explicit bound for $P\{\kappa_n(\hat\theta_n) \ge R\}$, but we state the result in a form that makes comparison with Theorem 1 easier. Our Theorem does not imply almost sure convergence, but our techniques could easily be adapted to that task. We regard the consistency as a preliminary to the next level of asymptotics and not as an end in itself.

We describe the local asymptotic behavior with another approximation result, Theorem 4, which can easily be transformed into a central limit theorem under a variety of mild assumptions on the errors $\{u_i\}$. For example, in Section 4 we apply the Theorem to the model (3), to sharpen the consistency result at $\theta_0 = (1, 1/2)$ into the approximation
\[
(7)\qquad \bigl(l_n^{1/2}(\hat\lambda_n - 1),\; l_n^{3/2}(1 - 2\hat\mu_n)\bigr) = \sum_{i \le n} u_i\, \zeta_{i,n} + o_p(1),
\]
where $l_n := \log n$ and the vectors $\zeta_{i,n}$, specified in Section 4, are of the form $i^{-1/2} l_n^{-1/2}$ times a linear function of $l_i/l_n$. The sum on the right-hand side of (7) is of order $O_p(1)$ when $\sup_i \operatorname{var}(u_i) < \infty$. If the $\{u_i\}$ are also identically distributed, the sum has a limiting multivariate normal distribution.

2. MAXIMAL INEQUALITIES

Assumption (ii) of Theorem 1 ensures that the increments $Z_n(\theta_1) - Z_n(\theta_2)$ are controlled by the ordinary Euclidean distance in $\Theta$; we allow for control by more general metrics. Wu invoked a maximal inequality for sums of random continuous processes, a result derived from a bound on the covering numbers for $\mathcal{M}_\Theta$ as a subset of $\mathbb{R}^n$ under the usual Euclidean distance; we work with covering numbers for other metrics.
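Covering numbers are easy to compute in one dimension, which gives some intuition for the quantities defined next. The following sketch (ours) builds a greedy $\delta$-net for a fine grid in $[0,1]$ under $d(s,t) = |s-t|$; on a line, scanning left to right and placing each center $\delta$ beyond the first uncovered point is optimal, so the count tracks $\lceil 1/(2\delta) \rceil$.

```python
import math
import numpy as np

# Greedy delta-net for a sorted one-dimensional point set S under |s - t|.
# Illustrates covering numbers N(delta, S, d): on a line the greedy
# scan is optimal, so len(centers) is the covering number of S.
def greedy_net(points, delta):
    centers = []
    covered_up_to = -math.inf
    for s in points:
        if s > covered_up_to:
            c = s + delta              # this center covers [s, s + 2*delta]
            centers.append(c)
            covered_up_to = c + delta
    return centers

grid = np.linspace(0.0, 1.0, 1000)
sizes = {delta: len(greedy_net(grid, delta)) for delta in (0.1, 0.05, 0.01)}
print(sizes)
```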
Definition 1. Let $(T, d)$ be a metric space. The covering number $N(\delta, S, d)$ is defined as the size of the smallest $\delta$-net for $S$, that is, the smallest $N$ for which there are points $t_1, \dots, t_N$ in $T$ with $\min_i d(s, t_i) \le \delta$ for every $s$ in $S$.

Remark. The definition is the same for a pseudometric space, that is, a space where $d(\theta_1, \theta_2) = 0$ need not imply $\theta_1 = \theta_2$. In fact, all results in our paper that refer to metric spaces also apply to pseudometric spaces. The slight increase in generality is sometimes convenient when dealing with metrics defined by $L^p$ norms on functions.

Standard chaining arguments (see, for example, Pollard 1989) give maximal inequalities for processes with subgaussian increments controlled by a metric on the index set.

Theorem 2. Let $\{W_t : t \in T\}$ be a stochastic process, indexed by a metric space $(T, d)$, with subgaussian increments. Let $T_\delta$ be a $\delta$-net for $T$. Suppose:
(i) there is a constant $K$ such that $\tau(W_s - W_t) \le K d(s, t)$ for all $s, t \in T$;
(ii) $J(\delta) := \int_0^\delta \rho N(y, T, d)\, dy < \infty$, where $\rho N := \sqrt{1 + \log N}$.
Then there is a universal constant $c_1$ such that
\[
P\sup_{t \in T} W_t^2 \le c_1^2 \Bigl(K J(\delta) + \rho N(\delta, T, d) \max_{s \in T_\delta} \tau(W_s)\Bigr)^2.
\]

Remark. We should perhaps work with outer expectations because, in general, there is no guarantee that a supremum of uncountably many random variables is measurable. For concrete examples, such as the one discussed in Section 4, measurability can usually be established by routine separability arguments. Accordingly, we will ignore the issue in this paper.

Under the assumption that $\operatorname{var}(u_i) \le \sigma^2$, the $X$ process from (6) need not have subgaussian increments. However, it can be bounded in a stochastic sense by a symmetrized process $X^\circ(\theta) := \sum_{i \le n} \epsilon_i u_i g_i(\theta)$, where the $2n$ random variables $\epsilon_1, \dots, \epsilon_n, u_1, \dots, u_n$ are mutually independent with $P\{\epsilon_i = +1\} = 1/2 = P\{\epsilon_i = -1\}$. In fact, for each subset $S$ of the index set $\Theta$,
\[
(8)\qquad P\sup_{\theta \in S} X(\theta)^2 \le 4\, P\sup_{\theta \in S} X^\circ(\theta)^2.
\]
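Inequality (8) can be eyeballed by simulation. The toy setup below is our own (the index set, the functions $g_i$, and the centered skewed error law are arbitrary choices); it compares Monte Carlo estimates of both sides, and the factor 4 leaves ample slack.

```python
import numpy as np

# Monte Carlo illustration of the symmetrization inequality (8):
#   E sup_theta X(theta)^2 <= 4 E sup_theta X°(theta)^2,
# where X(theta) = sum_i u_i g_i(theta), X°(theta) = sum_i eps_i u_i g_i(theta).
rng = np.random.default_rng(1)
n, n_thetas, reps = 50, 8, 20000

thetas = np.linspace(0.5, 2.0, n_thetas)
i = np.arange(1, n + 1)
g = np.cos(np.outer(thetas, i) / n)            # g[t, i] = g_i(theta_t)

u = rng.exponential(size=(reps, n)) - 1.0      # centered, skewed errors
eps = rng.choice([-1.0, 1.0], size=(reps, n))  # independent Rademacher signs

X = u @ g.T                                    # shape (reps, n_thetas)
X_sym = (eps * u) @ g.T
lhs = float(np.mean(np.max(X**2, axis=1)))
rhs = float(4.0 * np.mean(np.max(X_sym**2, axis=1)))
print(lhs, rhs)
```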
For a proof see, for example, van der Vaart and Wellner (1996). The process $X^\circ$ has conditionally subgaussian increments, with
\[
(9)\qquad \tau_u\bigl(X^\circ(\theta_1) - X^\circ(\theta_2)\bigr)^2 \le \sum_{i \le n} u_i^2 \bigl(g_i(\theta_1) - g_i(\theta_2)\bigr)^2.
\]
The subscript $u$ indicates the conditioning on $u$.

Corollary 1. Let $S_\delta$ be a $\delta$-net for $S$ and let $X$ be as in (6). Suppose
(i) $Pu_i = 0$ and $\operatorname{var}(u_i) \le \sigma^2$ for $i = 1, \dots, n$;
(ii) there is a metric $d$ for which $J(\delta) := \int_0^\delta \rho N(y, S, d)\, dy < \infty$;
(iii) there are constants $L_1, \dots, L_n$ for which $|g_i(\theta_1) - g_i(\theta_2)| \le L_i d(\theta_1, \theta_2)$ for all $i$ and all $\theta_1, \theta_2 \in S$;
(iv) there are constants $b_1, \dots, b_n$ for which $|g_i(\theta)| \le b_i$ for all $i$ and all $\theta$ in $S$.
Then there is a universal constant $c_2$ such that
\[
P\sup_{\theta \in S} X(\theta)^2 \le c_2^2 \sigma^2 \bigl(L J(\delta) + B \rho N(\delta, S, d)\bigr)^2,
\]
where $L := \sqrt{\sum_i L_i^2}$ and $B := \sqrt{\sum_i b_i^2}$.

Proof. From (9), $\tau_u(X^\circ(\theta_1) - X^\circ(\theta_2)) \le L_u d(\theta_1, \theta_2)$ where $L_u := \sqrt{\sum_{i \le n} L_i^2 u_i^2}$, and $\tau_u(X^\circ(\theta)) \le B_u := \sqrt{\sum_{i \le n} b_i^2 u_i^2}$. Apply Theorem 2 conditionally to the process $X^\circ$ to bound $P_u \sup_{\theta \in S} X^\circ(\theta)^2$. Then invoke inequality (8), using the fact that $P L_u^2 \le \sigma^2 L^2$ and $P B_u^2 \le \sigma^2 B^2$.
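Before moving to the limit theorems, the way inequality (5) combines with such bounds can be previewed numerically. In the toy model below (ours: a finite parameter grid and $f_i(\theta) = e^{-\theta i/n}$), the bound holds realization by realization: whenever $\hat\theta_n \in S$ we must have $Z_n(\hat\theta_n) \ge \tfrac12 \kappa_n(\hat\theta_n)^2$, which forces $4\sup_S Z_n^2/R^4 \ge 1$.

```python
import numpy as np

# Monte Carlo check of inequality (5) on a finite grid:
#   P{theta_hat in S} <= E[ 4 sup_{theta in S} Z_n(theta)^2 ] / R^4,
# with R = inf_S kappa_n and Z_n(theta) = u . (f(theta) - f(theta_0)).
rng = np.random.default_rng(3)
n, reps = 200, 5000
i = np.arange(1, n + 1)

thetas = np.linspace(0.5, 3.0, 26)
f = np.exp(-np.outer(thetas, i) / n)           # row j is f(theta_j)
j0 = 10                                        # theta_0 = thetas[j0]
kappa = np.linalg.norm(f - f[j0], axis=1)

S = kappa >= 1.0                               # S = {theta : kappa_n(theta) >= 1}
R = kappa[S].min()

u = 0.5 * rng.standard_normal((reps, n))
y = f[j0] + u
rss = (y**2).sum(1, keepdims=True) - 2.0 * y @ f.T + (f**2).sum(1)[None, :]
j_hat = rss.argmin(axis=1)                     # grid LSE for each realization

Z = u @ (f - f[j0]).T
lhs = float(np.mean(S[j_hat]))
rhs = float(np.mean(4.0 * np.max(Z[:, S]**2, axis=1) / R**4))
print(lhs, rhs)
```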
3. LIMIT THEOREMS

Inequality (5) and Corollary 1, with $g_i(\theta) = f_i(\theta) - f_i(\theta_0)$, give us some probabilistic control over $\hat\theta_n$.

Theorem 3. Let $S$ be a subset of $\Theta$ equipped with a pseudometric $d$. Let $\{L_i : i = 1, \dots, n\}$, $\{b_i : i = 1, \dots, n\}$, and $\delta$ be positive constants such that
(i) $|f_i(\theta_1) - f_i(\theta_2)| \le L_i d(\theta_1, \theta_2)$ for all $\theta_1, \theta_2 \in S$;
(ii) $|f_i(\theta) - f_i(\theta_0)| \le b_i$ for all $\theta \in S$;
(iii) $J(\delta) := \int_0^\delta \rho N(y, S, d)\, dy < \infty$.
Then
\[
P\{\hat\theta_n \in S\} \le 4 c_2^2 \sigma^2 \bigl(B \rho N(\delta, S, d) + L J(\delta)\bigr)^2 \big/ R^4,
\]
where $R := \inf\{\kappa_n(\theta) : \theta \in S\}$, $L^2 := \sum_i L_i^2$, and $B^2 := \sum_i b_i^2$.

The Theorem becomes more versatile in its application if we partition $S$ into a countable union of subsets $S_k$, each equipped with its own pseudometric and Lipschitz constants. We then have $P\{\hat\theta_n \in \bigcup_k S_k\}$ smaller than a sum over $k$ of bounds analogous to those in the Theorem. As shown in Section 4, this method works well for the Wu example if we take $S_k = \{\theta : R_k \le \kappa_n(\theta) < R_{k+1}\}$, for an $\{R_k\}$ sequence increasing geometrically.

A similar appeal to Corollary 1, with the $g_i(\theta)$ as partial derivatives of the $f_i(\theta)$ functions, gives us enough local control over $Z_n$ to go beyond consistency. To accommodate the application in Section 4, we change notation slightly by working with a triangular array: for each $n$,
\[
y_{in} = f_{in}(\theta_0) + u_{in}, \quad\text{for } i = 1, 2, \dots, n,
\]
where the $\{u_{in} : i = 1, \dots, n\}$ are unobserved independent random variables with mean zero and variance bounded by $\sigma^2$.

Theorem 4. Suppose $\hat\theta_n \to \theta_0$ in probability, with $\theta_0$ an interior point of $\Theta$, a subset of $\mathbb{R}^p$. Suppose also:
(i) Each $f_{in}$ is continuously differentiable in a neighborhood $\mathcal{N}$ of $\theta_0$, with derivatives $D_{in}(\theta) = \partial f_{in}(\theta)/\partial\theta$.
(ii) $\gamma_n^2 := \sum_{i \le n} |D_{in}(\theta_0)|^2 \to \infty$ as $n \to \infty$.
(iii) There are constants $\{M_{in}\}$ with $\sum_{i \le n} M_{in}^2 = O(\gamma_n^2)$ and a metric $d$ on $\mathcal{N}$ for which $|D_{in}(\theta_1) - D_{in}(\theta_2)| \le M_{in} d(\theta_1, \theta_2)$ for $\theta_1, \theta_2 \in \mathcal{N}$.
(iv) The smallest eigenvalue of the matrix $V_n := \gamma_n^{-2} \sum_{i \le n} D_{in}(\theta_0) D_{in}(\theta_0)'$ is bounded away from zero for $n$ large enough.
(v) $\int_0^1 \rho N(y, \mathcal{N}, d)\, dy < \infty$.
(vi) $d(\theta, \theta_0) \to 0$ as $\theta \to \theta_0$.
Then $\kappa_n(\hat\theta_n) = O_p(1)$ and
\[
\gamma_n(\hat\theta_n - \theta_0) = \sum_{i \le n} \xi_{i,n} u_{in} + o_p(1) = O_p(1),
\]
where $\xi_{i,n} := \gamma_n^{-1} V_n^{-1} D_{in}(\theta_0)$.

Proof. Let $D$ be the $p \times n$ matrix with $i$th column $D_{in}(\theta_0)$, so that $\gamma_n^2 = \operatorname{trace}(DD')$ and $V_n = \gamma_n^{-2} DD'$. The main idea of the proof is to replace $f(\theta)$ by $f(\theta_0) + D'(\theta - \theta_0)$, thereby approximating $\hat\theta_n$ by the least-squares solution
\[
\tilde\theta_n := \theta_0 + (DD')^{-1} D u = \operatorname*{argmin}_{\theta \in \mathbb{R}^p} \|y - f(\theta_0) - D'(\theta - \theta_0)\|.
\]
To simplify notation, assume with no loss of generality that $f(\theta_0) = 0$ and $\theta_0 = 0$. Also, drop extra $n$ subscripts when the meaning is clear. The assertion of the Theorem is that $\hat\theta_n = \tilde\theta_n + o_p(\gamma_n^{-1})$.

Without loss of generality, suppose the smallest eigenvalue of $V_n$ is larger than a fixed constant $c_0^2 > 0$. Then
\[
\gamma_n^2 = \operatorname{trace}(DD') \ge \sup_{|t| = 1} |D't|^2 \ge \inf_{|t| = 1} |D't|^2 = \gamma_n^2 \inf_{|t| = 1} t' V_n t \ge c_0^2 \gamma_n^2,
\]
from which it follows that
\[
(10)\qquad c_0 |t| \le |D't|/\gamma_n \le |t| \quad\text{for all } t \in \mathbb{R}^p.
\]
Similarly, $P|Du|^2 = \operatorname{trace}\bigl(D\, P(uu')\, D'\bigr) \le \sigma^2 \gamma_n^2$, implying that $|Du| = O_p(\gamma_n)$ and $\tilde\theta_n = \gamma_n^{-2} V_n^{-1} D u = O_p(\gamma_n^{-1})$. In particular, $P\{\tilde\theta_n \in \mathcal{N}\} \to 1$, because $\theta_0$ is an interior point of $\Theta$. Note also that $P|\sum_{i \le n} \xi_i u_i|^2 \le \sigma^2 \operatorname{trace}\bigl(\sum_{i \le n} \xi_i \xi_i'\bigr) = \sigma^2 \operatorname{trace}(V_n^{-1}) < \infty$. Consequently $\sum_{i \le n} \xi_i u_i = O_p(1)$.

From the assumed consistency, we know that there is a sequence of balls $\mathcal{N}_n \subseteq \mathcal{N}$ that shrink to $\{0\}$ for which $P\{\hat\theta_n \in \mathcal{N}_n\} \to 1$. From (vi) and (v), it follows that both
\[
(11)\qquad r_n := \sup\{d(\theta, 0) : \theta \in \mathcal{N}_n\} \quad\text{and}\quad J(r_n) = \int_0^{r_n} \rho N(y, \mathcal{N}, d)\, dy
\]
converge to zero as $n \to \infty$.

The $n \times 1$ remainder vector $R(\theta) := f(\theta) - D'\theta$ has $i$th component
\[
R_i(\theta) = f_i(\theta) - D_i(0)'\theta = \theta' \int_0^1 \bigl(D_i(t\theta) - D_i(0)\bigr)\, dt.
\]
Uniformly in the neighborhood $\mathcal{N}_n$ we have
\[
|R(\theta)| \le |\theta| \Bigl(\sum_{i \le n} M_{in}^2\Bigr)^{1/2} r_n = o(|\theta|\, \gamma_n),
\]
which, together with the upper bound from inequality (10), implies
\[
(12)\qquad |f(\theta)|^2 = |D'\theta|^2 + o(\gamma_n^2 |\theta|^2) = O(\gamma_n^2 |\theta|^2) \quad\text{as } \theta \to 0.
\]
In the neighborhood $\mathcal{N}_n$, via (11) we also have $|u'R(\theta)| \le |\theta| \sup_{s \in \mathcal{N}_n} |\sum_i u_i (D_i(s) - D_i(0))|$. From Corollary 1 with $g_i(\theta) = D_i(\theta) - D_i(0)$, deduce that
\[
P\sup_{s \in \mathcal{N}_n} \Bigl|\sum_{i \le n} u_i \bigl(D_i(s) - D_i(0)\bigr)\Bigr|^2 \le c_2^2 \sigma^2 J(r_n)^2 \sum_{i \le n} M_{in}^2 = o(\gamma_n^2),
\]
which implies
\[
(13)\qquad u'R(\theta) = o_p(\gamma_n |\theta|) \quad\text{uniformly for } \theta \in \mathcal{N}_n.
\]
Approximations (12) and (13) give us uniform approximations for the criterion functions in the shrinking neighborhoods $\mathcal{N}_n$:
\[
(14)\qquad
\begin{aligned}
G_n(\theta) &= \|u - f(\theta)\|^2 - \|u\|^2 \\
&= -2u'f(\theta) + |f(\theta)|^2 \\
&= -2u'D'\theta + |D'\theta|^2 + o_p(\gamma_n|\theta|) + o_p(\gamma_n^2|\theta|^2) \\
&= \|u - D'\tilde\theta_n\|^2 - \|u\|^2 + |D'(\theta - \tilde\theta_n)|^2 + o_p(\gamma_n|\theta|) + o_p(\gamma_n^2|\theta|^2).
\end{aligned}
\]
The uniform smallness of the remainder terms lets us approximate $G_n$ at random points that are known to lie in $\mathcal{N}_n$. The rest of the argument is similar to that of Chernoff (1954).

When $\hat\theta_n \in \mathcal{N}_n$ we have $G_n(\hat\theta_n) \le G_n(0)$, implying
\[
|D'(\hat\theta_n - \tilde\theta_n)|^2 + o_p(\gamma_n|\hat\theta_n|) + o_p(\gamma_n^2|\hat\theta_n|^2) \le |D'\tilde\theta_n|^2.
\]
Invoke (10) again, simplifying the last approximation to
\[
c_0^2\, |\gamma_n \hat\theta_n - \gamma_n \tilde\theta_n|^2 \le O_p(1) + o_p\bigl(\gamma_n|\hat\theta_n| + \gamma_n^2|\hat\theta_n|^2\bigr).
\]
It follows that $\hat\theta_n = O_p(\gamma_n^{-1})$ and, via (12), $\kappa_n(\hat\theta_n) = |f(\hat\theta_n)| = O_p(1)$.

We may also assume that $\mathcal{N}_n$ shrinks slowly enough to ensure that $P\{\tilde\theta_n \in \mathcal{N}_n\} \to 1$. When both $\hat\theta_n$ and $\tilde\theta_n$ lie in $\mathcal{N}_n$, the inequality $G_n(\hat\theta_n) \le G_n(\tilde\theta_n)$ and approximation (14) give
\[
|D'(\hat\theta_n - \tilde\theta_n)|^2 + o_p(1) \le o_p(1).
\]
It follows that $\hat\theta_n = \tilde\theta_n + o_p(\gamma_n^{-1})$.

Remark. If the errors are iid and $\max_i |\xi_{i,n}| = o(1)$ then the distribution of $\sum_{i \le n} \xi_{i,n} u_{in}$ is asymptotically $N(0, \sigma^2 V_n^{-1})$.
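The Remark can be illustrated with the derivative vectors $D_i = i^{-1/2}(1, l_i/2l_n)$ that arise in the Wu example of Section 4 (our simulation; the sample size and error law are arbitrary choices). Since $\sum_i \xi_{i,n}\xi_{i,n}' = V_n^{-1}$, the sum $\sum_i \xi_{i,n} u_{in}$ has covariance exactly $\sigma^2 V_n^{-1}$.

```python
import numpy as np

# Covariance check for the linear term in Theorem 4: with
# xi_i = gamma_n^{-1} V_n^{-1} D_i we have sum_i xi_i xi_i' = V_n^{-1},
# so S = sum_i xi_i u_i has covariance V_n^{-1} for iid N(0,1) errors.
rng = np.random.default_rng(2)
n, reps = 500, 20000

i = np.arange(1, n + 1)
ln = np.log(n)
D = np.column_stack([i**-0.5, i**-0.5 * np.log(i) / (2 * ln)])   # (n, 2)

gamma2 = (D**2).sum()
Vn = D.T @ D / gamma2
Vn_inv = np.linalg.inv(Vn)
xi = (D @ Vn_inv) / np.sqrt(gamma2)            # row i is xi_i'

identity_gap = float(np.abs(xi.T @ xi - Vn_inv).max())   # algebraic identity

u = rng.standard_normal((reps, n))
S = u @ xi                                     # (reps, 2) draws of the sum
cov_hat = np.cov(S.T)
print(identity_gap, cov_hat)
```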
4. ANALYSIS OF MODEL (3): WU'S EXAMPLE

The results in this section illustrate the work of our limit theorems in a particular case where Wu's method fails. We prove both consistency and a central limit theorem for the model (3) with $\theta_0 = (\lambda_0, 1/2)$. In fact, without loss of generality, $\lambda_0 = 1$.

As before, let $l_n := \log n$. Remember that $\theta = (\lambda, \mu)$, with $\lambda \in \mathbb{R}$ and $0 \le \mu \le C_\mu$ for a finite constant $C_\mu$ greater than $1/2$, which ensures that $\theta_0 = (1, 1/2)$ is an interior point of the parameter space. Taking $C_\mu = 1/2$ would complicate the central limit theorem only slightly.

The behavior of $\hat\theta_n$ is determined by the behavior of the function
\[
G_n(\gamma) := \sum_{i \le n} i^{-1+\gamma} \quad\text{for } \gamma \le 1,
\]
or its standardized version
\[
g_n(\beta) := G_n(\beta/l_n)/G_n(0) = \sum_{i \le n} \bigl(i^{-1}/G_n(0)\bigr) \exp(\beta l_i/l_n),
\]
which is the moment generating function of the probability distribution that puts mass $i^{-1}/G_n(0)$ at $l_i/l_n$, for $i = 1, \dots, n$. For large $n$, the function $g_n$ is well approximated by the increasing, nonnegative function
\[
g(\beta) := \begin{cases} (e^\beta - 1)/\beta & \text{for } \beta \ne 0, \\ 1 & \text{for } \beta = 0, \end{cases}
\]
the moment generating function of the uniform distribution on $(0, 1)$. More precisely, comparison of the sum with the integral $\int_1^n x^{-1+\gamma}\, dx$ gives
\[
(15)\qquad G_n(\gamma) = l_n\, g(\gamma l_n) + r_n(\gamma) \quad\text{with } 0 \le r_n(\gamma) \le 1 \text{ for } \gamma \le 1.
\]
The distributions corresponding to both $g_n$ and $g$ are concentrated on $[0, 1]$. Both functions have the properties described in the following lemma.

Lemma 1. Suppose $h(\gamma) = P\exp(\gamma x)$, the moment generating function of a probability distribution concentrated on $[0, 1]$. Then
(i) $\log h$ is convex;
(ii) $h(\gamma)^2/h(2\gamma)$ is unimodal: increasing for $\gamma < 0$, decreasing for $\gamma > 0$, achieving its maximum value of 1 at $\gamma = 0$;
(iii) $h'(\gamma) \le h(\gamma)$.

Proof. Assertion (i) is just the well known fact that the logarithm of a moment generating function is convex. Thus $h'/h$, the derivative of $\log h$, is an increasing function, which implies (ii) because
\[
\frac{d}{d\gamma} \log\frac{h(\gamma)^2}{h(2\gamma)} = 2\left(\frac{h'(\gamma)}{h(\gamma)} - \frac{h'(2\gamma)}{h(2\gamma)}\right).
\]
Property (iii) comes from the representation $h'(\gamma) = P(x e^{\gamma x})$, because $x \le 1$.

Remark. Direct calculation shows that $g(\gamma)^2/g(2\gamma)$ is a symmetric function of $\gamma$.

Reparametrize by putting $\beta = (1 - 2\mu) l_n$, with $(1 - 2C_\mu) l_n \le \beta \le l_n$, and $\alpha = \lambda\sqrt{G_n(\beta/l_n)}$. Notice that $|f(\theta)| = |\alpha|$ and that $\theta_0$ corresponds to $\alpha_0 = \sqrt{G_n(0)} \approx \sqrt{l_n}$ and $\beta_0 = 0$. Also
\[
f_i(\theta) = \alpha\, \nu_i(\beta/l_n) \quad\text{where } \nu_i(\gamma) := i^{-1/2} \exp(\gamma l_i/2)\big/\sqrt{G_n(\gamma)},
\]
and
\[
(16)\qquad \kappa_n(\theta)^2 = G_n(0)\bigl(\lambda^2 g_n(\beta) - 2\lambda g_n(\beta/2) + 1\bigr).
\]
We define $\bar\nu_i := \sup_{\gamma \le 1} \nu_i(\gamma)$.

Lemma 2. For all $(\alpha, \beta)$ corresponding to $\theta = (\lambda, \mu) \in \mathbb{R} \times [0, C_\mu]$:
(i) $\kappa_n(\theta) - \sqrt{G_n(0)} \le |\alpha| \le \kappa_n(\theta) + \sqrt{G_n(0)}$;
(ii) $\sum_{i \le n} \bar\nu_i^2 = O(\log\log n)$;
(iii) $|d\nu_i(\beta/l_n)/d\beta| \le \tfrac{1}{2}\nu_i(\beta/l_n)$;
(iv) $|f_i(\alpha_1, \beta_1) - f_i(\alpha_2, \beta_2)| \le \bigl(|\alpha_1 - \alpha_2| + \tfrac{1}{2}|\alpha_2|\,|\beta_1 - \beta_2|\bigr)\bar\nu_i$;
(v) $|f_i(\theta) - f_i(\theta_0)| \le i^{-1/2} + |\alpha|\,\bar\nu_i$.

Proof. Inequalities (i) and (v) follow from the triangle inequality.
For inequality (ii), first note that $\bar\nu_1 \le 1$. For $i \ge 2$, separate out contributions from three ranges:
\[
\bar\nu_i^2 = \max\Bigl(\sup_{1/l_n \le \gamma \le 1} \nu_i(\gamma)^2,\; \sup_{|\gamma| < 1/l_n} \nu_i(\gamma)^2,\; \sup_{\gamma \le -1/l_n} \nu_i(\gamma)^2\Bigr).
\]
For $\gamma \ge 1/l_n$, invoke (15) to get a tractable upper bound:
\[
\nu_i(\gamma)^2 = \frac{i^{-1}\exp(\gamma l_i)}{G_n(\gamma)} \le \frac{i^{-1}\exp(\gamma l_i)}{l_n\, g(\gamma l_n)}
= \frac{\gamma\, i^{-1}\exp(\gamma l_i)}{\exp(\gamma l_n) - 1}
\le \frac{i^{-1}\exp\bigl(\log\gamma + \gamma\log(i/n)\bigr)}{1 - e^{-1}}.
\]
The last expression achieves its maximum over $[1/l_n, 1]$ at
\[
\gamma_0 := \begin{cases} 1/\log(n/i) & \text{if } 1 \le i \le n/e, \\ 1 & \text{if } n/e \le i \le n, \end{cases}
\]
which gives
\[
(17)\qquad \sup_{1/l_n \le \gamma \le 1} \nu_i(\gamma)^2 \le \frac{e^{-1}}{1 - e^{-1}}\; n^{-1} H(i/n) \quad\text{for } i \le n/e,
\]
where $H(x) := 1/\bigl(x\log(1/x)\bigr)$.

Similarly, if $-1 < \gamma l_n < 1$,
\[
\nu_i(\gamma)^2 = \frac{\exp(\gamma l_i)}{i\, G_n(\gamma)} \le \frac{\exp(l_i/l_n)}{i\, l_n\, g(-1)} \le \frac{e}{g(-1)\, i\, l_n}.
\]
The last term is smaller than a constant multiple of the bound from (17). Finally, if $\gamma = -\delta \le -1/l_n$ and $i \ge 2$ then
\[
\nu_i(\gamma)^2 \le \frac{i^{-1}\exp(-\delta l_i)\,\delta}{1 - \exp(-\delta l_n)}
\le \frac{i^{-1}\exp(\log\delta - \delta l_i)}{1 - e^{-1}}
\le \frac{e^{-1}}{1 - e^{-1}}\,\frac{1}{i\, l_i}.
\]
In summary, for some universal constant $C$,
\[
\bar\nu_i^2 \le C \max\Bigl(n^{-1} H\bigl((i \wedge n/e)/n\bigr),\; \frac{1}{i\, l_i}\Bigr) \quad\text{for } 2 \le i \le n.
\]
Bounding sums by integrals (both $\int_{1/n}^{1/e} H(x)\, dx = \log\log n$ and $\sum_{2 \le i \le n} 1/(i\, l_i)$ are of order $\log\log n$, and the terms with $i > n/e$ contribute $O(1)$), we thus have $\sum_{i \le n} \bar\nu_i^2 = O(\log\log n)$.
For (iii) note that
\[
2\frac{d}{d\beta}\nu_i(\beta/l_n) = 2\frac{d}{d\beta}\left(\frac{i^{-1/2}\exp\bigl(\tfrac{1}{2}\beta l_i/l_n\bigr)}{\sqrt{G_n(0)\, g_n(\beta)}}\right)
= \left(\frac{l_i}{l_n} - \frac{g_n'(\beta)}{g_n(\beta)}\right)\nu_i(\beta/l_n),
\]
which is bounded in absolute value by $\nu_i(\beta/l_n)$ because $0 \le g_n'(\beta) \le g_n(\beta)$ and $0 \le l_i/l_n \le 1$.

For (iv),
\[
|f_i(\alpha_1, \beta_1) - f_i(\alpha_2, \beta_2)| \le |\alpha_1 - \alpha_2|\,\nu_i(\beta_1/l_n) + |\alpha_2|\,|\nu_i(\beta_1/l_n) - \nu_i(\beta_2/l_n)|
\le |\alpha_1 - \alpha_2|\,\bar\nu_i + \tfrac{1}{2}|\alpha_2|\,|\beta_1 - \beta_2|\,\bar\nu_i,
\]
the bound for the second term coming from the mean-value theorem and (iii).

Lemma 3. For $\epsilon > 0$, let $N_\epsilon := \{\theta : \max(|\lambda - 1|, |\beta|) \le \epsilon\}$. If $\epsilon$ is small enough, there exists a constant $C_\epsilon > 0$ such that $\inf\{\kappa_n(\theta) : \theta \notin N_\epsilon\} \ge C_\epsilon\sqrt{l_n}$ when $n$ is large enough.

Proof. Suppose $|\beta| \ge \epsilon$. Remember that $G_n(0) \approx l_n$. Minimize over $\lambda$ the lower bound (16) for $\kappa_n(\theta)^2$ by choosing $\lambda = g_n(\beta/2)/g_n(\beta)$, then invoke Lemma 1(ii):
\[
\frac{\kappa_n(\theta)^2}{G_n(0)} \ge 1 - \frac{g_n(\beta/2)^2}{g_n(\beta)}
\ge 1 - \max\left(\frac{g_n(\epsilon/2)^2}{g_n(\epsilon)}, \frac{g_n(-\epsilon/2)^2}{g_n(-\epsilon)}\right)
\to 1 - \frac{g(\epsilon/2)^2}{g(\epsilon)} > 0.
\]
If $|\beta| \le \epsilon$ and $\epsilon$ is small enough to make $(1 - \epsilon)e^{\epsilon/2} < 1 < (1 + \epsilon)e^{-\epsilon/2}$, use
\[
\kappa_n(\theta)^2 = \sum_{i \le n} i^{-1}\bigl(\lambda\exp(\beta l_i/2l_n) - 1\bigr)^2.
\]
If $\lambda \ge 1 + \epsilon$, bound each summand from below by $i^{-1}\bigl((1 + \epsilon)e^{-\epsilon/2} - 1\bigr)^2$. If $\lambda \le 1 - \epsilon$, bound each summand from below by $i^{-1}\bigl((1 - \epsilon)e^{\epsilon/2} - 1\bigr)^2$.

4.1. Consistency. On the annulus $S_R := \{\theta : R \le \kappa_n(\theta) < 2R\}$ we have
\[
|f_i(\theta_1) - f_i(\theta_2)| \le K_R\,\bar\nu_i\, d_R(\theta_1, \theta_2) \quad\text{and}\quad |f_i(\theta) - f_i(\theta_0)| \le b_i := i^{-1/2} + K_R\,\bar\nu_i,
\]
where
\[
K_R := 2R + \sqrt{G_n(0)} \quad\text{and}\quad d_R(\theta_1, \theta_2) := |\alpha_1 - \alpha_2|/K_R + |\beta_1 - \beta_2|.
\]
Note that
\[
\sum_{i \le n}\bigl(i^{-1/2} + K_R\,\bar\nu_i\bigr)^2 = O\bigl(l_n + K_R^2\log l_n\bigr) = O(K_R^2 L_n) \quad\text{where } L_n := \log\log n.
\]
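Two of the ingredients used in this section are easy to verify numerically (our own checks, with arbitrary $n$ and grids): the integral-comparison remainder in (15) stays in $[0, 1]$, and the ratio $g_n(\beta)^2/g_n(2\beta)$ has the unimodal shape promised by Lemma 1(ii).

```python
import numpy as np

# (a) Approximation (15): G_n(gamma) = l_n g(gamma l_n) + r_n(gamma),
#     with 0 <= r_n(gamma) <= 1 for gamma <= 1.
# (b) Lemma 1(ii) for g_n: q(beta) = g_n(beta)^2 / g_n(2 beta) peaks at
#     beta = 0 with value 1 and decreases away from 0 on either side.
n = 100000
i = np.arange(1, n + 1)
ln, li = np.log(n), np.log(i)

def g(beta):
    return np.expm1(beta) / beta if beta != 0 else 1.0

remainders = [float(np.sum(i ** (gamma - 1.0)) - ln * g(gamma * ln))
              for gamma in (-1.0, -0.5, 0.0, 0.5, 1.0)]

w = (1.0 / i) / np.sum(1.0 / i)        # masses i^{-1}/G_n(0) at l_i/l_n

def g_n(beta):
    return float(np.sum(w * np.exp(beta * li / ln)))

betas = np.linspace(-5.0, 5.0, 101)    # betas[50] == 0.0
q = np.array([g_n(b) ** 2 / g_n(2 * b) for b in betas])
print(remainders, float(q.max()))
```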
The rectangle $\{|\alpha| \le K_R,\ |\beta| \le c\, l_n\}$ can be partitioned into $O(y^{-1} \cdot l_n/y)$ subrectangles of $d_R$-diameter at most $y$. Thus $N(y, S_R, d_R) \le C_0\, l_n/y^2$ for a constant $C_0$ that depends only on $C_\mu$, which gives
\[
\int_0^1 \rho N(y, S_R, d_R)\, dy = O(\sqrt{L_n}).
\]
Apply Theorem 3 with $\delta = 1$ to conclude that
\[
P\{\hat\theta_n \in S_R\} \le C_1 K_R^2 L_n^2/R^4 \le C_2 (R^2 + l_n) L_n^2/R^4.
\]
Put $R = C_3 2^k (l_n L_n^2)^{1/4}$, then sum over $k$ to deduce that $P\{\kappa_n(\hat\theta_n) \ge C_3 (l_n L_n^2)^{1/4}\} \le \epsilon$ eventually, if the constant $C_3$ is large enough. That is, $\kappa_n(\hat\theta_n) = O_p\bigl((l_n L_n^2)^{1/4}\bigr)$ and, via Lemma 3, $\hat\lambda_n - 1 = o_p(1)$ and $2l_n(\mu_0 - \hat\mu_n) = \hat\beta_n = o_p(1)$.

4.2. Central limit theorem. This time work with the $(\lambda, \beta)$ reparametrization, with $f_i(\lambda, \beta) = \lambda\, i^{-1/2 + \beta/2l_n}$,
\[
D_i(\lambda, \beta) = \left(\frac{\partial f_i(\lambda, \beta)}{\partial\lambda}, \frac{\partial f_i(\lambda, \beta)}{\partial\beta}\right) = \bigl(1/\lambda,\; l_i/2l_n\bigr)\, f_i(\lambda, \beta),
\]
and $\theta_0 = (\lambda_0, \beta_0) = (1, 0)$. Take $d$ as the usual two-dimensional Euclidean distance in the $(\lambda, \beta)$ space. For simplicity of notation, we omit some $n$ subscripts, even though the relationship between $\theta$ and $(\lambda, \beta)$ changes with $n$. We have just shown that the LSE $(\hat\lambda_n, \hat\beta_n)$ is consistent.

Comparison of sums with analogous integrals gives the approximations
\[
(18)\qquad \sum_{i \le n} i^{-1} l_i^{\,p} = \frac{l_n^{\,p+1}}{p+1} + r_p \quad\text{with } |r_p| \le 1, \text{ for } p = 0, 1, 2, \dots
\]
In consequence,
\[
\gamma_n^2 = \sum_{i \le n} |D_i(\lambda_0, \beta_0)|^2 = \sum_{i \le n} i^{-1}\bigl(1 + l_i^2/4l_n^2\bigr) = \tfrac{13}{12}\, l_n + O(1)
\]
and
\[
V_n := l_n^{-1}\sum_{i \le n} i^{-1}\begin{pmatrix} 1 & l_i/2l_n \\ l_i/2l_n & l_i^2/4l_n^2 \end{pmatrix} = V + O(1/l_n),
\quad\text{where } V := \begin{pmatrix} 1 & 1/4 \\ 1/4 & 1/12 \end{pmatrix}.
\]
The smaller eigenvalue of $V_n$ converges to the smaller eigenvalue of the positive definite matrix $V$, which is strictly positive.

Within the neighborhood $N_\epsilon := \{\max(|\lambda - 1|, |\beta|) \le \epsilon\}$, for a fixed $\epsilon \le 1/2$, both $|f_i(\lambda, \beta)|$ and $|D_i(\lambda, \beta)|$ are bounded by a multiple of $i^{-1/2}$. Thus
\[
|D_i(\theta_1) - D_i(\theta_2)| \le |\lambda_1^{-1} - \lambda_2^{-1}|\,|f_i(\theta_1)| + 3|f_i(\theta_1) - f_i(\theta_2)| \le C_\epsilon\, i^{-1/2}\, d(\theta_1, \theta_2).
\]
That is, we may take $M_i$ as a multiple of $i^{-1/2}$, which gives $\sum_{i \le n} M_i^2 = O(l_n)$. All the conditions of Theorem 4 are satisfied. We have
\[
\sqrt{l_n}\bigl(\hat\lambda_n - 1,\; \hat\beta_n\bigr) = \sum_{i \le n} u_i\, i^{-1/2}\, l_n^{-1/2}\bigl(1,\; l_i/2l_n\bigr)V^{-1} + o_p(1).
\]

REFERENCES

Chernoff, H. (1954). On the distribution of the likelihood ratio. Annals of Mathematical Statistics 25.
Jennrich, R. I. (1969). Asymptotic properties of non-linear least squares estimators. Annals of Mathematical Statistics 40.
Lai, T. L. (1994). Asymptotic properties of nonlinear least-squares estimates in stochastic regression models. Annals of Statistics 22.
Pollard, D. (1989). Asymptotics via empirical processes (with discussion). Statistical Science 4.
Skouras, K. (2000). Strong consistency in nonlinear stochastic regression models. Annals of Statistics 28.
Van de Geer, S. (1990). Estimating a regression function. Annals of Statistics 18.
Van de Geer, S. and M. Wegkamp (1996). Consistency for the least squares estimator in nonparametric regression. Annals of Statistics 24.
van der Vaart, A. W. and J. A. Wellner (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer-Verlag.
Wu, C.-F. (1981). Asymptotic theory of nonlinear least squares estimation. Annals of Statistics 9.

STATISTICS DEPARTMENT, YALE UNIVERSITY, BOX YALE STATION, NEW HAVEN, CT
More informationMath 259: Introduction to Analytic Number Theory Functions of finite order: product formula and logarithmic derivative
Math 259: Introduction to Analytic Number Theory Functions of finite order: product formula and logarithmic derivative This chapter is another review of standard material in complex analysis. See for instance
More informationMath 259: Introduction to Analytic Number Theory Functions of finite order: product formula and logarithmic derivative
Math 259: Introduction to Analytic Number Theory Functions of finite order: product formula and logarithmic derivative This chapter is another review of standard material in complex analysis. See for instance
More informationarxiv:submit/ [math.st] 6 May 2011
A Continuous Mapping Theorem for the Smallest Argmax Functional arxiv:submit/0243372 [math.st] 6 May 2011 Emilio Seijo and Bodhisattva Sen Columbia University Abstract This paper introduces a version of
More informationStochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions
International Journal of Control Vol. 00, No. 00, January 2007, 1 10 Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions I-JENG WANG and JAMES C.
More informationReal Analysis Problems
Real Analysis Problems Cristian E. Gutiérrez September 14, 29 1 1 CONTINUITY 1 Continuity Problem 1.1 Let r n be the sequence of rational numbers and Prove that f(x) = 1. f is continuous on the irrationals.
More informationLaplace s Equation. Chapter Mean Value Formulas
Chapter 1 Laplace s Equation Let be an open set in R n. A function u C 2 () is called harmonic in if it satisfies Laplace s equation n (1.1) u := D ii u = 0 in. i=1 A function u C 2 () is called subharmonic
More informationConcentration behavior of the penalized least squares estimator
Concentration behavior of the penalized least squares estimator Penalized least squares behavior arxiv:1511.08698v2 [math.st] 19 Oct 2016 Alan Muro and Sara van de Geer {muro,geer}@stat.math.ethz.ch Seminar
More informationBayesian Regularization
Bayesian Regularization Aad van der Vaart Vrije Universiteit Amsterdam International Congress of Mathematicians Hyderabad, August 2010 Contents Introduction Abstract result Gaussian process priors Co-authors
More informationMAT 257, Handout 13: December 5-7, 2011.
MAT 257, Handout 13: December 5-7, 2011. The Change of Variables Theorem. In these notes, I try to make more explicit some parts of Spivak s proof of the Change of Variable Theorem, and to supply most
More informationLeast squares under convex constraint
Stanford University Questions Let Z be an n-dimensional standard Gaussian random vector. Let µ be a point in R n and let Y = Z + µ. We are interested in estimating µ from the data vector Y, under the assumption
More informationTools from Lebesgue integration
Tools from Lebesgue integration E.P. van den Ban Fall 2005 Introduction In these notes we describe some of the basic tools from the theory of Lebesgue integration. Definitions and results will be given
More informationEstimation and Inference of Quantile Regression. for Survival Data under Biased Sampling
Estimation and Inference of Quantile Regression for Survival Data under Biased Sampling Supplementary Materials: Proofs of the Main Results S1 Verification of the weight function v i (t) for the lengthbiased
More informationThe concentration of the chromatic number of random graphs
The concentration of the chromatic number of random graphs Noga Alon Michael Krivelevich Abstract We prove that for every constant δ > 0 the chromatic number of the random graph G(n, p) with p = n 1/2
More informationDA Freedman Notes on the MLE Fall 2003
DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar
More informationA uniform central limit theorem for neural network based autoregressive processes with applications to change-point analysis
A uniform central limit theorem for neural network based autoregressive processes with applications to change-point analysis Claudia Kirch Joseph Tadjuidje Kamgaing March 6, 20 Abstract We consider an
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued
Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research
More informationExistence and Uniqueness
Chapter 3 Existence and Uniqueness An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect
More informationMetric Spaces and Topology
Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies
More informationChapter 4: Asymptotic Properties of the MLE
Chapter 4: Asymptotic Properties of the MLE Daniel O. Scharfstein 09/19/13 1 / 1 Maximum Likelihood Maximum likelihood is the most powerful tool for estimation. In this part of the course, we will consider
More informationRefined Bounds on the Empirical Distribution of Good Channel Codes via Concentration Inequalities
Refined Bounds on the Empirical Distribution of Good Channel Codes via Concentration Inequalities Maxim Raginsky and Igal Sason ISIT 2013, Istanbul, Turkey Capacity-Achieving Channel Codes The set-up DMC
More informationA GENERAL THEOREM ON APPROXIMATE MAXIMUM LIKELIHOOD ESTIMATION. Miljenko Huzak University of Zagreb,Croatia
GLASNIK MATEMATIČKI Vol. 36(56)(2001), 139 153 A GENERAL THEOREM ON APPROXIMATE MAXIMUM LIKELIHOOD ESTIMATION Miljenko Huzak University of Zagreb,Croatia Abstract. In this paper a version of the general
More informationSemiparametric posterior limits
Statistics Department, Seoul National University, Korea, 2012 Semiparametric posterior limits for regular and some irregular problems Bas Kleijn, KdV Institute, University of Amsterdam Based on collaborations
More informationLECTURE 15: COMPLETENESS AND CONVEXITY
LECTURE 15: COMPLETENESS AND CONVEXITY 1. The Hopf-Rinow Theorem Recall that a Riemannian manifold (M, g) is called geodesically complete if the maximal defining interval of any geodesic is R. On the other
More informationFrom now on, we will represent a metric space with (X, d). Here are some examples: i=1 (x i y i ) p ) 1 p, p 1.
Chapter 1 Metric spaces 1.1 Metric and convergence We will begin with some basic concepts. Definition 1.1. (Metric space) Metric space is a set X, with a metric satisfying: 1. d(x, y) 0, d(x, y) = 0 x
More informationPCA with random noise. Van Ha Vu. Department of Mathematics Yale University
PCA with random noise Van Ha Vu Department of Mathematics Yale University An important problem that appears in various areas of applied mathematics (in particular statistics, computer science and numerical
More informationAn introduction to Mathematical Theory of Control
An introduction to Mathematical Theory of Control Vasile Staicu University of Aveiro UNICA, May 2018 Vasile Staicu (University of Aveiro) An introduction to Mathematical Theory of Control UNICA, May 2018
More informationLAW OF LARGE NUMBERS FOR THE SIRS EPIDEMIC
LAW OF LARGE NUMBERS FOR THE SIRS EPIDEMIC R. G. DOLGOARSHINNYKH Abstract. We establish law of large numbers for SIRS stochastic epidemic processes: as the population size increases the paths of SIRS epidemic
More informationFunctional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...
Functional Analysis Franck Sueur 2018-2019 Contents 1 Metric spaces 1 1.1 Definitions........................................ 1 1.2 Completeness...................................... 3 1.3 Compactness......................................
More informationAn elementary proof of the weak convergence of empirical processes
An elementary proof of the weak convergence of empirical processes Dragan Radulović Department of Mathematics, Florida Atlantic University Marten Wegkamp Department of Mathematics & Department of Statistical
More informationRandom Bernstein-Markov factors
Random Bernstein-Markov factors Igor Pritsker and Koushik Ramachandran October 20, 208 Abstract For a polynomial P n of degree n, Bernstein s inequality states that P n n P n for all L p norms on the unit
More informationConvergence of generalized entropy minimizers in sequences of convex problems
Proceedings IEEE ISIT 206, Barcelona, Spain, 2609 263 Convergence of generalized entropy minimizers in sequences of convex problems Imre Csiszár A Rényi Institute of Mathematics Hungarian Academy of Sciences
More information9 Brownian Motion: Construction
9 Brownian Motion: Construction 9.1 Definition and Heuristics The central limit theorem states that the standard Gaussian distribution arises as the weak limit of the rescaled partial sums S n / p n of
More informationLARGE DEVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILED DEPENDENT RANDOM VECTORS*
LARGE EVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILE EPENENT RANOM VECTORS* Adam Jakubowski Alexander V. Nagaev Alexander Zaigraev Nicholas Copernicus University Faculty of Mathematics and Computer Science
More informationObserver design for a general class of triangular systems
1st International Symposium on Mathematical Theory of Networks and Systems July 7-11, 014. Observer design for a general class of triangular systems Dimitris Boskos 1 John Tsinias Abstract The paper deals
More informationAsymptotic efficiency of simple decisions for the compound decision problem
Asymptotic efficiency of simple decisions for the compound decision problem Eitan Greenshtein and Ya acov Ritov Department of Statistical Sciences Duke University Durham, NC 27708-0251, USA e-mail: eitan.greenshtein@gmail.com
More informationSTAT 200C: High-dimensional Statistics
STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 57 Table of Contents 1 Sparse linear models Basis Pursuit and restricted null space property Sufficient conditions for RNS 2 / 57
More informationSTAT 992 Paper Review: Sure Independence Screening in Generalized Linear Models with NP-Dimensionality J.Fan and R.Song
STAT 992 Paper Review: Sure Independence Screening in Generalized Linear Models with NP-Dimensionality J.Fan and R.Song Presenter: Jiwei Zhao Department of Statistics University of Wisconsin Madison April
More informationMid Term-1 : Practice problems
Mid Term-1 : Practice problems These problems are meant only to provide practice; they do not necessarily reflect the difficulty level of the problems in the exam. The actual exam problems are likely to
More informationFor iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions.
Large Sample Theory Study approximate behaviour of ˆθ by studying the function U. Notice U is sum of independent random variables. Theorem: If Y 1, Y 2,... are iid with mean µ then Yi n µ Called law of
More informationInstitut für Mathematik
U n i v e r s i t ä t A u g s b u r g Institut für Mathematik Martin Rasmussen, Peter Giesl A Note on Almost Periodic Variational Equations Preprint Nr. 13/28 14. März 28 Institut für Mathematik, Universitätsstraße,
More informationPointwise convergence rates and central limit theorems for kernel density estimators in linear processes
Pointwise convergence rates and central limit theorems for kernel density estimators in linear processes Anton Schick Binghamton University Wolfgang Wefelmeyer Universität zu Köln Abstract Convergence
More informationSpectra of adjacency matrices of random geometric graphs
Spectra of adjacency matrices of random geometric graphs Paul Blackwell, Mark Edmondson-Jones and Jonathan Jordan University of Sheffield 22nd December 2006 Abstract We investigate the spectral properties
More informationAPPROXIMATING CONTINUOUS FUNCTIONS: WEIERSTRASS, BERNSTEIN, AND RUNGE
APPROXIMATING CONTINUOUS FUNCTIONS: WEIERSTRASS, BERNSTEIN, AND RUNGE WILLIE WAI-YEUNG WONG. Introduction This set of notes is meant to describe some aspects of polynomial approximations to continuous
More informationVerifying Regularity Conditions for Logit-Normal GLMM
Verifying Regularity Conditions for Logit-Normal GLMM Yun Ju Sung Charles J. Geyer January 10, 2006 In this note we verify the conditions of the theorems in Sung and Geyer (submitted) for the Logit-Normal
More informationA COUNTEREXAMPLE TO AN ENDPOINT BILINEAR STRICHARTZ INEQUALITY TERENCE TAO. t L x (R R2 ) f L 2 x (R2 )
Electronic Journal of Differential Equations, Vol. 2006(2006), No. 5, pp. 6. ISSN: 072-669. URL: http://ejde.math.txstate.edu or http://ejde.math.unt.edu ftp ejde.math.txstate.edu (login: ftp) A COUNTEREXAMPLE
More informationTopics in Theoretical Computer Science: An Algorithmist's Toolkit Fall 2007
MIT OpenCourseWare http://ocw.mit.edu 18.409 Topics in Theoretical Computer Science: An Algorithmist's Toolkit Fall 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationTheorems. Theorem 1.11: Greatest-Lower-Bound Property. Theorem 1.20: The Archimedean property of. Theorem 1.21: -th Root of Real Numbers
Page 1 Theorems Wednesday, May 9, 2018 12:53 AM Theorem 1.11: Greatest-Lower-Bound Property Suppose is an ordered set with the least-upper-bound property Suppose, and is bounded below be the set of lower
More informationFrom Boltzmann Equations to Gas Dynamics: From DiPerna-Lions to Leray
From Boltzmann Equations to Gas Dynamics: From DiPerna-Lions to Leray C. David Levermore Department of Mathematics and Institute for Physical Science and Technology University of Maryland, College Park
More informationX n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)
14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence
More informationẋ = f(x, y), ẏ = g(x, y), (x, y) D, can only have periodic solutions if (f,g) changes sign in D or if (f,g)=0in D.
4 Periodic Solutions We have shown that in the case of an autonomous equation the periodic solutions correspond with closed orbits in phase-space. Autonomous two-dimensional systems with phase-space R
More informationMaximum Likelihood Estimation
Chapter 7 Maximum Likelihood Estimation 7. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function
More informationA REPRESENTATION FOR THE KANTOROVICH RUBINSTEIN DISTANCE DEFINED BY THE CAMERON MARTIN NORM OF A GAUSSIAN MEASURE ON A BANACH SPACE
Theory of Stochastic Processes Vol. 21 (37), no. 2, 2016, pp. 84 90 G. V. RIABOV A REPRESENTATION FOR THE KANTOROVICH RUBINSTEIN DISTANCE DEFINED BY THE CAMERON MARTIN NORM OF A GAUSSIAN MEASURE ON A BANACH
More informationBERNOULLI ACTIONS AND INFINITE ENTROPY
BERNOULLI ACTIONS AND INFINITE ENTROPY DAVID KERR AND HANFENG LI Abstract. We show that, for countable sofic groups, a Bernoulli action with infinite entropy base has infinite entropy with respect to every
More informationSHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction
SHARP BOUNDARY TRACE INEQUALITIES GILES AUCHMUTY Abstract. This paper describes sharp inequalities for the trace of Sobolev functions on the boundary of a bounded region R N. The inequalities bound (semi-)norms
More informationNotes on Gaussian processes and majorizing measures
Notes on Gaussian processes and majorizing measures James R. Lee 1 Gaussian processes Consider a Gaussian process {X t } for some index set T. This is a collection of jointly Gaussian random variables,
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results
Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics
More informationNonparametric regression with martingale increment errors
S. Gaïffas (LSTA - Paris 6) joint work with S. Delattre (LPMA - Paris 7) work in progress Motivations Some facts: Theoretical study of statistical algorithms requires stationary and ergodicity. Concentration
More informationNOWHERE LOCALLY UNIFORMLY CONTINUOUS FUNCTIONS
NOWHERE LOCALLY UNIFORMLY CONTINUOUS FUNCTIONS K. Jarosz Southern Illinois University at Edwardsville, IL 606, and Bowling Green State University, OH 43403 kjarosz@siue.edu September, 995 Abstract. Suppose
More informationBrownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539
Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory
More informationMAJORIZING MEASURES WITHOUT MEASURES. By Michel Talagrand URA 754 AU CNRS
The Annals of Probability 2001, Vol. 29, No. 1, 411 417 MAJORIZING MEASURES WITHOUT MEASURES By Michel Talagrand URA 754 AU CNRS We give a reformulation of majorizing measures that does not involve measures,
More informationLecture 35: December The fundamental statistical distances
36-705: Intermediate Statistics Fall 207 Lecturer: Siva Balakrishnan Lecture 35: December 4 Today we will discuss distances and metrics between distributions that are useful in statistics. I will be lose
More informationEconometrica Supplementary Material
Econometrica Supplementary Material SUPPLEMENT TO USING INSTRUMENTAL VARIABLES FOR INFERENCE ABOUT POLICY RELEVANT TREATMENT PARAMETERS Econometrica, Vol. 86, No. 5, September 2018, 1589 1619 MAGNE MOGSTAD
More informationA CRITERION FOR THE EXISTENCE OF TRANSVERSALS OF SET SYSTEMS
A CRITERION FOR THE EXISTENCE OF TRANSVERSALS OF SET SYSTEMS JERZY WOJCIECHOWSKI Abstract. Nash-Williams [6] formulated a condition that is necessary and sufficient for a countable family A = (A i ) i
More informationf(x)dx = lim f n (x)dx.
Chapter 3 Lebesgue Integration 3.1 Introduction The concept of integration as a technique that both acts as an inverse to the operation of differentiation and also computes areas under curves goes back
More informationOn the Existence of Almost Periodic Solutions of a Nonlinear Volterra Difference Equation
International Journal of Difference Equations (IJDE). ISSN 0973-6069 Volume 2 Number 2 (2007), pp. 187 196 Research India Publications http://www.ripublication.com/ijde.htm On the Existence of Almost Periodic
More informationTheoretical Statistics. Lecture 14.
Theoretical Statistics. Lecture 14. Peter Bartlett Metric entropy. 1. Chaining: Dudley s entropy integral 1 Recall: Sub-Gaussian processes Definition: A stochastic process θ X θ with indexing set T is
More information