Testing near or at the Boundary of the Parameter Space (Job Market Paper)


Testing near or at the Boundary of the Parameter Space (Job Market Paper)

Philipp Ketz
Brown University
November 7, 2014

Statistical inference about a scalar parameter is often performed using the two-sided t-test. In extremum problems, where the estimator satisfies the restrictions on the parameter space - such as the nonnegativity of a variance parameter - the test suffers from size distortions when the true parameter vector is near or at the boundary of the parameter space. Nevertheless, the two-sided t-test continues to be used when estimates are found to be close to the boundary. This can be attributed to a lack of inference procedures that appropriately account for boundary effects on the asymptotic distribution of the estimator. To address this issue, we propose an estimator that is asymptotically normally distributed, even when the true parameter vector is near or at the boundary, and the objective function is not defined outside the parameter space. The novel estimator allows the implementation of several existing testing procedures and a new test based on the Conditional Likelihood Ratio statistic (CLR). Compared to the existing procedures, the new test is easy to implement and has good power properties. Moreover, it offers power advantages over the two-sided t-test, when the latter controls size. We also show the test to be admissible when inference is performed with respect to a scalar parameter. We apply the test to the random coefficients logit model using data on the European car market and find more evidence of heterogeneity in consumer preferences than suggested by the two-sided t-test.

Keywords: Boundary, asymptotic normality, testing, admissibility, random coefficients.

I am very grateful to my advisor Frank Kleibergen as well as Adam McCloskey, Blaise Melly, and Eric Renault for their guidance, their constant support and encouragement, and many helpful comments and discussions. I also thank Andrew Elzinga, Joachim Freyberger, Bruno Gasperini, Hyojin Han, Pepe Montiel Olea, Daniela Scida, and seminar participants at Brown University for helpful comments and suggestions. Department of Economics, Brown University, 64 Waterman Street, Providence, RI, philipp_ketz@brown.edu.

1 Introduction

Statistical inference about a scalar parameter is often performed using the two-sided t-test, which relies on the asymptotic normality of the underlying estimator. In extremum problems, however, the commonly employed estimator is not asymptotically normally distributed when the true parameter vector is near or at the boundary of the parameter space (Andrews, 1999). In that case, the two-sided t-test suffers from size distortions. While the test can suffer from overrejection (under some conditions), underrejection constitutes the more prevalent problem. Due to the associated loss in power, a researcher is, for example, more likely to falsely conclude that a parameter is equal to zero, which has significant consequences in the context of model selection exercises.

The random coefficients logit model (Berry, Levinsohn, and Pakes, 1995), where variance parameters are restricted to be nonnegative, constitutes a prominent example of models in which estimates are frequently found to be close to the boundary, indicating that the assumption that the true parameter vector lies in the interior is violated.1 Nevertheless, the two-sided t-test continues to be used in practice (e.g., Nevo, 2001; Goeree, 2008). This can be attributed to a lack of inference procedures that appropriately account for boundary effects on the asymptotic distribution of the estimator, which result from the restrictions on the parameter space to which the estimator is confined.2

In this paper, we address this gap in the literature by introducing a modified extremum estimator that is asymptotically normally distributed even when the true parameter vector is near or at the boundary of the parameter space. The novel estimator does not require the original extremum objective function to be defined outside the parameter space and is, therefore, available in a wide range of nonlinear models. The estimator is given by the unconstrained minimizer of a quadratic approximation to the original objective function. It is easy to implement and obtained by a single step of the Newton-Raphson algorithm starting at the constrained extremum estimator.

Although the two-sided t-test based on the novel estimator constitutes a valid and size-correct test, the nonstandard nature of the testing problem can be utilized to construct tests that are more powerful. The testing problem is nonstandard in that it is characterized by the presence of nuisance parameters that satisfy certain restrictions. We introduce two tests that take these restrictions into account. The tests are based on the generalized Likelihood Ratio statistic (glr) and the Conditional Likelihood Ratio statistic (CLR), which previously have

1 Other models with similar restrictions on the parameter space are given by random coefficients regression models (Andrews, 1999), censored panel data models with slope heterogeneity (Abrevaya and Shen, 2014), multinomial discrete response models with random coefficients (Hausman and Wise, 1978) or random effects (McFadden, 1989).
2 Andrews and Guggenberger (2010) show that bootstrap procedures are also inconsistent.

not been considered for the testing problem at hand. The tests are easy to implement and, in many cases, lead to tighter confidence intervals than the benchmark, i.e., the two-sided t-test based on the constrained extremum estimator.

The null distribution of the glr depends on the true value of the nuisance parameter. In order to construct a valid test based on the glr we consider the least favorable configuration approach. When the dimension of the nuisance parameter is small, the resulting test displays good power properties. In particular, it offers power advantages over the benchmark for a wide range of alternatives. However, the conservativeness of the test, resulting from the use of the least favorable configuration, leads to low power in large parts of the parameter space when the number of nuisance parameters is large.3

Conditional on a sufficient statistic for the nuisance parameter, the (conditional) null distribution of the CLR is independent of the true value of that parameter. As a result, the test based on the CLR displays good power properties regardless of the dimension of the nuisance parameter. Furthermore, we show the test to be admissible when inference is performed with respect to a scalar parameter. Another appealing feature is that the test reduces to the two-sided t-test when the true parameter vector lies in the interior of the parameter space. Consequently, confidence intervals obtained by inverting the test can be interpreted as a natural extension of standard confidence intervals to the nonstandard setting where the true parameter vector might be near or at the boundary.

Based on our novel estimator, other tests recently proposed in the literature become available. Elliott, Müller, and Watson (2013) propose tests that maximize weighted average power (WAP), while Montiel Olea (2013) proposes tests that maximize WAP subject to a similarity constraint. WAP maximizing tests are attractive when the researcher has a particular weight function in mind, where the weight function specifies the alternatives towards which the test directs power. The tests proposed in this paper are ad hoc in that they are not designed to maximize WAP (or minimize a general loss function). However, from the results in Montiel Olea (2013) it follows that there exist weights with respect to which the test based on the CLR maximizes WAP subject to a similarity constraint.

For the testing problem with a scalar nuisance parameter, we conduct a power comparison of the tests based on the glr, the CLR, and the two tests proposed by Elliott, Müller, and Watson (2013) and Montiel Olea (2013) for certain choices of the weight function.4 All tests are found to have comparable power properties, displaying advantages over some parts of the parameter

3 Alternative methods that do not rely on the use of the least favorable configuration can also be implemented, see e.g., McCloskey (2012).
4 The weight functions are taken from the two papers. Elliott, Müller, and Watson (2013) use uniform weights that assign equal weights to all alternatives. Montiel Olea (2013) uses weights that yield a simple closed-form solution for his test statistic.

space, while lacking in power over others. Regardless of whether the researcher has a particular weight function in mind, the choice of a test is also guided by computational feasibility. The tests proposed by Elliott, Müller, and Watson (2013) and Montiel Olea (2013) become computationally expensive as the number of nuisance parameters increases.5 The computational costs associated with the tests introduced in this paper, on the other hand, are invariant to the number of nuisance parameters, making them more attractive in practice.

In order to illustrate their usefulness, we apply the proposed testing procedures to the random coefficients logit model (Berry, Levinsohn, and Pakes, 1995), which is widely used in the industrial organization and marketing literatures to model demand for differentiated products. The random coefficients in this model are typically parameterized by a vector of means and variances and allow for heterogeneity in consumer preferences with respect to different product characteristics. In many applications, it is a priori unknown which of the product characteristics interact with a random coefficient, i.e., which variance parameters are non-zero. As a result, the empirical analysis often starts with a baseline model that allows for a random coefficient on all product characteristics. Then, a powerful test such as the one based on the CLR can prove useful in determining a good model specification. In an application of the random coefficients model to the European car market using data from Reynaert and Verboven (2014), we find evidence of consumer heterogeneity with respect to price (divided by income), horse power (divided by weight), and height of the car when using the CLR.6 The two-sided t-test, on the other hand, which represents common practice, only suggests the presence of consumer heterogeneity with respect to horse power.

The plan of this paper is as follows. Section 2 introduces the testing problem, the two test statistics, the glr and the CLR, and the tests based on them. This section also shows that the test based on the CLR is admissible for testing hypotheses about a scalar parameter. Section 3 provides asymptotic theory for general extremum estimators and introduces a novel estimator which is shown to be asymptotically normal. This section also contains details on how to implement the proposed testing procedures when using an asymptotically normal estimator. Section 4 contains a power comparison of our testing procedures, along with other tests recently proposed in the literature, in the context of several leading examples. Section 5 contains an application to the random coefficients logit model. In this section, we perform a small Monte Carlo study to illustrate the finite sample behavior of our testing procedures

5 For certain choices of the weight function, the test statistic proposed by Montiel Olea (2013) can be obtained in closed form. Then the computational cost is equivalent to that of the test based on the CLR and the glr. However, for a general weight function, the test statistic needs to be evaluated numerically, leading to an increase in computational cost.
6 The glr is not the preferred choice in this setting due to the large number of nuisance parameters.

and present the empirical findings. Section 6 concludes.

Throughout this paper, (a, b) denotes the vector (a', b')', where a and b are column vectors. R, R_+, and R_++ denote (-∞, ∞), [0, ∞), and (0, ∞), respectively. Furthermore, if A denotes an interval in R, then A^N denotes A × ... × A with N ∈ N copies.

2 Testing

In this section, we first describe the testing problem and motivate its relevance. Then, we introduce two testing procedures that have previously not been considered for the testing problem at hand. They are based on the generalized Likelihood Ratio statistic (glr) and the Conditional Likelihood Ratio statistic (CLR). The section concludes by showing that the test based on the CLR is admissible for the testing problem at hand.

2.1 Testing problem

This paper studies the problem of testing hypotheses about a scalar parameter, β. We consider the case where β is an element of a (K + 1) dimensional unknown parameter vector, θ = (β, δ), that enters a general extremum objective function, Q_n(θ), whose dependence on the data {W_i : i ≤ n} is suppressed for notational convenience. Section 3 contains detailed information on what kind of objective functions are permitted in our framework. The K dimensional vector δ is not specified under the null hypothesis and, therefore, constitutes a nuisance parameter in the testing problem.

The parameter space for θ is assumed to be restricted, and we are interested in testing hypotheses about β when the true parameter vector is near or at the boundary of the parameter space. This is modeled by means of drifting sequences of true parameters, θ_n. For the purpose of illustration, assume θ ∈ R_+^{K+1}. Then, the drifting sequences of interest would be such that √n θ_n → µ, where µ denotes a localization parameter with µ < ∞. Such drifting sequences are essential in order to derive asymptotic theory that provides good approximations to the finite sample behavior of an estimator or a test statistic, when the true parameter vector is close to the boundary relative to the sample size. Throughout this paper, we use "near" and "close" interchangeably. Furthermore, the case where the true parameter vector is at the boundary, µ_k = 0 for some k ∈ {1, ..., K + 1}, constitutes a special case of the true parameter vector being near the boundary and, hereafter, is sometimes referred to implicitly.

Motivated by the recent literature, we introduce the hypothesis testing problem in the limit experiment, or limiting problem (see e.g., Van der Vaart, 2000; Elliott, Müller, and Watson, 2013; Müller, 2011). In particular, we assume that we observe a draw from a random

variable, Y, whose distribution is fully determined by a (K + 1) dimensional unknown parameter vector µ. This µ corresponds to the localization parameter of the original testing problem and is partitioned as µ = (b, d), where b is scalar and d is a K dimensional vector. We are interested in testing hypotheses about b, while treating d as a nuisance parameter. Similar testing problems are also considered in Elliott, Müller, and Watson (2013) and Montiel Olea (2013). Our hypothesis testing problem of interest is

H_0 : b = 0, b ∈ B, d ∈ D   vs.   H_1 : b ≠ 0, b ∈ B, d ∈ D,   (1)

where B equals (-∞, 0], [0, ∞), or (-∞, ∞), and D is a Cartesian product of intervals that equal (-∞, 0] or [0, ∞). Let M = B × D denote the parameter space for µ. Although the shape of the parameter space is restrictive, it arises in many models of interest. Random coefficient models, where variance parameters are restricted to be nonnegative, constitute a leading example. Another example is given by (semi-)parametric regression models, where some coefficients are known to satisfy sign restrictions. The motivation for why D does not contain intervals of the form (-∞, ∞) is given by an invariance argument. In Section 3, we allow for an additional nuisance parameter in θ, whose true value is assumed to lie in the interior of the parameter space and which can be thought of as being partialled out. The end points of the intervals in M are equal to zero, when they are not infinite, without loss of generality. When the original parameter space for (β, δ) is given by a Cartesian product of intervals, M can always be written as above by appropriately recentering (β, δ). Similarly, the focus on two-sided testing problems is without loss of generality.

In this paper, we introduce tests for the hypothesis testing problem given in (1) based on the Gaussian shift experiment, i.e.,

Y = (Y_b, Y_d) ~ N((b, d), Σ),   (2)

where Σ is known and positive definite. Alternatively, we could derive tests based on the limiting problem that is obtained under the sequence of (constrained) extremum estimators commonly employed in practice. The corresponding Y, say Y^c, is distributed as the projection of (2) onto a linear subspace of R^{K+1}, see Section 3.1. However, tests based on (2) are not only easier to derive, but also subsume tests based on Y^c, i.e., any test based on Y^c can also be implemented given (2), while the reverse is not true. Intuitively, a draw from a normal random variable is more informative (about µ) than a draw from a truncated normal. Other tests recently proposed in the literature are also based on the Gaussian shift experiment (see e.g., Elliott, Müller, and Watson, 2013; Montiel Olea, 2013). A main

contribution of this paper is to propose an asymptotically normal estimator, which is available under no additional assumptions beyond those made to derive the asymptotic distribution of a constrained extremum estimator, such that tests based on the Gaussian shift experiment become available in a wide range of nonlinear models.

In standard settings, the support of µ = (b, d) is R^{K+1}, and there are good reasons to ignore Y_d when making inference about b. For example, for the two-sided testing problem, H_0 : b = 0 vs. H_1 : b ≠ 0, the test that rejects when |Y_b|/√Σ_ββ > cv, where cv denotes the critical value of a standard normal distribution, is the uniformly most powerful unbiased test. In the nonstandard setting considered in this paper, Y_d contains information about b that can be exploited when testing (1).

Before introducing our testing procedures, we illustrate how tests developed for (1) based on (2) can be used under drifting sequences of true parameters when an asymptotically normal estimator is available. To that end, we consider a simple example taken from Andrews and Guggenberger (2010).

Example 1: We consider the testing problem where a scalar nuisance parameter is near the boundary. Suppose {X_i ∈ R^2 : i ≤ n} forms a triangular array of i.i.d. random vectors with X_i = (X_{i,1}, X_{i,2}), θ_n ≡ E(X_i) = (β_n, δ_n), and Σ ≡ Var(X_i), where Σ is positive definite. θ_n is a drifting sequence of true parameters. The parameter space is given by Θ = B × D, where B = (-∞, ∞) and D = [0, ∞), i.e., we know a priori that the mean of X_{i,2} is nonnegative. We assume that the drifting sequence of true parameters satisfies √n β_n → b, where b < ∞, and √n δ_n → d < ∞. We are interested in testing H_0 : b = 0 while leaving d unspecified under H_0.7 The sample averages, X̄_1 and X̄_2, are estimators of β_n and δ_n, respectively, and by a central limit theorem (CLT) for triangular arrays satisfy

√n (X̄_1 - β_n, X̄_2 - δ_n) →_d N(0, Σ)

or, equivalently,

√n (X̄_1, X̄_2) →_d N((b, d), Σ).

Therefore, under the above drifting sequences the scaled estimator, √n X̄, is asymptotically distributed as Y given in equation (2). Furthermore, the parameter spaces for b and d are

7 Testing H_0 : b = 0 can be understood as testing H_0 : β_n = b/√n at a given sample size, n.

given by B = (-∞, ∞) and D = [0, ∞), respectively.8 As a result, any test derived for the testing problem given in (1) and (2) can be applied here by replacing Y by its sample analogue, √n X̄, and by replacing Σ by a consistent estimator, such as the sample variance of X_i.

2.2 Testing procedures

In the problem without a nuisance parameter, i.e., without Y_d in (2), and B = [0, ∞), Feldman and Cousins (1998) propose the use of the generalized Likelihood Ratio statistic (glr) in order to make inference. To the best of our knowledge, there has been no attempt yet to use the glr to make inference in the general testing problem given in (1) and (2). Let φ(y, µ, Σ) denote the pdf of a (K + 1) dimensional normal random vector with mean, µ, and positive definite variance, Σ. Then, the glr is defined as follows

glr(0, y_b, y_d) = 2 log [ sup_{b∈B, d∈D} φ((y_b, y_d), (b, d), Σ) / sup_{d∈D} φ((y_b, y_d), (0, d), Σ) ]
               = inf_{d∈D} (y_b, y_d - d)' Σ^{-1} (y_b, y_d - d) - inf_{b∈B, d∈D} (y_b - b, y_d - d)' Σ^{-1} (y_b - b, y_d - d).

The distribution of glr(0, Y_b, Y_d) depends on the true value of d, which is not specified under the null. One possibility to construct a level α test based on the glr is by means of the least favorable configuration. The least favorable configuration, say d_LFC, is such that inf_{d∈D} P(glr(0, Y_b, Y_d) < cv_{d_LFC}) = 1 - α, where cv_{d_LFC} denotes the (1 - α) quantile of glr(0, Y_b, Y_{d_LFC}) when the true parameter equals (0, d_LFC). The corresponding test rejects whenever glr(0, y_b, y_d) > cv_{d_LFC}.

Another possible testing procedure is based on the Conditional (generalized) Likelihood Ratio statistic (CLR), which was originally suggested by Moreira (2003) for the linear Instrumental Variables regression model with weak instruments. Variates thereof have also been used in the context of weak identification in the Generalized Method of Moments (GMM) (see e.g., Kleibergen, 2005). The test utilizes the glr without relying on the least favorable configuration. We consider the following transformation of Y

(Y_b, Y_d - Σ_δβ Σ_ββ^{-1} Y_b) ~ N( (b, d - Σ_δβ Σ_ββ^{-1} b), [ Σ_ββ, 0 ; 0, Σ_δδ - Σ_δβ Σ_ββ^{-1} Σ_βδ ] ),

8 This relies on b < ∞ and d < ∞.

where [ Σ_ββ, Σ_βδ ; Σ_δβ, Σ_δδ ] denotes a conformable partition of Σ. Let X ≡ Y_d - Σ_δβ Σ_ββ^{-1} Y_b. Note that the distribution of X depends on d, while that of Y_b does not. Since X and Y_b are independent, X is a sufficient statistic for d. In fact, X is a complete sufficient statistic for d, where completeness follows from Theorem 4.3.1 in Lehmann and Romano (2005). Now define

CLR(0, y_b, x) = inf_{d∈D} (y_b, x + Σ_δβ Σ_ββ^{-1} y_b - d)' Σ^{-1} (y_b, x + Σ_δβ Σ_ββ^{-1} y_b - d)
             - inf_{b∈B, d∈D} (y_b - b, x + Σ_δβ Σ_ββ^{-1} y_b - d)' Σ^{-1} (y_b - b, x + Σ_δβ Σ_ββ^{-1} y_b - d).   (3)

The CLR test, ϕ_CLR(y_b, x), is given by

ϕ_CLR(y_b, x) = 1 if CLR(0, y_b, x) > cv(α, x), and 0 otherwise,   (4)

where cv(α, x) denotes the conditional (1 - α) quantile of the distribution of CLR(0, Y_b, x), where Y_b ~ N(0, Σ_ββ). Unlike the (unconditional) distribution of the glr statistic, the (conditional) distribution of the CLR statistic does not depend on the true value of d due to the conditioning on X = x, which is a sufficient statistic for d. The critical values, cv_{d_LFC} and cv(α, x), although not available in closed form, can easily be obtained by means of simulation.

Another test statistic of interest is the two-sided t-statistic based on the constrained maximum likelihood estimator for b, t_ML hereafter, where the constraint is given by µ ∈ M. The power function of the test that compares the t_ML to the standard critical value of a normal random variable matches, under certain conditions, the local asymptotic power function of the two-sided t-test based on a constrained extremum estimator when the true parameter vector is near or at the boundary. The latter corresponds to common practice where potential boundary effects on the distributions of the underlying estimator are ignored. Therefore, the performance of the test based on t_ML is of particular interest and analyzed, in detail, below. In what follows, glr, CLR, and t_ML can refer to the statistic or the respective test based on that statistic with the understanding that the glr uses the least favorable configuration approach, while the t_ML uses the critical value of a standard normal random variable. Before

turning to the asymptotic theory, we show that the CLR is admissible in the above testing problem.

2.3 Admissibility of the CLR

In order to show that the CLR is admissible in the class of all tests for the testing problem at hand, we first introduce some additional notation. Define M_0 and M_1 such that the testing problem given in equation (1) can be written as H_0 : µ ∈ M_0 vs. H_1 : µ ∈ M_1, where by definition M_1 = M\M_0. Let C denote the class of all tests. A test is defined as a measurable function ϕ : Y → [0, 1].9 Here, Y = R^{K+1} denotes the support of Y. Similarly, let X = R^K denote the support of X. ϕ(y) is to be understood as the probability of rejecting the null hypothesis given a realization of the data, y. The type I error of ϕ for µ ∈ M_0 is defined as

E_µ[ϕ(Y)] = ∫_Y ϕ(y) f(y|µ) dy,

where f(y|µ) denotes the pdf of Y, which in the model on hand is the pdf of a multivariate normal. Since Σ is known, we suppress the dependence of f(y|µ) on Σ. The type I error is only defined for parameter values in the null set, M_0, and signifies the probability with which the null hypothesis is falsely rejected. The type II error of ϕ for µ ∈ M_1 is defined as

1 - E_µ[ϕ(Y)] = 1 - ∫_Y ϕ(y) f(y|µ) dy.

The type II error is only defined for parameter values in the alternative set, M_1, and signifies the probability of falsely failing to reject the null hypothesis. Define the risk function associated with ϕ as

R_ϕ(µ) = E_µ[ϕ(Y)] if µ ∈ M_0, and R_ϕ(µ) = 1 - E_µ[ϕ(Y)] if µ ∈ M_1.

The risk function allows the comparison of tests. In particular, a test ϕ' is said to dominate ϕ, if R_ϕ'(µ) ≤ R_ϕ(µ) for all µ ∈ M with strict inequality, R_ϕ'(µ) < R_ϕ(µ), for some µ ∈ M. A test ϕ is called admissible in a class of tests, C' ⊆ C, if there exists no test ϕ' ∈ C' such that ϕ' dominates ϕ. Admissibility is a minimal optimality requirement for a test within a certain class of tests. If a test is admissible, it cannot uniformly be improved

9 The CLR above is defined as a function of Y_b and X, but since Y is a one-to-one function of Y_b and X, the CLR could equivalently be expressed as a function of Y.

upon, i.e., the probability of making an incorrect decision cannot be lowered somewhere in the parameter space without an increase in that probability elsewhere.

Before stating the admissibility result for the CLR, we introduce the concept of similarity, which is used in its derivation. A test, ϕ, is said to be conditionally similar if E_µ[ϕ(Y)|X = x] = α for all x ∈ X and µ ∈ M_0, and similar if E_µ[ϕ(Y)] = α for all µ ∈ M_0. The CLR is by construction conditionally similar. It follows, by the law of iterated expectations, that it is also (unconditionally) similar. The following Theorem asserts that the CLR introduced in Section 2.2 is admissible in the class of all tests pertaining to the testing problem given in (1) and (2).

Theorem 1. The ϕ_CLR defined in (4) is admissible in the class of all tests, C, pertaining to the testing problem given in (1) and (2).

The proof of Theorem 1 is given in Appendix A.1. It consists of two parts. First, it is shown that any similar test with convex acceptance sections is admissible. Second, it is shown that the CLR has convex acceptance sections. A test is said to have convex acceptance sections if for any x ∈ X the acceptance region of the test as a function of y_b is closed and convex. The acceptance region of a test is the part of the sample space for which the test fails to reject the null hypothesis. The first part of the proof follows from Theorem 3.1 in Matthes and Truax (1967).10 For the problem at hand, the Theorem asserts that the class of similar tests with convex acceptance sections is complete. A class of tests, C' ⊆ C, is complete, if for any ϕ ∉ C', there exists a ϕ' ∈ C' such that ϕ' dominates ϕ. Admissibility of a similar test with convex acceptance sections can then be derived when Y_b is scalar. The result relies on X being a complete sufficient statistic for d. The second part of the proof follows from the definition of the CLR statistic given in equation (3).

The testing problem at hand satisfies assumptions F1-F2 in Montiel Olea (2013). Therefore, his Theorem 1 applies, which states that any admissible and similar test is an extended Efficient Conditionally Similar (ECS) test. The appeal of an extended ECS test is that there exist weights (with full support on M_1) with respect to which the test is arbitrarily close to the weighted average power (WAP) maximizer subject to a similarity constraint. Put differently, there exist weights such that the CLR is the essential WAP maximizing test (with respect to those weights) subject to a similarity constraint. A priori, it is not clear whether similarity implies good power properties. For example, in moment inequality models similar tests have been shown to have poor power (Andrews, 2012). The power analysis in Section 4 shows that the CLR has good power properties for the testing problem at hand.

10 The definition of convex acceptance sections given in equation (3.1) in Matthes and Truax (1967) is slightly different, since it allows for randomized tests. Here, we can restrict ourselves to non-randomized tests, since Y is a continuous random variable.
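Both the glr and the CLR of Section 2.2 are simple quadratic-form calculations, and their critical values can be simulated as noted above. The following sketch is my own illustration in Python, not code from the paper: it assumes B = (-∞, ∞) and D = [0, ∞)^K as in Example 1, uses a generic bound-constrained optimizer for the infima in the glr and CLR definitions, and approximates the least favorable configuration by a coarse grid over d. In an application, y and Σ would be the plug-ins √n X̄ and the sample variance of X_i discussed in Example 1.

```python
import numpy as np
from scipy.optimize import minimize

def _clip_to_box(y, bounds):
    """Project y coordinate-wise into the box given by 'bounds' (None = unbounded)."""
    x = np.array(y, dtype=float)
    for j, (lo, hi) in enumerate(bounds):
        if lo is not None:
            x[j] = max(x[j], lo)
        if hi is not None:
            x[j] = min(x[j], hi)
    return x

def quad_inf(y, Sigma_inv, bounds):
    """inf over mu in the box 'bounds' of (y - mu)' Sigma_inv (y - mu)."""
    obj = lambda mu: float((y - mu) @ Sigma_inv @ (y - mu))
    res = minimize(obj, _clip_to_box(y, bounds), bounds=bounds, method="L-BFGS-B")
    return res.fun

def glr_stat(y, Sigma):
    """glr(0, y_b, y_d) for B = (-inf, inf) and D = [0, inf)^K."""
    K = len(y) - 1
    Si = np.linalg.inv(Sigma)
    null_bounds = [(0.0, 0.0)] + [(0.0, None)] * K    # b = 0, d in D
    full_bounds = [(None, None)] + [(0.0, None)] * K  # b in B, d in D
    return quad_inf(y, Si, null_bounds) - quad_inf(y, Si, full_bounds)

def clr_stat(y_b, x, Sigma):
    """CLR(0, y_b, x): the glr evaluated at y_d = x + Sigma_db Sigma_bb^{-1} y_b, cf. (3)."""
    y_d = x + Sigma[1:, 0] / Sigma[0, 0] * y_b
    return glr_stat(np.concatenate(([y_b], y_d)), Sigma)

def clr_cv(x, Sigma, alpha=0.05, draws=2000, seed=0):
    """Simulated conditional (1 - alpha) quantile of CLR(0, Y_b, x) with Y_b ~ N(0, Sigma_bb)."""
    rng = np.random.default_rng(seed)
    yb = rng.normal(0.0, np.sqrt(Sigma[0, 0]), size=draws)
    return np.quantile([clr_stat(v, x, Sigma) for v in yb], 1 - alpha)

def glr_lfc_cv(Sigma, K, alpha=0.05, d_grid=(0.0, 0.5, 1.0, 2.0, 4.0), draws=2000, seed=0):
    """Crude least-favorable-configuration critical value: the largest simulated
    (1 - alpha) quantile of glr(0, Y_b, Y_d) over a grid of null values d."""
    rng = np.random.default_rng(seed)
    cv = 0.0
    for v in d_grid:
        mu = np.concatenate(([0.0], np.full(K, v)))
        ys = rng.multivariate_normal(mu, Sigma, size=draws)
        cv = max(cv, np.quantile([glr_stat(y, Sigma) for y in ys], 1 - alpha))
    return cv

# Example 1 style plug-in: y = sqrt(n) * sample mean, Sigma = sample variance of X_i.
rng = np.random.default_rng(1)
X = rng.normal([0.0, 0.1], 1.0, size=(400, 2))            # hypothetical data, K = 1
n = X.shape[0]
y = np.sqrt(n) * X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)
x_stat = y[1:] - Sigma[1:, 0] / Sigma[0, 0] * y[0]        # X = Y_d - Sigma_db Sigma_bb^{-1} Y_b
reject_clr = clr_stat(y[0], x_stat, Sigma) > clr_cv(x_stat, Sigma)
reject_glr = glr_stat(y, Sigma) > glr_lfc_cv(Sigma, K=1)
```

With a scalar nuisance parameter the two infima also admit simple closed forms, so the numerical optimizer is only a convenience; it is used here to keep the sketch valid for any K.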

As mentioned above, the proof of Theorem 1 depends crucially on Y_b being scalar. For testing problems where Y_b is vector-valued, the CLR can be shown to be admissible in the conditional problem, but it appears to be an open question whether it is admissible in the unconditional problem.

3 Asymptotic Theory

In this section, we introduce a general class of extremum problems. In Section 3.1, we show that constrained extremum estimators, which by construction satisfy the restrictions on the parameter space, are not asymptotically normally distributed when the true parameter vector is near or at the boundary. Consequently, the glr and the CLR, as well as other tests defined in the Gaussian shift experiment, cannot be implemented based on such estimators. Unconstrained estimators are often unavailable, because the objective function is not defined outside the parameter space, e.g., a likelihood function may not be defined for negative values of a variance parameter. In Section 3.2, we propose a modified extremum estimator that is asymptotically normally distributed and that does not require the objective function to be defined outside the parameter space. This novel estimator considerably broadens the applicability of tests defined in the Gaussian shift experiment. Details on how to implement such tests based on an asymptotically normal estimator are provided in Section 3.3.

Throughout this section, we borrow notation from Andrews and Cheng (2012a) (AC hereafter) and Andrews and Cheng (2012b) (AC2 hereafter). The criterion function of the extremum problem is denoted Q_n(θ). The class of extremum problems is large and includes (quasi) Maximum Likelihood (ML), Generalized Method of Moments (GMM) and Minimum Distance (MD) problems among others.

The constrained estimator θ̂_n is defined as the approximate minimizer of Q_n(θ) over Θ, i.e.,

Q_n(θ̂_n) = inf_{θ∈Θ} Q_n(θ) + o_p(1/n),   (5)

where Θ denotes the true parameter space. In AC, Θ denotes the optimization parameter space, and it is assumed that the true parameter space is a strict subset of the optimization parameter space. This is done to assume away boundary effects on the asymptotic distribution of the estimator. Here, we make the assumption that the true parameter space and the optimization parameter space are identical, since we are interested in the behavior of the estimator near the boundary.

We assume that θ can be partitioned as follows: θ = (β, δ, ξ), where β denotes the scalar parameter of interest, δ denotes a K dimensional nuisance parameter, and ξ denotes

an additional L dimensional nuisance parameter. Our setup differs from that of AC as none of the parameters determine the identification strength of any other parameter. In fact, all parameters are assumed to be well identified. The difference between δ and ξ is that δ is modeled as close to the boundary, whereas ξ is modeled as in the interior of the parameter space.

The objective function Q_n(θ) depends on data {W_i : i ≤ n}, which may be i.i.d., independent and nonidentically distributed, or temporally dependent. In most applications the distribution of the data is not fully specified by the vector θ, but it depends on an additional, commonly infinite-dimensional, parameter, φ. For example, in conditional maximum likelihood problems, φ denotes the distribution of the data on which we condition. The parameter γ = (θ, φ) is assumed to fully specify the distribution of the data. The true parameter space is assumed to be of the following form

Γ = {γ = (θ, φ) : θ ∈ Θ, φ ∈ Φ(θ)},   (6)

where the true parameter space for θ, Θ ⊂ R^{K+L+1}, is compact. In particular, we assume that Θ equals a Cartesian product of intervals equal to [-c, 0], [0, c] or [-c, c] for c ∈ R_+, i.e., some of the parameters are bounded below or above by 0, where the normalization to 0 is without loss of generality.12 The boundary we are interested in is at 0, and not at c.13 The form of the parameter space is restrictive, but it is obtained for many models of interest, most notably in the context of random coefficients models, see e.g., Berry, Levinsohn, and Pakes (1995) or Abrevaya and Shen (2014).14 As in AC, we assume that Φ(θ) ⊂ Φ for all θ ∈ Θ for some compact metric space Φ with a metric that induces weak convergence of the bivariate distribution (W_i, W_{i+m}) for all i, m ≥ 1.

When the true parameter vector lies in the interior of the parameter space, standard asymptotic theory provides good approximations to the finite sample behavior of the estimator for large enough sample sizes. But, at any given sample size, the true parameter vector might be too close to the boundary for standard asymptotic theory to provide good approximations. Intuitively, standard asymptotic theory provides poor approximations if the estimates are within a few standard errors from the boundary. In that case, modeling the true parameter vector as close to the boundary relative to the sample size provides better finite sample approximations. This is achieved by means of drifting sequences of true

12 The use of c as common endpoint for all intervals is merely for notational convenience. The endpoints are free to vary between all K + L + 1 sets in Θ.
13 In fact, here we implicitly assume that the true parameter space is a strict subset of the optimization parameter space, since we do not allow the true parameter to be equal to c, but crucially we do not assume that for the boundary at 0.
14 With the exception of random coefficients models which allow the random coefficients to be correlated, see e.g., Andrews (2001).

parameters, which approach the boundary at a rate inversely related to the sample size, n, where the rate is chosen such that the distance to the boundary shows up in the asymptotic distribution. A drifting sequence of true parameters is denoted γ_n = (θ_n, φ_n). The set of all such drifting sequences is given by

Γ(γ^0) = {{γ_n ∈ Γ : n ≥ 1} : γ_n → γ^0 ∈ Γ}.

The drifting sequences of primary interest are given by

Γ(γ^0, b, d) = {{γ_n} ∈ Γ(γ^0) : √n β_n → b ∈ B and √n δ_n → d ∈ D},   (7)

where B equals (-∞, 0], [0, ∞), or (-∞, ∞) when the first coordinate of Θ equals [-c, 0], [0, c], or [-c, c], and D equals the product space of sets equaling (-∞, 0] or [0, ∞) in accordance with the coordinates of Θ, which equal [-c, 0] or [0, c]. Thus, the set of drifting sequences given in (7) implicitly imposes b < ∞ and d < ∞. Throughout this paper, we use the terminology "under {γ_n} ∈ Γ(γ^0)" to mean "when the true parameters are γ_n ∈ Γ(γ^0) for any γ^0 ∈ Γ" and "under {γ_n} ∈ Γ(γ^0, b, d)" to mean "when the true parameters are γ_n ∈ Γ(γ^0, b, d) for any γ^0 ∈ Γ with β^0 = 0, δ^0 = 0, b ∈ R, and d ∈ R^K".

In order to help fix ideas, we introduce a running example:

Example 2: Consider the following random coefficients model

y_i = x_i η_i + u_i,

where for expository purposes x_i is assumed to be scalar. (x_i, y_i) is assumed to be i.i.d.. We assume further that η_i ~ N(µ_η, σ²_η) and u_i ~ N(0, σ²_u) such that η_i ⊥ u_i. Then, the model can be written as

y_i = x_i µ_η + u_i + x_i v_i σ_η = x_i µ_η + ε_i,

where v_i ~ N(0, 1) and ε_i = u_i + x_i v_i σ_η. Note that ε_i | x_i ~ N(0, σ²_u + x²_i σ²_η). The conditional individual log likelihood function is given by

l(µ_η, σ²_u, σ²_η | y_i, x_i) = -(1/2) log(2π) - (1/2) log(σ²_u + x²_i σ²_η) - (y_i - x_i µ_η)² / (2(σ²_u + x²_i σ²_η)).

The (scaled) conditional log likelihood function, which for notational consistency is denoted

Q_n(θ), is given by

Q_n(θ) = -(1/n) Σ_{i=1}^n l(µ_η, σ²_u, σ²_η | y_i, x_i).

We can change the order of the elements in θ to conform with the above notation according to which parameter we are interested in. The parameter space for µ_η, σ²_η, and σ²_u is given by [-c, c], [0, c], and [a, c], respectively, where 0 < a < c. We assume that σ_u^{2,0} > a, i.e., only the variance parameter σ²_η may be at the boundary.15 A reparameterization from σ²_u to σ²_u - a results in a parameter space of the form [-c, c], which conforms with the above definitions. There are three possible orderings of the elements in θ, depending on which of the three parameters is the parameter of interest: 1) β = µ_η, δ = σ²_η, and ξ = σ²_u - a, 2) β = σ²_η, δ empty, and ξ = (µ_η, σ²_u - a), and 3) β = σ²_u - a, δ = σ²_η, and ξ = µ_η. The parameter space for φ is given by

Φ = {φ : E_φ|x_i|^{8+ɛ} ≤ C},   (8)

where ɛ ∈ R_++ and C ∈ R_++. Note that here, Φ does not depend on θ.

Next, we introduce the assumptions underlying the asymptotic distribution theory derived in this paper. The assumptions are stronger than those in Andrews (1999), A hereafter, as they do not allow for non-stationary time series or complicated shapes of the parameter space. However, the assumptions allow for drifting sequences of true parameters. We make the high level assumption that θ̂_n is consistent for θ^0.

Assumption 1. θ̂_n = θ^0 + o_p(1) for all γ^0 ∈ Γ.

Note that Assumption 1 implies that θ̂_n = θ_n + o_p(1). This follows trivially as θ_n → θ^0. Thus, θ̂_n can be thought of as a consistent estimator for the drifting sequence of true parameters, θ_n. A sufficient condition for Assumption 1 is given by the following.

Assumption 1*. (a) Under {γ_n} ∈ Γ(γ^0), sup_{θ∈Θ} |Q_n(θ) - Q(θ; γ^0)| = o_p(1) for some non-stochastic real-valued function Q(θ; γ^0).
(b) Q(θ; γ^0) is continuous on Θ for all γ^0 ∈ Γ.

15 As above, the superscript 0 denotes the limit of the drifting sequence of true parameters.

(c) Q(θ; γ^0) is uniquely minimized by θ^0 for all γ^0 ∈ Γ.
(d) Θ is compact.

As in A and AC, we consider a quadratic approximation to the objective function. In particular, we consider an approximation around the true value as in A with the difference that here the true value is given by a drifting sequence and, thus, depends on the sample size, n. We do not have to consider the approximation around the point of discontinuity as in AC, since we assume that identification does not depend on the true parameter value. The quadratic approximation is given by

Q_n(θ) = Q_n(θ_n) + (∂/∂θ)Q_n(θ_n)'(θ - θ_n) + (1/2)(θ - θ_n)'(∂²/∂θ∂θ')Q_n(θ_n)(θ - θ_n) + R_n(θ),   (9)

where (∂/∂θ)Q_n(θ_n) and (∂²/∂θ∂θ')Q_n(θ_n) are defined in the following assumption, which assures that the quadratic approximation exists and that the remainder term is asymptotically negligible.

Assumption 2. (a) Q_n(θ) has continuous left/right (l/r) partial derivatives of order two on Θ for all n ≥ 1 with probability 1.
(b) Under {γ_n} ∈ Γ(γ^0), for all constants ɛ_n → 0, sup_{θ∈Θ: ||θ-θ_n||≤ɛ_n} ||(∂²/∂θ∂θ')Q_n(θ) - (∂²/∂θ∂θ')Q_n(θ_n)|| = o_p(1), where (∂/∂θ)Q_n(θ) and (∂²/∂θ∂θ')Q_n(θ) denote the (K + L + 1) vector and (K + L + 1) × (K + L + 1) matrix of l/r partial derivatives of Q_n(θ) of orders one and two, respectively.

In many cases, Assumption 2(b) can be verified using a uniform LLN, see e.g., Andrews (1992).16 The last two assumptions concern the asymptotic behavior of the first and second order partial derivatives of the objective function under drifting sequences of true parameters.

Assumption 3. Under {γ_n} ∈ Γ(γ^0), J_n ≡ (∂²/∂θ∂θ')Q_n(θ_n) →_p J(γ^0), where J(γ^0) is nonsingular and symmetric.

Assumption 4. (i) Under {γ_n} ∈ Γ(γ^0), √n (∂/∂θ)Q_n(θ_n) →_d N(0, V(γ^0)) for some symmetric V(γ^0). (ii) V(γ^0) is positive definite for all γ^0 ∈ Γ.

16 Assumption 2 corresponds to Assumption 2* in A and Assumption Q in AC2. Assumption 2* in A is sufficient for Assumption 2 in A, while Assumption Q in AC2 is sufficient for D1 in AC.
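To make Q_n(θ) in Example 2 and the constrained estimator in (5) concrete, here is a minimal sketch (Python; my own illustration rather than code from the paper) of the scaled negative conditional log likelihood for the ordering θ = (µ_η, σ²_η, σ²_u - a) and of its bound-constrained minimization. The simulated design, the value of a, and the choice of optimizer are placeholder assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def Q_n(theta, y, x, a=0.5):
    """Scaled negative conditional log likelihood of Example 2,
    theta = (mu_eta, sigma2_eta, sigma2_u - a)."""
    mu_eta, s2_eta, s2_u_shift = theta
    s2 = (s2_u_shift + a) + x**2 * s2_eta            # Var(y_i | x_i)
    resid = y - x * mu_eta
    return -np.mean(-0.5 * np.log(2 * np.pi) - 0.5 * np.log(s2) - resid**2 / (2 * s2))

def constrained_estimator(y, x, a=0.5, c=10.0):
    """theta_hat_n: (approximate) minimizer of Q_n over a box-shaped Theta, cf. (5)."""
    # sigma2_eta and sigma2_u - a are kept nonnegative here so that Var(y_i | x_i) > 0.
    bounds = [(-c, c), (0.0, c), (0.0, c)]
    res = minimize(Q_n, np.array([0.0, 0.1, 1.0]), args=(y, x, a),
                   bounds=bounds, method="L-BFGS-B")
    return res.x

# Placeholder simulation of y_i = x_i * eta_i + u_i.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(1.0, 2.0, n)
y = x * rng.normal(1.0, np.sqrt(0.15), n) + rng.normal(0.0, 1.0, n)
theta_hat = constrained_estimator(y, x)              # estimates of (mu_eta, sigma2_eta, sigma2_u - a)
```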

Assumption 3 can often be verified using a uniform LLN for triangular arrays, while Assumption 4 typically follows from a CLT for triangular arrays.17

Example 2 (continued): The verification of Assumptions 1-4 can be found in Appendix A.4.1.

We introduce some additional notation. Let Z_n ≡ -J_n^{-1} √n (∂/∂θ)Q_n(θ_n), such that

Z_n →_d Z(γ^0), where Z(γ^0) ~ N(0, J(γ^0)^{-1} V(γ^0) J(γ^0)^{-1}).   (10)

With J_n and Z_n thus defined, the quadratic approximation to the objective function given in equation (9) can be written as

Q_n(θ) = Q_n(θ_n) - (1/(2n)) Z_n' J_n Z_n + (1/(2n)) q_n(√n(θ - θ_n)) + R_n(θ),   (11)

where

q_n(λ) = (λ - Z_n)' J_n (λ - Z_n).

Under the assumptions made above, the remainder, R_n(θ), is small enough such that the asymptotic distribution of the centered and scaled minimizer of Q_n(θ), √n(θ̂_n - θ_n), is equivalent to the asymptotic distribution of the centered and scaled minimizer of Q_n(θ) - R_n(θ). The latter function only depends on θ through the function q_n(·), which is quadratic in θ. The distribution of the minimizer of a quadratic function is much easier to characterize than the distribution of the minimizer of Q_n(θ), explaining the use of rewriting (9) as (11). The asymptotic distribution of the minimizer of q_n(λ) is given by the distribution of

λ̂ ≡ arg min_{λ ∈ Λ(b,d)} q(λ),   (12)

where

q(λ) = (λ - Z(γ^0))' J(γ^0)(λ - Z(γ^0))

and where Λ(b, d) denotes the limit of the shifted and scaled parameter space, √n(Θ - θ_n).

17 Assumptions 3 and 4 correspond to Assumptions D2 and D3 in AC, respectively, and to Assumption 3 in A.

Below, we formally show that √n(θ̂_n - θ_n) is asymptotically distributed as λ̂. The distribution of λ̂ crucially depends on Λ(b, d) and is given by the projection of a normal random variable, Z(γ^0), onto Λ(b, d) with respect to the norm q(λ)^{1/2}. In the standard case, where the limit of the drifting sequence of true parameters, θ^0, lies in the interior of the parameter space, we have that √n(Θ - θ_n) → Λ = R^{K+L+1} such that λ̂ = Z(γ^0), recall equation (12). This illustrates that the approach of quadratically approximating the objective function, which conceptually differs slightly from the typical linear approximation to the first order condition, constitutes another way of obtaining the standard asymptotic normality result for √n(θ̂_n - θ_n) when θ^0 lies in the interior of the parameter space.

3.1 Asymptotic distribution of constrained extremum estimator

In this section, we present the asymptotic distribution result for √n(θ̂_n - θ_n) when the true parameter vector is near or at the boundary. Although the result is not new (Andrews, 1999), it is helpful in establishing uniformity results, as it is derived using drifting sequences of true parameters as in Andrews and Cheng (2012a). We also discuss conditions under which the two-sided t-test based on a constrained extremum estimator controls asymptotic size.

For all sequences {γ_n} ∈ Γ(γ^0, b, d), we have that √n(Θ - θ_n) → Λ(b, d), where Λ(b, d) denotes a cone with nonzero vertex. In what follows, we also refer to Λ(b, d) as the local parameter space. We illustrate the shape of Λ(b, d) in the context of our running example.

Example 2 (continued): We consider ordering number 1) β = µ_η, δ = σ²_η, and ξ = σ²_u - a such that B = [-c, c], D = [0, c], and Ξ = [-c, c]. Furthermore, let √n β_n → b, where b < ∞, √n δ_n → d < ∞, and ξ_n → ξ^0, where -c < ξ^0 < c. Then,

√n(Θ - θ_n) → Λ(b, d) = (-∞, ∞) × [-d, ∞) × (-∞, ∞),

which is a cone with vertex (0, -d, 0).18 Here, Λ(b, d) does not depend on b.

More generally, Λ(b, d) is equal to a product set of intervals. In particular, it takes the form (-∞, -b], [-b, ∞) or (-∞, ∞) when B equals [-c, 0], [0, c] or [-c, c], times a K dimensional product set, where the k-th set equals (-∞, -d_k] or [-d_k, ∞) when D_k equals [-c, 0] or [0, c] for k = 1, ..., K, times an L dimensional product set, where each set equals (-∞, ∞).

18 Note that, since Λ(b, d) is not a proper cone, the vertex is not uniquely defined with respect to the first and the third element. We choose 0 without loss of generality.
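For a given J(γ^0), a draw of Z(γ^0), and a local parameter space Λ(b, d) of the product form just described, λ̂ in (12) is a bound-constrained quadratic minimization and can be computed directly. A minimal sketch (Python; my own illustration, with hypothetical numbers for J and d and a generic optimizer in place of the closed-form projection available in simple cases):

```python
import numpy as np
from scipy.optimize import minimize

def lambda_hat(Z, J, bounds):
    """Minimize q(lambda) = (lambda - Z)' J (lambda - Z) over the product set
    Lambda(b, d) described by 'bounds' (None = unbounded endpoint), cf. (12)."""
    q = lambda lam: float((lam - Z) @ J @ (lam - Z))
    start = np.array([min(max(z, lo if lo is not None else -np.inf),
                          hi if hi is not None else np.inf) for z, (lo, hi) in zip(Z, bounds)])
    res = minimize(q, start, bounds=bounds, method="L-BFGS-B")
    return res.x

# Example 2, ordering 1: Lambda(b, d) = (-inf, inf) x [-d, inf) x (-inf, inf).
d = 1.5                                       # hypothetical localization of sigma2_eta
J = np.array([[2.0, 0.0, 0.0],                # hypothetical J(gamma^0), symmetric and nonsingular
              [0.0, 1.0, 0.3],
              [0.0, 0.3, 0.5]])
rng = np.random.default_rng(0)
Z = rng.multivariate_normal(np.zeros(3), np.linalg.inv(J))   # one draw of Z(gamma^0) when V = J
bounds = [(None, None), (-d, None), (None, None)]
lam = lambda_hat(Z, J, bounds)                # one draw from the limit distribution of sqrt(n)(theta_hat_n - theta_n)
```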

19 Proposition. Under Assumptions -4 and under {γ n } Γ(γ,, d), n(ˆθ n θ n ) d ˆλ, where ˆλ is defined in (2) with Λ(, d) defined as in the preceding paragraph. The proof of Proposition follows from the arguments in A. Details can e found in Appendix A.2. The results in Section 6 of A concerning the asymptotic distriution of suvectors of n(ˆθ n θ n ) apply here as well with the slight modification that here Λ(, d) is a cone with non zero vertex. Since the asymptotic distriution of n(ˆθ n θ n ) is not the main interest of this paper, we refrain from reproducing the results here. As mentioned in Section 2.2, the asymptotic distriution of n( ˆβ n β n ) is given y the distriution of the maximum likelihood estimator for in (2) suject to µ M under some condition. This condition is given y V (γ ) = aj(γ ) for some constant a R +. Put differently, the correspondence is otained if the matrix defining the norm q(λ) 2 is proportional to the variance matrix of Z(γ ). To gain some intuition for this, we consider a simple example. Let K = and L = with B = [ c, c] and D = [, c]. Then, letting a denote asymptotically distriuted as, it can e shown that n ˆβn a Y + J J 2 min(, Y d ) (3) where Y and Y d are scalar and J = J(γ ) = [ J J 2 J 2 J 22 ]. Generally, the variance matrix of Y = (Y, Y d ), Σ, is given y J(γ ) V (γ )J(γ ). However, if V (γ ) = cj(γ ), Σ simplifies, and the expression in equation (3) reduces to n ˆβn a Y Σ βδ Σ δδ min(, Y d). It is easy to see that the distriution of Y Σ βδ Σ δδ min(, Y d) is also the distriution of the maximum likelihood estimator for in the corresponding Gaussian shift experiment. Since the t ML controls size, as illustrated in Section 4 elow, it follows that the two-sided t-test ased on ˆβ n controls asymptotic size. V (γ ) = aj(γ ) holds, for instance, when, in the context of GMM, the efficient weighting matrix is employed or when, in the context of ML, the likelihood function is correctly specified. If V (γ ) cj(γ ), the two-sided t-statistic ased on ˆβ n can suffer from overrejection. Note, however, that overrejection does not occur in the part of the sample space where the estimate is not restricted or, put differently, not at the oundary. One way to think of this is that, if the estimate is not found to e restricted, then it could have een otained using 8

an unrestricted estimator, which, if used in the construction of the two-sided t-test, allows for size-correct inference.

Next, we illustrate the asymptotic distribution result given in Proposition 1 in the context of our running example.

Example 2 (continued): Since our model is a well-specified likelihood model, we make use of the fact that the information equality holds. In particular, this implies that J(γ^0) = V(γ^0), such that Z(γ^0) ~ N(0, V(γ^0)^{-1}). We consider the three different orderings of elements in θ separately. For the last two orderings the asymptotic distribution is not normal. In order to investigate how well the asymptotic distribution approximates the finite sample distribution, we provide Monte Carlo results for these two orderings. The asymptotic distribution and the finite sample distribution are both evaluated using Monte Carlo draws. We choose n = 400. x_i is drawn from a U(1, 2) distribution. The parameter values are given by µ_η = 1, σ²_u = 1, and σ²_η is varied as indicated below.

1) β = µ_η, δ = σ²_η, and ξ = σ²_u - a: Since β_n → 0 and δ_n → 0, we have that θ^0 = (0, 0, ξ^0) = (0, 0, σ_u^{2,0} - a). The Fisher Information, V(γ^0), is given by19

E_φ0 [ x_i²/σ_u²            0                    0
       0                    x_i⁴/(2(σ_u²)²)      x_i²/(2(σ_u²)²)
       0                    x_i²/(2(σ_u²)²)      1/(2(σ_u²)²) ],

where E_φ0[·] denotes the expectation with respect to the distribution of x_i under φ^0. Due to the information orthogonality, the asymptotic distribution of √n(β̂_n - β_n) does not depend on δ, and we obtain

√n(β̂_n - β_n) →_d N(0, Σ_ββ),

where Σ_ββ denotes σ_u²/E_φ0[x_i²].

2) β = σ²_η, δ empty, and ξ = (µ_η, σ²_u - a): Since β_n → 0, we have that θ^0 = (0, ξ^0) = (0, µ_η^0, σ_u^{2,0} - a). The Fisher Information, V(γ^0), is given by

E_φ0 [ x_i⁴/(2(σ_u²)²)      0                    x_i²/(2(σ_u²)²)
       0                    x_i²/σ_u²            0
       x_i²/(2(σ_u²)²)      0                    1/(2(σ_u²)²) ].

19 The first and second order partial derivatives of the likelihood function can be found in Appendix A.4.1.
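The entries of V(γ^0) involve only σ_u² and moments of x_i, so plug-in versions are immediate. A small sketch (Python; my own illustration with placeholder inputs) for ordering 2, including the quantity 2(σ_u²)²/E_φ0[x_i⁴] that appears as Σ_ββ in the asymptotic distribution derived next:

```python
import numpy as np

def fisher_info_ordering2(x, s2_u):
    """Plug-in V(gamma^0) for ordering 2, theta = (sigma2_eta, mu_eta, sigma2_u - a),
    with the expectations over x_i replaced by sample averages."""
    Ex2, Ex4 = np.mean(x**2), np.mean(x**4)
    c = 1.0 / (2.0 * s2_u**2)
    return np.array([[c * Ex4,  0.0,         c * Ex2],
                     [0.0,      Ex2 / s2_u,  0.0    ],
                     [c * Ex2,  0.0,         c      ]])

x = np.random.default_rng(0).uniform(1.0, 2.0, 400)   # placeholder draws of the regressor
V = fisher_info_ordering2(x, s2_u=1.0)
Sigma_bb = 2.0 * 1.0**2 / np.mean(x**4)                # Sigma_beta_beta = 2 (sigma_u^2)^2 / E[x_i^4]
```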

Since ξ does not impact the asymptotic distribution of √n(β̂_n - β_n) (see e.g., A), we have that

√n(β̂_n - β_n) →_d max(-b, N(0, Σ_ββ)) or, equivalently, √n β̂_n →_d max(0, N(b, Σ_ββ)),   (14)

where Σ_ββ denotes 2(σ_u²)²/E_φ0[x_i⁴].

Figure 1: Asymptotic and finite sample densities of β̂_n = σ̂²_η for different values of b.

Figure 1 shows the asymptotic and finite sample densities of β̂_n = σ̂²_η for different values of b = √n β_n. b takes on the values 0, 3, and 6 from left to right, which corresponds to β_n taking on the values 0, .15, and .3, respectively.20

3) β = σ²_u - a, δ = σ²_η, and ξ = µ_η: Since β_n → 0 and δ_n → 0, we have that θ^0 = (0, 0, ξ^0) = (0, 0, µ_η^0). The Fisher Information, V(γ^0), is given by

E_φ0 [ 1/(2(σ_u²)²)         x_i²/(2(σ_u²)²)      0
       x_i²/(2(σ_u²)²)      x_i⁴/(2(σ_u²)²)      0
       0                    0                    x_i²/σ_u² ].

20 In fact, the asymptotic approximation is obtained by evaluating V(γ^0) at β equal to 0, .15, and .3 rather than 0, 0, and 0. This provides a much better finite sample approximation and reflects common practice, where plug-in estimates of V(γ^0) are utilized. This also holds true for Figure 2 below.

The asymptotic distribution of √n(β̂_n - β_n) can be deduced from that of √n β̂_n, which is given by the distribution of Y_b - Σ_βδ Σ_δδ^{-1} min(0, Y_d), where

(Y_b, Y_d) ~ N((b, d), Σ) with Σ = [ Σ_ββ, Σ_βδ ; Σ_βδ, Σ_δδ ].

Here, Σ denotes the inverse of

E_φ0 [ 1/(2(σ_u²)²)         x_i²/(2(σ_u²)²)
       x_i²/(2(σ_u²)²)      x_i⁴/(2(σ_u²)²) ].

Figure 2: Asymptotic and finite sample densities of β̂_n = σ̂²_u for different values of d.

Figure 2 shows the asymptotic and finite sample densities of β̂_n = σ̂²_u for different values of d = √n δ_n. d takes on the values 0, 3, and 6 from left to right, which corresponds to δ_n taking on the values 0, .15, and .3, respectively. The Monte Carlo results illustrate that the finite sample distribution is well approximated by the asymptotic distribution derived above.
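A Figure 1 style comparison for ordering 2 can be reproduced along the following lines: simulate the model, re-estimate σ²_η by constrained ML in each replication, and compare the finite sample draws of √n σ̂²_η with draws from max(0, N(b, Σ_ββ)) as in (14). The sketch below is my own illustration (Python); the design values, the number of replications, and the lower bound imposed on σ²_u in the optimizer are placeholder assumptions rather than the paper's exact settings.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, y, x):
    """Average negative conditional log likelihood, theta = (mu_eta, sigma2_eta, sigma2_u)."""
    mu, s2e, s2u = theta
    s2 = s2u + x**2 * s2e
    return np.mean(0.5 * np.log(2 * np.pi * s2) + (y - x * mu)**2 / (2 * s2))

def mc_sqrt_n_s2eta_hat(b, n=400, reps=2000, mu_eta=1.0, s2_u=1.0, seed=0):
    """Finite sample draws of sqrt(n) * sigma2_eta_hat when sigma2_eta_n = b / sqrt(n)."""
    rng = np.random.default_rng(seed)
    s2_eta_n = b / np.sqrt(n)
    out = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(1.0, 2.0, n)                               # placeholder design for x_i
        y = x * rng.normal(mu_eta, np.sqrt(s2_eta_n), n) + rng.normal(0.0, np.sqrt(s2_u), n)
        res = minimize(neg_loglik, np.array([mu_eta, 0.05, s2_u]), args=(y, x),
                       bounds=[(None, None), (0.0, None), (0.05, None)], method="L-BFGS-B")
        out[r] = np.sqrt(n) * res.x[1]
    return out

# Asymptotic counterpart from (14): max(0, N(b, Sigma_bb)) with Sigma_bb = 2 (sigma_u^2)^2 / E[x_i^4].
b = 3.0
rng = np.random.default_rng(1)
Sigma_bb = 2.0 / np.mean(rng.uniform(1.0, 2.0, 200000) ** 4)       # sigma_u^2 = 1
asy = np.maximum(0.0, rng.normal(b, np.sqrt(Sigma_bb), 200000))
fin = mc_sqrt_n_s2eta_hat(b)                                       # compare histograms of fin and asy
```

A plug-in refinement of Σ_ββ along the lines of the footnote above (evaluating the information at the drifting value of σ²_η rather than at 0) would bring the asymptotic density even closer to the finite sample one.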

The reason why the asymptotic distribution of a constrained extremum estimator is not normal is that the estimator is restricted to the parameter space. If it were possible to obtain an unrestricted estimator, then that estimator would be asymptotically normal under standard regularity conditions. But as seen in Example 2, an unrestricted estimator is not always available.

3.2 Asymptotic distribution of modified extremum estimator

In the preceding section, we showed that constrained extremum estimators are not asymptotically normally distributed when the true parameter vector is near or at the boundary of the parameter space. In this section, we show that it is possible to obtain an asymptotically normal estimator, even if the objective function is not defined outside the parameter space. As mentioned above, this is useful as it broadens the applicability of testing procedures defined in the Gaussian shift experiment.

In order to give some intuition for the estimator proposed below, consider the asymptotic distribution of the estimator of σ²_η in Example 2 given in equation (14). It is given by a normal distribution truncated below at 0. The truncation of the asymptotic distribution results from the restriction on the parameter space, which prevents the estimator from taking on negative values. If an unrestricted estimator were available, it would be asymptotically distributed as the underlying normal random variable, N(β, ω_ββ). Our objective is, thus, to construct an estimator that behaves like an unrestricted estimator.

The distributional result in the previous section is obtained by showing that constrained extremum estimators behave asymptotically like minimizers of quadratic functions over a strict subset of the Euclidean space, the local parameter space. A quadratic function, in contrast to the original objective function, is always defined over the entire Euclidean space and can, therefore, also be minimized over the entire Euclidean space. The proposed estimator is given by the unrestricted minimizer of a quadratic function that approximates the original objective function.

Another way of motivating the estimator is as follows. When the estimate is at the boundary of the parameter space, we know that the estimate does not satisfy the first order condition of the optimization problem. But intuitively, there is a point outside the parameter space that would satisfy the first order condition, if that point were allowed in the sense of the objective function being well defined at that point or, more precisely, in an open set around that point. This point is approximated by the minimum of the quadratic function that approximates the objective function and that passes through the estimate at the boundary. Figure 3 illustrates the above intuition graphically, where the quadratic approximation is


More information

Economics 583: Econometric Theory I A Primer on Asymptotics: Hypothesis Testing

Economics 583: Econometric Theory I A Primer on Asymptotics: Hypothesis Testing Economics 583: Econometric Theory I A Primer on Asymptotics: Hypothesis Testing Eric Zivot October 12, 2011 Hypothesis Testing 1. Specify hypothesis to be tested H 0 : null hypothesis versus. H 1 : alternative

More information

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided Let us first identify some classes of hypotheses. simple versus simple H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided H 0 : θ θ 0 versus H 1 : θ > θ 0. (2) two-sided; null on extremes H 0 : θ θ 1 or

More information

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Vadim Marmer University of British Columbia Artyom Shneyerov CIRANO, CIREQ, and Concordia University August 30, 2010 Abstract

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction Sankhyā : The Indian Journal of Statistics 2007, Volume 69, Part 4, pp. 700-716 c 2007, Indian Statistical Institute More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order

More information

A732: Exercise #7 Maximum Likelihood

A732: Exercise #7 Maximum Likelihood A732: Exercise #7 Maximum Likelihood Due: 29 Novemer 2007 Analytic computation of some one-dimensional maximum likelihood estimators (a) Including the normalization, the exponential distriution function

More information

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,

More information

Obtaining Critical Values for Test of Markov Regime Switching

Obtaining Critical Values for Test of Markov Regime Switching University of California, Santa Barbara From the SelectedWorks of Douglas G. Steigerwald November 1, 01 Obtaining Critical Values for Test of Markov Regime Switching Douglas G Steigerwald, University of

More information

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider

More information

FIDUCIAL INFERENCE: AN APPROACH BASED ON BOOTSTRAP TECHNIQUES

FIDUCIAL INFERENCE: AN APPROACH BASED ON BOOTSTRAP TECHNIQUES U.P.B. Sci. Bull., Series A, Vol. 69, No. 1, 2007 ISSN 1223-7027 FIDUCIAL INFERENCE: AN APPROACH BASED ON BOOTSTRAP TECHNIQUES H.-D. HEIE 1, C-tin TÂRCOLEA 2, Adina I. TARCOLEA 3, M. DEMETRESCU 4 În prima

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han Victoria University of Wellington New Zealand Robert de Jong Ohio State University U.S.A October, 2003 Abstract This paper considers Closest

More information

A Note on Demand Estimation with Supply Information. in Non-Linear Models

A Note on Demand Estimation with Supply Information. in Non-Linear Models A Note on Demand Estimation with Supply Information in Non-Linear Models Tongil TI Kim Emory University J. Miguel Villas-Boas University of California, Berkeley May, 2018 Keywords: demand estimation, limited

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Testing Error Correction in Panel data

Testing Error Correction in Panel data University of Vienna, Dept. of Economics Master in Economics Vienna 2010 The Model (1) Westerlund (2007) consider the following DGP: y it = φ 1i + φ 2i t + z it (1) x it = x it 1 + υ it (2) where the stochastic

More information

Demand in Differentiated-Product Markets (part 2)

Demand in Differentiated-Product Markets (part 2) Demand in Differentiated-Product Markets (part 2) Spring 2009 1 Berry (1994): Estimating discrete-choice models of product differentiation Methodology for estimating differentiated-product discrete-choice

More information

Partial Identification and Confidence Intervals

Partial Identification and Confidence Intervals Partial Identification and Confidence Intervals Jinyong Hahn Department of Economics, UCLA Geert Ridder Department of Economics, USC September 17, 009 Abstract We consider statistical inference on a single

More information

Proofs for Large Sample Properties of Generalized Method of Moments Estimators

Proofs for Large Sample Properties of Generalized Method of Moments Estimators Proofs for Large Sample Properties of Generalized Method of Moments Estimators Lars Peter Hansen University of Chicago March 8, 2012 1 Introduction Econometrica did not publish many of the proofs in my

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Weak bidders prefer first-price (sealed-bid) auctions. (This holds both ex-ante, and once the bidders have learned their types)

Weak bidders prefer first-price (sealed-bid) auctions. (This holds both ex-ante, and once the bidders have learned their types) Econ 805 Advanced Micro Theory I Dan Quint Fall 2007 Lecture 9 Oct 4 2007 Last week, we egan relaxing the assumptions of the symmetric independent private values model. We examined private-value auctions

More information

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III)

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December

More information

#A50 INTEGERS 14 (2014) ON RATS SEQUENCES IN GENERAL BASES

#A50 INTEGERS 14 (2014) ON RATS SEQUENCES IN GENERAL BASES #A50 INTEGERS 14 (014) ON RATS SEQUENCES IN GENERAL BASES Johann Thiel Dept. of Mathematics, New York City College of Technology, Brooklyn, New York jthiel@citytech.cuny.edu Received: 6/11/13, Revised:

More information

Online Supplementary Appendix B

Online Supplementary Appendix B Online Supplementary Appendix B Uniqueness of the Solution of Lemma and the Properties of λ ( K) We prove the uniqueness y the following steps: () (A8) uniquely determines q as a function of λ () (A) uniquely

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

Spiking problem in monotone regression : penalized residual sum of squares

Spiking problem in monotone regression : penalized residual sum of squares Spiking prolem in monotone regression : penalized residual sum of squares Jayanta Kumar Pal 12 SAMSI, NC 27606, U.S.A. Astract We consider the estimation of a monotone regression at its end-point, where

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Chapter 8 Maximum Likelihood Estimation 8. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function

More information

Econ 583 Final Exam Fall 2008

Econ 583 Final Exam Fall 2008 Econ 583 Final Exam Fall 2008 Eric Zivot December 11, 2008 Exam is due at 9:00 am in my office on Friday, December 12. 1 Maximum Likelihood Estimation and Asymptotic Theory Let X 1,...,X n be iid random

More information

Asymptotics for Nonlinear GMM

Asymptotics for Nonlinear GMM Asymptotics for Nonlinear GMM Eric Zivot February 13, 2013 Asymptotic Properties of Nonlinear GMM Under standard regularity conditions (to be discussed later), it can be shown that where ˆθ(Ŵ) θ 0 ³ˆθ(Ŵ)

More information

Expansion formula using properties of dot product (analogous to FOIL in algebra): u v 2 u v u v u u 2u v v v u 2 2u v v 2

Expansion formula using properties of dot product (analogous to FOIL in algebra): u v 2 u v u v u u 2u v v v u 2 2u v v 2 Least squares: Mathematical theory Below we provide the "vector space" formulation, and solution, of the least squares prolem. While not strictly necessary until we ring in the machinery of matrix algera,

More information

On the Power of Tests for Regime Switching

On the Power of Tests for Regime Switching On the Power of Tests for Regime Switching joint work with Drew Carter and Ben Hansen Douglas G. Steigerwald UC Santa Barbara May 2015 D. Steigerwald (UCSB) Regime Switching May 2015 1 / 42 Motivating

More information

Graduate Econometrics I: Maximum Likelihood I

Graduate Econometrics I: Maximum Likelihood I Graduate Econometrics I: Maximum Likelihood I Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood

More information

Nearly Optimal Tests when a Nuisance Parameter is Present Under the Null Hypothesis

Nearly Optimal Tests when a Nuisance Parameter is Present Under the Null Hypothesis Nearly Optimal Tests when a Nuisance Parameter is Present Under the Null Hypothesis Graham Elliott UCSD Ulrich K. Müller Princeton University Mark W. Watson Princeton University and NBER January 2012 (Revised

More information

The variance for partial match retrievals in k-dimensional bucket digital trees

The variance for partial match retrievals in k-dimensional bucket digital trees The variance for partial match retrievals in k-dimensional ucket digital trees Michael FUCHS Department of Applied Mathematics National Chiao Tung University January 12, 21 Astract The variance of partial

More information

Comparison of inferential methods in partially identified models in terms of error in coverage probability

Comparison of inferential methods in partially identified models in terms of error in coverage probability Comparison of inferential methods in partially identified models in terms of error in coverage probability Federico A. Bugni Department of Economics Duke University federico.bugni@duke.edu. September 22,

More information

1Number ONLINE PAGE PROOFS. systems: real and complex. 1.1 Kick off with CAS

1Number ONLINE PAGE PROOFS. systems: real and complex. 1.1 Kick off with CAS 1Numer systems: real and complex 1.1 Kick off with CAS 1. Review of set notation 1.3 Properties of surds 1. The set of complex numers 1.5 Multiplication and division of complex numers 1.6 Representing

More information

Ch. 5 Hypothesis Testing

Ch. 5 Hypothesis Testing Ch. 5 Hypothesis Testing The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s, early 30s, complementing Fisher s work on estimation. As in estimation,

More information

DSGE Methods. Estimation of DSGE models: GMM and Indirect Inference. Willi Mutschler, M.Sc.

DSGE Methods. Estimation of DSGE models: GMM and Indirect Inference. Willi Mutschler, M.Sc. DSGE Methods Estimation of DSGE models: GMM and Indirect Inference Willi Mutschler, M.Sc. Institute of Econometrics and Economic Statistics University of Münster willi.mutschler@wiwi.uni-muenster.de Summer

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

CHAPTER 5. Linear Operators, Span, Linear Independence, Basis Sets, and Dimension

CHAPTER 5. Linear Operators, Span, Linear Independence, Basis Sets, and Dimension A SERIES OF CLASS NOTES TO INTRODUCE LINEAR AND NONLINEAR PROBLEMS TO ENGINEERS, SCIENTISTS, AND APPLIED MATHEMATICIANS LINEAR CLASS NOTES: A COLLECTION OF HANDOUTS FOR REVIEW AND PREVIEW OF LINEAR THEORY

More information

Fall, 2007 Nonlinear Econometrics. Theory: Consistency for Extremum Estimators. Modeling: Probit, Logit, and Other Links.

Fall, 2007 Nonlinear Econometrics. Theory: Consistency for Extremum Estimators. Modeling: Probit, Logit, and Other Links. 14.385 Fall, 2007 Nonlinear Econometrics Lecture 2. Theory: Consistency for Extremum Estimators Modeling: Probit, Logit, and Other Links. 1 Example: Binary Choice Models. The latent outcome is defined

More information

On the econometrics of the Koyck model

On the econometrics of the Koyck model On the econometrics of the Koyck model Philip Hans Franses and Rutger van Oest Econometric Institute, Erasmus University Rotterdam P.O. Box 1738, NL-3000 DR, Rotterdam, The Netherlands Econometric Institute

More information

Revisiting the Nested Fixed-Point Algorithm in BLP Random Coeffi cients Demand Estimation

Revisiting the Nested Fixed-Point Algorithm in BLP Random Coeffi cients Demand Estimation Revisiting the Nested Fixed-Point Algorithm in BLP Random Coeffi cients Demand Estimation Jinhyuk Lee Kyoungwon Seo September 9, 016 Abstract This paper examines the numerical properties of the nested

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

Stochastic Convergence, Delta Method & Moment Estimators

Stochastic Convergence, Delta Method & Moment Estimators Stochastic Convergence, Delta Method & Moment Estimators Seminar on Asymptotic Statistics Daniel Hoffmann University of Kaiserslautern Department of Mathematics February 13, 2015 Daniel Hoffmann (TU KL)

More information

Inference for identifiable parameters in partially identified econometric models

Inference for identifiable parameters in partially identified econometric models Journal of Statistical Planning and Inference 138 (2008) 2786 2807 www.elsevier.com/locate/jspi Inference for identifiable parameters in partially identified econometric models Joseph P. Romano a,b,, Azeem

More information

Specification Test on Mixed Logit Models

Specification Test on Mixed Logit Models Specification est on Mixed Logit Models Jinyong Hahn UCLA Jerry Hausman MI December 1, 217 Josh Lustig CRA Abstract his paper proposes a specification test of the mixed logit models, by generalizing Hausman

More information

Luis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model

Luis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model Luis Manuel Santana Gallego 100 Appendix 3 Clock Skew Model Xiaohong Jiang and Susumu Horiguchi [JIA-01] 1. Introduction The evolution of VLSI chips toward larger die sizes and faster clock speeds makes

More information

Introduction to Maximum Likelihood Estimation

Introduction to Maximum Likelihood Estimation Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

Economics 583: Econometric Theory I A Primer on Asymptotics

Economics 583: Econometric Theory I A Primer on Asymptotics Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:

More information

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE Chapter 6. Panel Data Joan Llull Quantitative Statistical Methods II Barcelona GSE Introduction Chapter 6. Panel Data 2 Panel data The term panel data refers to data sets with repeated observations over

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Testing Predictive Ability and Power Robusti cation

Testing Predictive Ability and Power Robusti cation Testing Predictive Aility and Power Rousti cation Kyungchul Song University of British Columia, Department of Economics, 997-1873 East Mall, Vancouver, BC, V6T 1Z1, Canada. (kysong@mail.uc.ca) January

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

BOUSSINESQ-TYPE MOMENTUM EQUATIONS SOLUTIONS FOR STEADY RAPIDLY VARIED FLOWS. Yebegaeshet T. Zerihun 1 and John D. Fenton 2

BOUSSINESQ-TYPE MOMENTUM EQUATIONS SOLUTIONS FOR STEADY RAPIDLY VARIED FLOWS. Yebegaeshet T. Zerihun 1 and John D. Fenton 2 ADVANCES IN YDRO-SCIENCE AND ENGINEERING, VOLUME VI BOUSSINESQ-TYPE MOMENTUM EQUATIONS SOLUTIONS FOR STEADY RAPIDLY VARIED FLOWS Yeegaeshet T. Zerihun and John D. Fenton ABSTRACT The depth averaged Saint-Venant

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

CCP Estimation. Robert A. Miller. March Dynamic Discrete Choice. Miller (Dynamic Discrete Choice) cemmap 6 March / 27

CCP Estimation. Robert A. Miller. March Dynamic Discrete Choice. Miller (Dynamic Discrete Choice) cemmap 6 March / 27 CCP Estimation Robert A. Miller Dynamic Discrete Choice March 2018 Miller Dynamic Discrete Choice) cemmap 6 March 2018 1 / 27 Criteria for Evaluating Estimators General principles to apply when assessing

More information

16/018. Efficiency Gains in Rank-ordered Multinomial Logit Models. June 13, 2016

16/018. Efficiency Gains in Rank-ordered Multinomial Logit Models. June 13, 2016 16/018 Efficiency Gains in Rank-ordered Multinomial Logit Models Arie Beresteanu and Federico Zincenko June 13, 2016 Efficiency Gains in Rank-ordered Multinomial Logit Models Arie Beresteanu and Federico

More information

Estimation, Inference, and Hypothesis Testing

Estimation, Inference, and Hypothesis Testing Chapter 2 Estimation, Inference, and Hypothesis Testing Note: The primary reference for these notes is Ch. 7 and 8 of Casella & Berger 2. This text may be challenging if new to this topic and Ch. 7 of

More information

Verifying Regularity Conditions for Logit-Normal GLMM

Verifying Regularity Conditions for Logit-Normal GLMM Verifying Regularity Conditions for Logit-Normal GLMM Yun Ju Sung Charles J. Geyer January 10, 2006 In this note we verify the conditions of the theorems in Sung and Geyer (submitted) for the Logit-Normal

More information

1 Hoeffding s Inequality

1 Hoeffding s Inequality Proailistic Method: Hoeffding s Inequality and Differential Privacy Lecturer: Huert Chan Date: 27 May 22 Hoeffding s Inequality. Approximate Counting y Random Sampling Suppose there is a ag containing

More information

Bayesian inference with reliability methods without knowing the maximum of the likelihood function

Bayesian inference with reliability methods without knowing the maximum of the likelihood function Bayesian inference with reliaility methods without knowing the maximum of the likelihood function Wolfgang Betz a,, James L. Beck, Iason Papaioannou a, Daniel Strau a a Engineering Risk Analysis Group,

More information

ECON 4160, Autumn term Lecture 1

ECON 4160, Autumn term Lecture 1 ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

Testing Restrictions and Comparing Models

Testing Restrictions and Comparing Models Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by

More information

Introduction to Machine Learning. Lecture 2

Introduction to Machine Learning. Lecture 2 Introduction to Machine Learning Lecturer: Eran Halperin Lecture 2 Fall Semester Scribe: Yishay Mansour Some of the material was not presented in class (and is marked with a side line) and is given for

More information

1 Procedures robust to weak instruments

1 Procedures robust to weak instruments Comment on Weak instrument robust tests in GMM and the new Keynesian Phillips curve By Anna Mikusheva We are witnessing a growing awareness among applied researchers about the possibility of having weak

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

A Primer on Asymptotics

A Primer on Asymptotics A Primer on Asymptotics Eric Zivot Department of Economics University of Washington September 30, 2003 Revised: October 7, 2009 Introduction The two main concepts in asymptotic theory covered in these

More information

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak.

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak. Large Sample Theory Large Sample Theory is a name given to the search for approximations to the behaviour of statistical procedures which are derived by computing limits as the sample size, n, tends to

More information

Real option valuation for reserve capacity

Real option valuation for reserve capacity Real option valuation for reserve capacity MORIARTY, JM; Palczewski, J doi:10.1016/j.ejor.2016.07.003 For additional information aout this pulication click this link. http://qmro.qmul.ac.uk/xmlui/handle/123456789/13838

More information

Conditional Linear Combination Tests for Weakly Identified Models

Conditional Linear Combination Tests for Weakly Identified Models Conditional Linear Combination Tests for Weakly Identified Models Isaiah Andrews JOB MARKET PAPER Abstract This paper constructs powerful tests applicable in a wide range of weakly identified contexts,

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Likelihood-Based Methods

Likelihood-Based Methods Likelihood-Based Methods Handbook of Spatial Statistics, Chapter 4 Susheela Singh September 22, 2016 OVERVIEW INTRODUCTION MAXIMUM LIKELIHOOD ESTIMATION (ML) RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION (REML)

More information

Instrumental Variables Estimation and Weak-Identification-Robust. Inference Based on a Conditional Quantile Restriction

Instrumental Variables Estimation and Weak-Identification-Robust. Inference Based on a Conditional Quantile Restriction Instrumental Variables Estimation and Weak-Identification-Robust Inference Based on a Conditional Quantile Restriction Vadim Marmer Department of Economics University of British Columbia vadim.marmer@gmail.com

More information