On the equivalence of confidence interval estimation based on frequentist model averaging and least-squares of the full model in linear regression


Working Paper 2016:1, Department of Statistics. On the equivalence of confidence interval estimation based on frequentist model averaging and least-squares of the full model in linear regression. Sebastian Ankargren and Shaobo Jin

Working Paper 2016:1
April 2016
Department of Statistics, Uppsala University, Box 513, SE UPPSALA, SWEDEN
Working papers can be downloaded from
Title: On the equivalence of confidence interval estimation based on frequentist model averaging and least-squares for the full model in linear regression
Author: Sebastian Ankargren & Shaobo Jin

On the equivalence of confidence interval estimation based on frequentist model averaging and least squares of the full model in linear regression

Sebastian Ankargren and Shaobo Jin
Department of Statistics, Uppsala University
sebastian.ankargren@statistics.uu.se, shaobo.jin@statistics.uu.se
April 5, 2016

Abstract

In many applications of linear regression models, model selection is vital. However, randomness due to model selection is commonly ignored in post-model selection inference. In order to account for the model selection uncertainty in these linear models, least squares frequentist model averaging has been proposed recently. In this paper, we show that the confidence interval from model averaging is asymptotically equivalent to the confidence interval from the full model. Furthermore, we demonstrate that this equivalence also holds in finite samples if the parameter of interest is a linear function of the regression coefficients.

Keywords: Asymptotic equivalence; Linear model; Local asymptotics; Model selection uncertainty; Post-selection inference

Mathematics Subject Classification 2000: 62E20; 62J05; 62J99; 62F12

1 Introduction

In many situations in applied statistics, there is no model structure known to the researcher, but a set of plausible alternatives. Commonly, to pick one model from this set of candidates, information criteria such as AIC (Akaike, 1974) and BIC (Schwarz, 1978) are used. This can have detrimental effects on subsequent inference since the model selection step itself is stochastic and thus subject to uncertainty, an uncertainty which inference post-model selection usually fails to account for. This issue is studied in detail by e.g. Pötscher (1991), Kabaila (1995), Kabaila and Leeb (2006) and Leeb and Pötscher (2005); also see Berk et al. (2013) for a promising way of doing inference post-model selection.

Frequentist model averaging is often motivated by this inability of post-selection inference to account for the randomness of the selection step, with the argument that averaging may serve as a better, more truthful alternative. For example, it is claimed that model averaging compromises across a set of competing models, and in doing so, incorporates model uncertainty into the conclusions about the unknown parameters (Wan et al., 2010, p.). Hjort and Claeskens (2003) developed a frequentist model averaging machinery under the likelihood framework. They proposed a confidence interval that captures the randomness in model selection. However, Wang and Zhou (2013) showed that the proposed interval is asymptotically equivalent to the confidence interval obtained from the full model. They also showed that the finite sample confidence intervals are equivalent if the parameter of interest is a linear combination of the regression coefficients of a varying-coefficient partially linear model, which the authors regard as the semi-parametric framework.

The focus of this paper is the linear model using least squares. Least squares-based model averaging has previously been studied by, for example, Hansen (2007), Wan et al. (2010), Hansen and Racine (2012), Liu and Okui (2013), Ando and Li (2014) and Cheng et al. (2015), with a focus on prediction.
Liu (2015) developed asymptotic distribution theory for frequentist model averaging estimators under the least squares framework, including results for Mallows and jackknife model averaging (Hansen, 2007; Hansen and Racine, 2012), and proposed a confidence interval in the spirit of Hjort and Claeskens (2003). In this article, we show that the inference about the unknown parameters based on the distributional theory of Liu (2015) is in fact equivalent to the inference obtained from the model that includes all possible covariates. Depending on the nature of the parameter of interest, the equivalence holds either asymptotically only, or both in finite samples and asymptotically. From this perspective, there is no additional gain in turning to model averaging. Thus, this paper extends the work by Kabaila and Leeb (2006) and Wang and Zhou (2013) in that their criticism of interval estimation based on Hjort and Claeskens (2003) is also valid for interval estimation in linear models based on Liu (2015).

The remainder of the paper is organized as follows. Section 2 briefly reviews the key results in Liu (2015), Section 3 discusses the equivalence of the confidence intervals and Section 4 concludes. Proofs are placed in Appendix A.

2 Frequentist model averaging

In this section we briefly review the necessary aspects of Liu (2015). Assume the linear regression model

$$y = X\beta + Z\gamma + e, \tag{1}$$

in which $y = (y_1,\dots,y_n)'$ is the $n \times 1$ response vector, $X = (x_1,\dots,x_n)'$ the $n \times p$ matrix of core regressors, $Z = (z_1,\dots,z_n)'$ the $n \times q$ matrix of auxiliary regressors, $e = (e_1,\dots,e_n)'$ the $n \times 1$ vector of errors, and $\beta$ and $\gamma$ are the $p \times 1$ and $q \times 1$ parameter vectors. By the formulation of the model, the core regressors in $X$ are necessarily included in all models, whereas those in $Z$ are potential regressors subject to scrutiny. This does not restrict applications, however, as $X$ may be an empty matrix or simply a vector of ones such that the intercept is always included. The error term is assumed to have a zero conditional mean, $E(e_i \mid x_i, z_i) = 0$, but is allowed to be both heteroskedastic and autocorrelated. Collect the regressors and the parameters in $H = (X, Z)$ and $\theta = (\beta', \gamma')'$, respectively. The regression model (1) may then be written as $y = H\theta + e$.

Suppose now that there is a set of $M$ candidate models. If this set consists of all nested models, then $M = q + 1$, and if it consists of all submodels, then $M = 2^q$. Whatever the choice of model set, each model $m$ is assumed to contain a unique combination of $0 \le q_m \le q$ regressors from $Z$ such that the $m$th model's subset of auxiliary regressors can be written as $Z_m = Z\Pi_m$, where $Z_m$ is $n \times q_m$ and $\Pi_m$ is a $q \times q_m$ selection matrix. Since all models also include the full set of core regressors $X$, model $m$ includes in total $p + q_m$ regressors. For ease of notation and with no loss of generality, let the full model be ordered last in the set of models such that model $M$ is the full model including all regressors.

The least squares estimator of $\theta$ in the full model is $\hat\theta = (H'H)^{-1}H'y$, whereas for submodel $m$, the estimator of $\theta_m = (\beta', \gamma_m')' = (\beta', \gamma'\Pi_m)'$, a $(p + q_m) \times 1$ vector, is

$$\hat\theta_m = (H_m'H_m)^{-1}H_m'y, \tag{2}$$

where $H_m = (X, Z_m) = (X, Z\Pi_m)$. For later use we also define the zero-padded vector $\tilde\theta_m = (\beta', \gamma_m'\Pi_m')'$, a $(p + q) \times 1$ vector in which zeros are placed in the positions corresponding to the variables excluded from model $m$. Similarly, $\hat{\tilde\theta}_m = (\hat\beta_m', \hat{\tilde\gamma}_m')' = (\hat\beta_m', \hat\gamma_m'\Pi_m')'$, where $\hat{\tilde\theta}_m$ is $(p + q) \times 1$ and $\hat\gamma_m$ consists of the last $q_m$ elements of $\hat\theta_m$ in (2). (The tilde marks the padded versions; the distinguishing accent is not recoverable from the source text.)

For the asymptotic analysis, local asymptotics are employed. This framework entails the following assumption.

Assumption 1. $\gamma = \gamma_n = \delta/\sqrt{n}$, where $\delta$ is an unknown local parameter vector.
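The set-up above can be sketched numerically. The following is a minimal illustration, not code from the paper: the helper names `selection_matrix` and `submodel_ls`, the simulated data, and the value of the local parameter are all assumptions made for the example. It builds a $q \times q_m$ selection matrix, fits a submodel by least squares as in (2), and pads the estimate with zeros for the excluded auxiliary regressors.

```python
import numpy as np
from itertools import combinations

def selection_matrix(included, q):
    """q x q_m matrix Pi_m whose columns pick the included columns of Z (hypothetical helper)."""
    included = list(included)
    Pi = np.zeros((q, len(included)))
    for col, row in enumerate(included):
        Pi[row, col] = 1.0
    return Pi

def submodel_ls(y, X, Z, Pi):
    """Least squares in submodel m: returns the (p+q_m)-vector and its zero-padded (p+q)-vector."""
    p = X.shape[1]
    H_m = np.hstack([X, Z @ Pi])                        # H_m = (X, Z Pi_m)
    theta_m, *_ = np.linalg.lstsq(H_m, y, rcond=None)   # solves the normal equations of (2)
    beta_m, gamma_m = theta_m[:p], theta_m[p:]
    theta_padded = np.concatenate([beta_m, Pi @ gamma_m])  # zeros for excluded regressors
    return theta_m, theta_padded

# Simulated data under the local-asymptotic assumption gamma = delta / sqrt(n).
rng = np.random.default_rng(0)
n, p, q = 200, 2, 3
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, q))
delta = np.array([1.0, -0.5, 2.0])                      # hypothetical local parameter
y = X @ np.array([1.0, 2.0]) + Z @ (delta / np.sqrt(n)) + rng.normal(size=n)

Pi_1 = selection_matrix([0, 2], q)                      # submodel keeping z_1 and z_3
theta_1, theta_1_padded = submodel_ls(y, X, Z, Pi_1)
theta_full, theta_full_padded = submodel_ls(y, X, Z, np.eye(q))

# Model counts: all subsets of Z give 2^q models, nested models give q + 1.
all_subsets = [c for k in range(q + 1) for c in combinations(range(q), k)]
nested = [tuple(range(k)) for k in range(q + 1)]
```

For the full model the selection matrix is the identity, so the padded and unpadded estimates coincide.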
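Since the omitted variable bias in all models but the full model is of order $O(1/\sqrt{n})$ under Assumption 1, $\sqrt{n}(\hat\theta_m - \theta_m)$ does not diverge. Instead, we have the following:

$$\sqrt{n}(\hat\theta - \theta) \xrightarrow{d} N(0, Q^{-1}\Omega Q^{-1}), \qquad \sqrt{n}(\hat\theta_m - \theta_m) \xrightarrow{d} N(A_m\delta, Q_m^{-1}\Omega_m Q_m^{-1}), \tag{3}$$

where $A_m = Q_m^{-1}S_m'QS_0(I_q - \Pi_m\Pi_m')$ and

$$S_0 = \begin{pmatrix} 0_{p \times q} \\ I_q \end{pmatrix}, \qquad S_m = \begin{pmatrix} I_p & 0_{p \times q_m} \\ 0_{q \times p} & \Pi_m \end{pmatrix},$$

$$Q = \begin{pmatrix} Q_{xx} & Q_{xz} \\ Q_{zx} & Q_{zz} \end{pmatrix} = \begin{pmatrix} E(x_i x_i') & E(x_i z_i') \\ E(z_i x_i') & E(z_i z_i') \end{pmatrix},$$

$$\Omega = \begin{pmatrix} \Omega_{xx} & \Omega_{xz} \\ \Omega_{zx} & \Omega_{zz} \end{pmatrix} = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} \begin{pmatrix} E(x_i x_j' e_i e_j) & E(x_i z_j' e_i e_j) \\ E(z_i x_j' e_i e_j) & E(z_i z_j' e_i e_j) \end{pmatrix},$$

$$Q_m = S_m'QS_m, \qquad \Omega_m = S_m'\Omega S_m.$$

The limiting distributions in (3) are results of the following assumption.

Assumption 2. As $n \to \infty$, $n^{-1}H'H \xrightarrow{p} Q$ and $n^{-1/2}H'e \xrightarrow{d} R \sim N(0, \Omega)$.

Suppose the ultimate goal is to make inference regarding the true value of a known, smooth function $\mu : \mathbb{R}^{p+q} \to \mathbb{R}$ evaluated at the unknown parameters; that is, we are interested in the true value of $\mu(\theta)$. This is called the focus parameter. The model averaging estimator is a weighted average of the estimators of the focus parameter $\mu(\theta)$ in the individual submodels:

$$\bar\mu = \sum_{m=1}^{M} w(m \mid \hat\delta)\,\hat\mu_m,$$

where $w(m \mid \hat\delta)$ denotes the data-dependent weight for model $m$ and $\hat\mu_m$ should be understood as $\hat\mu_m := \mu(\hat\beta_m, \Pi_m\hat\gamma_m)$, i.e. the function $\mu$ at the estimated values of $\beta$ and $\gamma_m$, with the $q - q_m$ elements in $\gamma$ not included in $\gamma_m$ set to 0. The local parameter $\delta$ cannot be estimated consistently, but it can be estimated in an asymptotically unbiased way, in the sense that

$$\hat\delta = \sqrt{n}\,\hat\gamma \xrightarrow{d} R_\delta = \delta + S_0'Q^{-1}R \sim N(\delta, S_0'Q^{-1}\Omega Q^{-1}S_0).$$

Let $D_\theta$ be the derivative of $\mu$ with respect to the full parameter vector $\theta$ evaluated at the null points $(\beta', 0')'$. The key theorem in Liu (2015) underlying the results for a frequentist model averaging confidence interval for $\mu$ is stated next.

Theorem 1 (Liu, 2015). If $w(m \mid \hat\delta) \xrightarrow{d} w(m \mid R_\delta)$ as $n \to \infty$ and Assumptions 1-2 hold, then

$$\sqrt{n}(\bar\mu - \mu) \xrightarrow{d} D_\theta'Q^{-1}R + D_\theta'\left(\sum_{m=1}^{M} w(m \mid R_\delta)\,C_m\right)R_\delta,$$

where $C_m = (P_mQ - I_{p+q})S_0$ and $P_m = S_m(S_m'QS_m)^{-1}S_m'$.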

Proof. See Liu (2015).

The proposed confidence interval rests on the idea of replacing limit quantities by finite sample counterparts. Thus, for sufficiently large $n$, it should be that

$$\frac{\sqrt{n}(\bar\mu - \mu) - \hat D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,\hat C_m\hat\delta}{\hat\kappa} \overset{\text{app.}}{\sim} N(0, 1),$$

where $\hat\kappa$ is a consistent estimator of $\kappa = \sqrt{D_\theta'Q^{-1}\Omega Q^{-1}D_\theta}$. Therefore, a confidence interval with asymptotic coverage of $100(1 - \alpha)\%$ is

$$\left[\bar\mu - b(\hat\delta) - z_{1-\alpha/2}\frac{\hat\kappa}{\sqrt{n}},\; \bar\mu - b(\hat\delta) + z_{1-\alpha/2}\frac{\hat\kappa}{\sqrt{n}}\right], \tag{4}$$

where $b(\hat\delta) = \hat D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,\hat C_m\hat\gamma$ and $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution. The factor $1/\sqrt{n}$ follows from the normal approximation above together with $\hat\delta = \sqrt{n}\,\hat\gamma$.

3 Equivalence of confidence intervals

In this section we establish the equivalence of frequentist model averaging and full model confidence intervals.

3.1 Asymptotic equivalence

Consider the case where there is a fixed model and this model is the full model. By (3) and the delta method, the asymptotic distribution of the estimator of the focus parameter is

$$\sqrt{n}\left(\mu(\hat\theta) - \mu(\theta)\right) \xrightarrow{d} N\left(0, D_\theta'Q^{-1}\Omega Q^{-1}D_\theta\right).$$

The confidence interval for $\mu$ in the full model based upon a finite-sample approximation to this asymptotic distribution is

$$\left[\mu(\hat\theta) - z_{1-\alpha/2}\frac{\hat\kappa}{\sqrt{n}},\; \mu(\hat\theta) + z_{1-\alpha/2}\frac{\hat\kappa}{\sqrt{n}}\right], \tag{5}$$

where $z_{1-\alpha/2}$ is again the $1 - \alpha/2$ quantile of a standard normal distribution. Note that the interval in (5) has the same length as the interval in (4). It therefore follows that if the centers of the two intervals are asymptotically the same, then the intervals are asymptotically equivalent. That is, with $\hat\mu = \mu(\hat\theta)$, equivalence holds if $\bar\mu = \hat\mu + b(\hat\delta) + o_p(n^{-1/2})$. The existence of this particular relation is our main result, and it is established in the following theorem. The proof is placed in Appendix A.

Theorem 2. The frequentist model averaging and full model confidence intervals in (4) and (5), respectively, are asymptotically equivalent.
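To make the two interval constructions concrete, the sketch below computes both from simulated data. It is an illustration under stated assumptions rather than the authors' implementation: the function names, the fixed weights (the paper's weights are data dependent), the heteroskedasticity-robust plug-in for $\Omega$ (which ignores the autocorrelation the paper allows), and the choice to evaluate the gradient at the full-model estimate are all assumptions.

```python
import numpy as np
from statistics import NormalDist

def padded_ls(y, H, S):
    """Least squares on the columns H S, returned as a zero-padded (p+q)-vector."""
    coef, *_ = np.linalg.lstsq(H @ S, y, rcond=None)
    return S @ coef

def intervals(y, X, Z, Pis, weights, mu, grad_mu, alpha=0.05):
    """Return (model averaging interval, full model interval) for the focus mu(theta)."""
    n, p = X.shape
    q = Z.shape[1]
    H = np.hstack([X, Z])
    Q = H.T @ H / n                                   # plug-in for Q
    S0 = np.vstack([np.zeros((p, q)), np.eye(q)])
    theta_full = padded_ls(y, H, np.eye(p + q))
    gamma_hat = theta_full[p:]
    resid = y - H @ theta_full
    Omega = (H * resid[:, None] ** 2).T @ H / n       # assumption: HC0-type plug-in for Omega
    D = grad_mu(theta_full)                           # assumption: gradient at the full-model fit
    Qinv = np.linalg.inv(Q)
    kappa = np.sqrt(D @ Qinv @ Omega @ Qinv @ D)
    mu_bar, bias = 0.0, 0.0
    for Pi, w in zip(Pis, weights):
        qm = Pi.shape[1]
        S = np.zeros((p + q, p + qm))                 # S_m = blockdiag(I_p, Pi_m)
        S[:p, :p] = np.eye(p)
        S[p:, p:] = Pi
        P = S @ np.linalg.inv(S.T @ Q @ S) @ S.T      # P_m
        C = (P @ Q - np.eye(p + q)) @ S0              # C_m
        mu_bar += w * mu(padded_ls(y, H, S))
        bias += w * (D @ C @ gamma_hat)               # accumulates b(delta_hat)
    half = NormalDist().inv_cdf(1 - alpha / 2) * kappa / np.sqrt(n)
    ma = (mu_bar - bias - half, mu_bar - bias + half)
    full = (mu(theta_full) - half, mu(theta_full) + half)
    return ma, full

rng = np.random.default_rng(0)
n, p, q = 400, 2, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, q)) + 0.5 * X[:, [1]]
y = X @ np.array([1.0, 0.5]) + Z @ (np.array([1.0, -1.0]) / np.sqrt(n)) + rng.normal(size=n)

Pis = [np.zeros((q, 0)), np.eye(q)[:, :1], np.eye(q)]  # nested models, full model last
weights = [0.2, 0.3, 0.5]                              # hypothetical fixed weights
mu = lambda t: np.exp(t[1])                            # a smooth nonlinear focus
grad_mu = lambda t: np.array([0.0, np.exp(t[1]), 0.0, 0.0])

ma_ci, full_ci = intervals(y, X, Z, Pis, weights, mu, grad_mu)
only_full, full_again = intervals(y, X, Z, [np.eye(q)], [1.0], mu, grad_mu)
```

Both intervals share the same half-length by construction, so any difference between them comes from the centers. Putting all weight on the full model makes the bias term vanish and reproduces the full model interval.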

Theorem 2 suggests that for large sample sizes, there are no gains in terms of interval estimation from using model averaging. A confidence interval based on least squares estimation of the full model is instead the more compelling alternative, given its simplicity.

3.2 Small sample equivalence

Theorem 2 shows that the confidence interval (5) is asymptotically equivalent to (4) for a general smooth and real-valued function $\mu(\theta)$. However, it says nothing about finite sample properties, where there may be substantial differences. In this section, we also establish small sample equivalence when $\mu(\theta) = c'\theta$, where $c$ is a vector of known scalars. That is, the focus parameter is some fixed linear function of the regression coefficients.

If the focus parameter is $\mu(\theta) = c'\theta$, the center of the full model confidence interval (5) is $c'\hat\theta$ and the center of the model averaging confidence interval (4) is

$$c'\left[\sum_{m=1}^{M} w(m \mid \hat\delta)\,\hat{\tilde\theta}_m\right] - c'\left[\sum_{m=1}^{M} w(m \mid \hat\delta)\,\hat C_m\right]\hat\gamma = c'\left[\hat\theta + \sum_{m=1}^{M-1} w(m \mid \hat\delta)\left(\hat{\tilde\theta}_m - \hat\theta - \hat C_m\hat\gamma\right)\right],$$

since $\hat D_\theta = c$. Here $\hat{\tilde\theta}_m = (\hat\beta_m', \hat\gamma_m'\Pi_m')'$ is the $(p + q) \times 1$ zero-padded estimator from submodel $m$. The equality holds because the weights sum to one, $\hat C_M = 0$ and $\hat{\tilde\theta}_M = \hat\theta$. Based on this expression for the center, we can establish the following theorem.

Theorem 3. If $\mu(\theta) = c'\theta$, where $c$ is a $(p + q) \times 1$ constant vector, the confidence interval (5) is equivalent to the confidence interval (4) in finite samples.

Theorem 3 says that the full model produces the same confidence interval as model averaging in certain restricted cases. However, Theorem 3 says nothing about small sample equivalence in other cases, e.g., $\beta_1/\beta_2$ or $\exp(\beta_1)$. We conjecture that the small sample equivalence is likely to fail for focus parameters not of the form $\mu(\theta) = c'\theta$. This is illustrated in the following example.

Example 1. Consider a model with two regressors, $y = X\beta + Z\gamma + \epsilon$, where $\gamma = \delta/\sqrt{n}$. From the above discussion, the confidence interval of $\beta$ from the full model is the same as that from model averaging, both asymptotically and in finite samples.
In particular, from Theorem 2 we know that, for any differentiable function $h$, the confidence interval of $h(\beta)$ from the full model is asymptotically the same as that from model averaging. However, the above discussion says nothing about the finite sample equivalence. In this example, consider $h(\beta) = \beta^2$. The center of the model averaging confidence interval is

$$w(1 \mid \hat\delta)\,\hat\beta_1^2 + \left[1 - w(1 \mid \hat\delta)\right]\hat\beta^2 - 2w(1 \mid \hat\delta)\,\hat\beta\,\hat Q_{xx}^{-1}\hat Q_{xz}\hat\gamma. \tag{6}$$
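As a numerical companion to this example, the sketch below contrasts the linear focus $\mu = \beta$ with the squared focus $h(\beta) = \beta^2$ for scalar $X$ and $Z$. The simulated data and the weight `w1` on the model without $Z$ are hypothetical choices for illustration; the least squares expressions and the closed-form gap are the ones stated in the example itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
z = 0.6 * x + rng.normal(size=n)                       # correlated regressors
y = 1.5 * x + (2.0 / np.sqrt(n)) * z + rng.normal(size=n)

Qxx, Qxz, Qzz = x @ x / n, x @ z / n, z @ z / n
det = Qxx * Qzz - Qxz ** 2
beta_1 = (x @ y / n) / Qxx                             # model 1: x only
beta = (Qzz * (x @ y) - Qxz * (z @ y)) / (n * det)     # full model, coefficient on x
gamma = (-Qxz * (x @ y) + Qxx * (z @ y)) / (n * det)   # full model, coefficient on z

w1 = 0.4   # hypothetical weight on model 1; the full model receives 1 - w1

# Linear focus mu = beta: the gradient is (1, 0)', so b = w1 * Qxx^{-1} Qxz * gamma.
center_ma_linear = w1 * beta_1 + (1 - w1) * beta - w1 * (Qxz / Qxx) * gamma
center_full_linear = beta

# Squared focus h(beta) = beta^2: the model averaging center versus beta^2.
center_ma_square = (w1 * beta_1 ** 2 + (1 - w1) * beta ** 2
                    - 2 * w1 * beta * (Qxz / Qxx) * gamma)
center_full_square = beta ** 2
gap = center_ma_square - center_full_square

# Closed form of the gap as given in the example's derivation.
closed_form = (w1 * Qxz ** 2 * (Qxz * (x @ y) - Qxx * (z @ y)) ** 2
               / (n ** 2 * Qxx ** 2 * det ** 2))
```

In the linear case the two centers agree up to floating point error, in line with Theorem 3. In the squared case the gap equals the closed form, which is nonnegative and vanishes only in the special cases the example lists.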

Observe that

$$\hat\beta_1 = \frac{\hat Q_{xx}^{-1}X'y}{n}, \qquad \hat\beta = \frac{\hat Q_{zz}X'y/n - \hat Q_{xz}Z'y/n}{\hat Q_{xx}\hat Q_{zz} - \hat Q_{xz}^2}, \qquad \hat\gamma = \frac{-\hat Q_{xz}X'y/n + \hat Q_{xx}Z'y/n}{\hat Q_{xx}\hat Q_{zz} - \hat Q_{xz}^2}.$$

Then, the difference between the two centers is

$$w(1 \mid \hat\delta)\left[\hat\beta_1^2 - \hat\beta^2 - 2\hat\beta\,\hat Q_{xx}^{-1}\hat Q_{xz}\hat\gamma\right],$$

which is zero if and only if $w(1 \mid \hat\delta) = 0$ or $\hat\beta_1^2 - \hat\beta^2 - 2\hat\beta\,\hat Q_{xx}^{-1}\hat Q_{xz}\hat\gamma = 0$. The former means that the full model receives the total weight. Plugging in the expressions for $\hat\beta_1$, $\hat\beta$ and $\hat\gamma$, the latter satisfies

$$\hat\beta_1^2 - \hat\beta^2 - 2\hat\beta\,\hat Q_{xx}^{-1}\hat Q_{xz}\hat\gamma = \frac{1}{n^2}\,\frac{\hat Q_{xz}^2\left(\hat Q_{xz}X'y - \hat Q_{xx}Z'y\right)^2}{\hat Q_{xx}^2\left(\hat Q_{xx}\hat Q_{zz} - \hat Q_{xz}^2\right)^2},$$

which is zero only if $\hat Q_{xz}X'y = \hat Q_{xx}Z'y$ or $\hat Q_{xz} = 0$. If $X$ and $Z$ are correlated, the two intervals are likely to be different.

4 Conclusion

In this work, we have studied the frequentist model averaging confidence interval for linear models proposed by Liu (2015). Using techniques similar to those of Wang and Zhou (2013), our key result is that the model averaging confidence interval is asymptotically equivalent to the interval obtained by use of least squares estimation in the full model. Furthermore, we also provide the stronger result that the intervals are exactly equivalent in finite samples if $\mu$, the function defining the focus parameter, is linear.

While the results are disadvantageous to model averaging, they should be interpreted with care. Numerous studies have illustrated impressive performance of model averaging in terms of prediction. This strand of the literature is separate from what we consider and thus remains unaffected by the conclusion herein. In fact, the good predictive performance may instead suggest that with other methods, we may be able to increase the efficiency of our confidence intervals by resorting to model averaging. This, however, we leave for further research.

A Proofs

Proof of Theorem 2. A proof technique similar to that used by Wang and Zhou (2013) is also employed here, but applied to the different problem setting that we have.
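As already noted in the text, the intervals have the same length and thus what needs to be proved is that the centers are asymptotically equal. To this end, let $R \sim N(0, \Omega)$. Lemma 1 in Liu (2015) indicates that

$$\sqrt{n}(\hat\theta_m - \theta_m) = Q_m^{-1}S_m'QS_0(I_q - \Pi_m\Pi_m')\delta + Q_m^{-1}S_m'R + o_p(1), \tag{7}$$

where $\theta_m = (\beta', \delta_m'/\sqrt{n})'$. For simplicity and without loss of generality, assume that the set of models is ordered in such a way that $M$ denotes the full model. If $m = M$, then $\Pi_M = I_q$, $S_M = I_{p+q}$ and $Q_M^{-1} = Q^{-1}$, so what remains is

$$\sqrt{n}(\hat\theta - \theta) = Q^{-1}R + o_p(1),$$

where $\theta = (\beta', \delta'/\sqrt{n})'$. Thus, $R = \sqrt{n}\,Q(\hat\theta - \theta) + o_p(1)$ and (7) becomes

$$\sqrt{n}(\hat\theta_m - \theta_m) = Q_m^{-1}S_m'QS_0(I_q - \Pi_m\Pi_m')\delta + Q_m^{-1}S_m'Q\sqrt{n}(\hat\theta - \theta) + o_p(1). \tag{8}$$

By rules for inverses of block matrices (see, e.g., the corresponding theorem in Harville, 1997),

$$Q_m^{-1} = \begin{pmatrix} Q_{xx}^{-1} + Q_{xx}^{-1}Q_{xz}\Pi_mK_m\Pi_m'Q_{zx}Q_{xx}^{-1} & -Q_{xx}^{-1}Q_{xz}\Pi_mK_m \\ -K_m\Pi_m'Q_{zx}Q_{xx}^{-1} & K_m \end{pmatrix}, \tag{9}$$

where

$$K_m = \left[\Pi_m'\left(Q_{zz} - Q_{zx}Q_{xx}^{-1}Q_{xz}\right)\Pi_m\right]^{-1} = \left(\Pi_m'K^{-1}\Pi_m\right)^{-1}$$

with $K = \left(Q_{zz} - Q_{zx}Q_{xx}^{-1}Q_{xz}\right)^{-1}$. Plugging (9) into $Q_m^{-1}S_m'Q$ yields

$$Q_m^{-1}S_m'Q = \begin{pmatrix} I_p & A_m \\ 0 & B_m \end{pmatrix}, \tag{10}$$

where, in this appendix, $A_m = Q_{xx}^{-1}Q_{xz}\left(I_q - \Pi_mK_m\Pi_m'K^{-1}\right)$ and $B_m = K_m\Pi_m'K^{-1}$. Thus, (8) can be expressed as

$$\sqrt{n}(\hat\theta_m - \theta_m) = \begin{pmatrix} A_m \\ B_m \end{pmatrix}(I_q - \Pi_m\Pi_m')\delta + \begin{pmatrix} I_p & A_m \\ 0 & B_m \end{pmatrix}\sqrt{n}(\hat\theta - \theta) + o_p(1).$$

Recall that $\hat\theta_m = (\hat\beta_m', \hat\delta_m'/\sqrt{n})'$ and $\theta_m = (\beta', \delta_m'/\sqrt{n})'$, where $\delta_m = \Pi_m'\delta$. Since $A_m\Pi_m = 0$, it follows that

$$\sqrt{n}(\hat\beta_m - \beta) = A_m\delta + \sqrt{n}(\hat\beta - \beta) + A_m\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right) + o_p(1), \tag{11}$$

$$\sqrt{n}\left(\hat\gamma_m - \frac{\delta_m}{\sqrt{n}}\right) = B_m(I_q - \Pi_m\Pi_m')\delta + B_m\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right) + o_p(1), \tag{12}$$

where we have used that $(I_q - \Pi_mB_m)(I_q - \Pi_m\Pi_m') = I_q - \Pi_mB_m$.

The Taylor expansion of $\sqrt{n}\left[\mu(\hat\beta_m, \Pi_m\hat\gamma_m) - \mu(\beta, \delta/\sqrt{n})\right]$ around $(\beta, \delta/\sqrt{n})$ is

$$\sqrt{n}\left[\mu(\hat\beta_m, \Pi_m\hat\gamma_m) - \mu\left(\beta, \frac{\delta}{\sqrt{n}}\right)\right] = \frac{\partial\mu}{\partial\beta'}\sqrt{n}(\hat\beta_m - \beta) + \frac{\partial\mu}{\partial\gamma'}\sqrt{n}\left(\Pi_m\hat\gamma_m - \frac{\delta}{\sqrt{n}}\right) + o_p(1), \tag{13}$$

where the partial derivatives are evaluated at $(\beta, 0_{q \times 1})$ because of the continuous mapping theorem. Likewise, for the full model,

$$\sqrt{n}\left[\mu(\hat\beta, \hat\gamma) - \mu\left(\beta, \frac{\delta}{\sqrt{n}}\right)\right] = \frac{\partial\mu}{\partial\beta'}\sqrt{n}(\hat\beta - \beta) + \frac{\partial\mu}{\partial\gamma'}\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right) + o_p(1). \tag{14}$$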
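The block-matrix identities used in this step can be checked numerically. The sketch below is only an illustration: it draws an arbitrary symmetric positive definite $Q$ and one selection matrix (both hypothetical) and confirms the block form of $Q_m^{-1}S_m'Q$ together with the two auxiliary identities involving $A_m$, $B_m$ and $\Pi_m$.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 2, 3
G = rng.normal(size=(p + q, p + q))
Q = G @ G.T + (p + q) * np.eye(p + q)        # arbitrary symmetric positive definite Q
Qxx, Qxz = Q[:p, :p], Q[:p, p:]
Qzx, Qzz = Q[p:, :p], Q[p:, p:]

Pi = np.eye(q)[:, [0, 2]]                    # q x q_m selection matrix (keeps z_1 and z_3)
qm = Pi.shape[1]
I_q = np.eye(q)
S = np.zeros((p + q, p + qm))                # S_m = blockdiag(I_p, Pi_m)
S[:p, :p] = np.eye(p)
S[p:, p:] = Pi

K_inv = Qzz - Qzx @ np.linalg.solve(Qxx, Qxz)            # K^{-1}
K_m = np.linalg.inv(Pi.T @ K_inv @ Pi)                   # K_m
A_m = np.linalg.solve(Qxx, Qxz) @ (I_q - Pi @ K_m @ Pi.T @ K_inv)
B_m = K_m @ Pi.T @ K_inv

lhs = np.linalg.inv(S.T @ Q @ S) @ S.T @ Q               # Q_m^{-1} S_m' Q
rhs = np.block([[np.eye(p), A_m],
                [np.zeros((qm, p)), B_m]])               # claimed block form

drop_projection = A_m @ Pi                               # should be zero: A_m Pi_m = 0
left_inverse = B_m @ Pi                                  # should be I_{q_m}: B_m Pi_m = I
```

The identity $A_m\Pi_m = 0$ is what allows the projection $I_q - \Pi_m\Pi_m'$ to be dropped from the first block, and $B_m\Pi_m = I_{q_m}$ underlies the stated relation $(I_q - \Pi_mB_m)(I_q - \Pi_m\Pi_m') = I_q - \Pi_mB_m$.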

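We then get

$$\begin{aligned}
\sqrt{n}\,\bar\mu &= \sum_{m=1}^{M} w(m \mid \hat\delta)\,\sqrt{n}\,\mu(\hat\beta_m, \Pi_m\hat\gamma_m) \\
&= \sum_{m=1}^{M} w(m \mid \hat\delta)\left[\sqrt{n}\,\mu\left(\beta, \frac{\delta}{\sqrt{n}}\right) + \frac{\partial\mu}{\partial\beta'}\sqrt{n}(\hat\beta_m - \beta) + \frac{\partial\mu}{\partial\gamma'}\sqrt{n}\left(\Pi_m\hat\gamma_m - \frac{\delta}{\sqrt{n}}\right)\right] + o_p(1) \quad \text{by (13)} \\
&= \sqrt{n}\,\mu\left(\beta, \frac{\delta}{\sqrt{n}}\right) + \frac{\partial\mu}{\partial\beta'}\sqrt{n}(\hat\beta - \beta) + \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\beta'}\left[A_m\delta + A_m\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right)\right] \\
&\quad + \sum_{m=1}^{M} w(m \mid \hat\delta)\left[\frac{\partial\mu}{\partial\gamma'}\sqrt{n}\left(\Pi_m\hat\gamma_m - \frac{\delta}{\sqrt{n}}\right)\right] + o_p(1) \quad \text{by (11)}.
\end{aligned}$$

Rewriting

$$\Pi_m\hat\gamma_m - \frac{\delta}{\sqrt{n}} = \Pi_m\left(\hat\gamma_m - \frac{\delta_m}{\sqrt{n}}\right) - (I_q - \Pi_m\Pi_m')\frac{\delta}{\sqrt{n}}$$

and using (12),

$$\begin{aligned}
\sqrt{n}\,\bar\mu &= \sqrt{n}\,\mu\left(\beta, \frac{\delta}{\sqrt{n}}\right) + \frac{\partial\mu}{\partial\beta'}\sqrt{n}(\hat\beta - \beta) + \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\beta'}\left[A_m\delta + A_m\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right)\right] \\
&\quad + \sum_{m=1}^{M} w(m \mid \hat\delta)\left[\frac{\partial\mu}{\partial\gamma'}\Pi_mB_m\left\{(I_q - \Pi_m\Pi_m')\delta + \sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right)\right\} - \frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta\right] + o_p(1).
\end{aligned}$$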

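Applying (14), and using that the weights sum to one, yields

$$\begin{aligned}
\sqrt{n}\,\bar\mu &= \sqrt{n}\,\mu(\hat\beta, \hat\gamma) - \frac{\partial\mu}{\partial\gamma'}\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right) + \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\beta'}\left[A_m\delta + A_m\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right)\right] \\
&\quad + \sum_{m=1}^{M} w(m \mid \hat\delta)\left[\frac{\partial\mu}{\partial\gamma'}\Pi_mB_m\left\{(I_q - \Pi_m\Pi_m')\delta + \sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right)\right\} - \frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta\right] + o_p(1) \\
&= \sqrt{n}\,\mu(\hat\beta, \hat\gamma) + \sum_{m=1}^{M} w(m \mid \hat\delta)\left[\frac{\partial\mu}{\partial\beta'}A_m + \frac{\partial\mu}{\partial\gamma'}(\Pi_mB_m - I_q)\right]\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right) \\
&\quad + \sum_{m=1}^{M} w(m \mid \hat\delta)\left[\frac{\partial\mu}{\partial\beta'}A_m + \frac{\partial\mu}{\partial\gamma'}\Pi_mB_m(I_q - \Pi_m\Pi_m')\right]\delta - \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta + o_p(1) \\
&= \sqrt{n}\,\mu(\hat\beta, \hat\gamma) + \sum_{m=1}^{M} w(m \mid \hat\delta)\,D_\theta'\begin{pmatrix} A_m \\ \Pi_mB_m - I_q \end{pmatrix}\sqrt{n}\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right) \\
&\quad + \sum_{m=1}^{M} w(m \mid \hat\delta)\,D_\theta'\begin{pmatrix} A_m \\ \Pi_mB_m(I_q - \Pi_m\Pi_m') \end{pmatrix}\delta - \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta + o_p(1).
\end{aligned} \tag{15}$$

Further, using

$$C_m = \left(S_mQ_m^{-1}S_m'Q - I\right)S_0 = \begin{pmatrix} A_m \\ \Pi_mB_m - I_q \end{pmatrix} \tag{16}$$

in (15) yields

$$\begin{aligned}
\sqrt{n}\,\bar\mu &= \sqrt{n}\,\mu(\hat\beta, \hat\gamma) + \sqrt{n}\left[D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,C_m\right]\left(\hat\gamma - \frac{\delta}{\sqrt{n}}\right) + \left[\sum_{m=1}^{M} w(m \mid \hat\delta)\,D_\theta'\begin{pmatrix} A_m \\ \Pi_mB_m(I_q - \Pi_m\Pi_m') \end{pmatrix}\right]\delta \\
&\quad - \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta + o_p(1).
\end{aligned} \tag{17}$$

The center of the confidence interval proposed in Liu (2015) is $\bar\mu - b(\hat\delta)$, where $b(\hat\delta) = \hat D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,\hat C_m\hat\gamma$ with $\hat\delta = \sqrt{n}\,\hat\gamma$. Thus, using (17) together with (16), we have

$$\begin{aligned}
\sqrt{n}\,\bar\mu &= \sqrt{n}\,\mu(\hat\beta, \hat\gamma) + \sqrt{n}\left[D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,C_m\right]\hat\gamma - \left[D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,C_m\right]\delta \\
&\quad + \sum_{m=1}^{M} w(m \mid \hat\delta)\,D_\theta'\begin{pmatrix} A_m \\ \Pi_mB_m(I_q - \Pi_m\Pi_m') \end{pmatrix}\delta - \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta + o_p(1) \\
&= \sqrt{n}\,\mu(\hat\beta, \hat\gamma) + \sqrt{n}\left[D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,C_m\right]\hat\gamma + \sum_{m=1}^{M} w(m \mid \hat\delta)\,D_\theta'\begin{pmatrix} 0 \\ I_q - \Pi_mB_m\Pi_m\Pi_m' \end{pmatrix}\delta \\
&\quad - \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta + o_p(1) \\
&= \sqrt{n}\,\mu(\hat\beta, \hat\gamma) + \sqrt{n}\left[D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,C_m\right]\hat\gamma + \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta \\
&\quad - \sum_{m=1}^{M} w(m \mid \hat\delta)\,\frac{\partial\mu}{\partial\gamma'}(I_q - \Pi_m\Pi_m')\delta + o_p(1) \quad \text{by (10)} \\
&= \sqrt{n}\,\mu(\hat\beta, \hat\gamma) + \sqrt{n}\left[D_\theta'\sum_{m=1}^{M} w(m \mid \hat\delta)\,C_m\right]\hat\gamma + o_p(1),
\end{aligned}$$

where the step labeled (10) uses $B_m\Pi_m = I_{q_m}$, which follows from the definition of $B_m$. Provided the plug-in quantities $\hat D_\theta$ and $\hat C_m$ are consistent for $D_\theta$ and $C_m$, and since $\hat\gamma = O_p(n^{-1/2})$, the population quantities may be replaced by their estimates without changing the order of the remainder. Hence, $\bar\mu = \hat\mu + b(\hat\delta) + o_p(n^{-1/2})$ and the intervals are asymptotically equivalent.

Proof of Theorem 3. Also for this proof, it suffices to show that the intervals have the same centers, as the lengths are equal. Applying equation (9) in the appendix, with population quantities replaced by their sample counterparts, leads to

$$\hat\theta_m = \frac{1}{n}\begin{pmatrix} \left(\hat Q_{xx}^{-1} + \hat Q_{xx}^{-1}\hat Q_{xz}\Pi_m\hat K_m\Pi_m'\hat Q_{zx}\hat Q_{xx}^{-1}\right)X'y - \hat Q_{xx}^{-1}\hat Q_{xz}\Pi_m\hat K_m\Pi_m'Z'y \\ -\hat K_m\Pi_m'\hat Q_{zx}\hat Q_{xx}^{-1}X'y + \hat K_m\Pi_m'Z'y \end{pmatrix},$$

where the first block corresponds to $\hat\beta_m$ and the second block corresponds to $\hat\gamma_m$. Note that

$$\hat\theta + \hat C_m\hat\gamma = \begin{pmatrix} \hat\beta + \hat A_m\hat\gamma \\ \Pi_m\hat B_m\hat\gamma \end{pmatrix},$$

where $\hat A_m = \hat Q_{xx}^{-1}\hat Q_{xz}\left(I - \Pi_m\hat K_m\Pi_m'\hat K^{-1}\right)$ and $\hat B_m = \hat K_m\Pi_m'\hat K^{-1}$. Therefore, with $\hat{\tilde\theta}_m = (\hat\beta_m', \hat\gamma_m'\Pi_m')'$ denoting the zero-padded $(p + q) \times 1$ estimator, the difference of the centers satisfies

$$c'\left[\sum_{m=1}^{M-1} w(m \mid \hat\delta)\left(\hat{\tilde\theta}_m - \hat\theta - \hat C_m\hat\gamma\right)\right] = c'\left[\sum_{m=1}^{M-1} w(m \mid \hat\delta)\begin{pmatrix} \hat\beta_m - \hat\beta - \hat A_m\hat\gamma \\ \Pi_m\hat\gamma_m - \Pi_m\hat B_m\hat\gamma \end{pmatrix}\right]. \tag{18}$$

First, for the upper block,

$$\begin{aligned}
\hat\beta_m - \hat\beta - \hat A_m\hat\gamma = \frac{1}{n}\Big[\, & \hat Q_{xx}^{-1}\hat Q_{xz}\Pi_m\hat K_m\Pi_m'\hat Q_{zx}\hat Q_{xx}^{-1}X'y - \hat Q_{xx}^{-1}\hat Q_{xz}\Pi_m\hat K_m\Pi_m'Z'y \\
& - \hat Q_{xx}^{-1}\hat Q_{xz}\hat K\hat Q_{zx}\hat Q_{xx}^{-1}X'y + \hat Q_{xx}^{-1}\hat Q_{xz}\hat KZ'y \\
& - \hat Q_{xx}^{-1}\hat Q_{xz}\left(I - \Pi_m\hat K_m\Pi_m'\hat K^{-1}\right)\left(-\hat K\hat Q_{zx}\hat Q_{xx}^{-1}X'y + \hat KZ'y\right)\Big] = 0_{p \times 1}.
\end{aligned}$$

Second, for the lower block,

$$\begin{aligned}
\Pi_m\hat\gamma_m - \Pi_m\hat B_m\hat\gamma &= \Pi_m\left(\hat\gamma_m - \hat B_m\hat\gamma\right) \\
&= \frac{1}{n}\Pi_m\left[-\hat K_m\Pi_m'\hat Q_{zx}\hat Q_{xx}^{-1}X'y + \hat K_m\Pi_m'Z'y - \hat B_m\left(-\hat K\hat Q_{zx}\hat Q_{xx}^{-1}X'y + \hat KZ'y\right)\right] \\
&= \frac{1}{n}\Pi_m\left(\hat B_m\hat K - \hat K_m\Pi_m'\right)\left(\hat Q_{zx}\hat Q_{xx}^{-1}X' - Z'\right)y = 0_{q \times 1},
\end{aligned}$$

where the last equality holds because $\hat B_m\hat K = \hat K_m\Pi_m'\hat K^{-1}\hat K = \hat K_m\Pi_m'$. Thus, the difference between the centers in (18) may be further seen to be

$$c'\left[\sum_{m=1}^{M-1} w(m \mid \hat\delta)\left(\hat{\tilde\theta}_m - \hat\theta - \hat C_m\hat\gamma\right)\right] = c'\left[\sum_{m=1}^{M-1} w(m \mid \hat\delta)\begin{pmatrix} 0_{p \times 1} \\ 0_{q \times 1} \end{pmatrix}\right] = 0.$$

Consequently, the confidence intervals for a linear function $\mu(\theta) = c'\theta$ of the parameters based on either frequentist model averaging or on least squares in the full model are equivalent, even in finite samples.

References

Akaike H (1974) A New Look at the Statistical Model Identification. IEEE Automat Contr 19(6):
Ando T, Li KC (2014) A Model-Averaging Approach for High-Dimensional Regression. J Am Stat Assoc :
Berk R, Brown L, Buja A, Zhang K, Zhao L (2013) Valid post-selection inference. Ann Stat 41(2):
Cheng TCF, Ing CK, Yu SH (2015) Toward optimal model averaging in regression models with time series errors. J Econometrics 189(2):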

Hansen BE (2007) Least squares model averaging. Econometrica 75(4):
Hansen BE, Racine JS (2012) Jackknife model averaging. J Econometrics 167(1):38-46
Harville DA (1997) Matrix algebra from a statistician's perspective. Springer, New York
Hjort NL, Claeskens G (2003) Frequentist Model Average Estimators. J Am Stat Assoc 98(464):
Kabaila P (1995) The Effect of Model Selection on Confidence Regions and Prediction Regions. Economet Theor 11(3):
Kabaila P, Leeb H (2006) On the Large-Sample Minimal Coverage Probability of Confidence Intervals After Model Selection. J Am Stat Assoc :
Leeb H, Pötscher B (2005) Model Selection and Inference: Facts and Fiction. Economet Theor 21(1):21-59
Liu CA (2015) Distribution theory of the least squares averaging estimator. J Econometrics 186(1):
Liu Q, Okui R (2013) Heteroscedasticity-robust C p model averaging. Economet J 16(3):
Pötscher B (1991) Effects of Model Selection on Inference. Economet Theor 7(2):
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):
Wan ATK, Zhang X, Zou G (2010) Least squares model averaging by Mallows criterion. J Econometrics 156(2):
Wang H, Zhou SZF (2013) Interval Estimation by Frequentist Model Averaging. Commun Stat Theory 42(23):
