Confidence Sets Based on Shrinkage Estimators

1 Confidence Sets Based on Shrinkage Estimators Mikkel Plagborg-Møller Harvard University June 2017

2 Shrinkage estimators in applied work
$$\hat\beta_{\text{shrink}} = \operatorname{argmin}_{\beta} \{ \hat Q(\beta) + \lambda c(\beta) \}$$
Shrinkage/penalized estimators are popular in economics:
- Random effects.
- High-dimensional prediction.
- Smoothing jagged functions. Shiller (1973); Hodrick & Prescott (1981); Breitung & Roling (2015); Barnichon & Brownlees (2017)
- Estimating fixed effects. Chetty et al. (2014); Chamberlain (2016)
- Shrinking toward theory. Hansen (2016); Fessler & Kasy (2017)
The shrinkage parameter $\lambda$ is often data-dependent.

3 Challenges of shrinkage inference
How to calculate SEs for shrinkage estimators?
- With a data-dependent shrinkage parameter $\lambda$, the asymptotic distribution is often discontinuous in the true parameters. (See "Non-standard asymptotics: example".)
- For finite-dimensional parameters, it is impossible to estimate the CDF of $\hat\beta_{\text{shrink}}$ uniformly consistently. Leeb & Pötscher (2005)
- The standard bootstrap typically doesn't work. Beran (2010)
- Applied researchers often just undersmooth (i.e., use the SE for the usual point estimator). Not always valid.

4 This project
- Class of generalized ridge regression estimators: Vinod (1978)
$$\hat\beta_{M,W}(\lambda) = \operatorname{argmin}_{\beta \in \mathbb{R}^n} \{ \|\beta - \hat\beta\|_W^2 + \lambda \|M\beta\|^2 \}.$$
- Shrinkage parameter $\lambda$ selected by unbiased risk estimate.
- Gaussian location model: $\hat\beta \sim N_n(\beta, \Sigma)$, known $\Sigma$.
- Conditional QLR test for linear hypothesis on $\beta$. Exact size.
- Conditional QLR confidence set by test inversion.
- Simulations show favorable average length/area of CSs.
- Uniform asymptotic validity even when data are non-Gaussian.

5 Relationship to literature
- Large stats literature uses analytically convenient transformations and priors. Casella & Hwang (1982, 1984, 1987, 2012); Tseng & Brown (1997)
- My starting point: How to calculate SEs for a given ridge estimator?
- Arbitrary correlation structure, arbitrary shrinkage hypothesis.
- CSs tied to (and always contain) a meaningful point estimator.
- Tests/CSs have Empirical Bayes (random effects) interpretation.
- But I do not start from decision-theoretic first principles.
- Impossible to uniformly dominate expected volume of Wald ellipsoid for 1-D or 2-D problems. Stein (1962); Brown (1966); Joshi (1969)

6 Other related literature
- Shrinkage: Stein (1956); James & Stein (1961); Bock (1975); Oman (1982); Casella & Hwang (1987)
- Unbiased risk estimate: Mallows (1973); Stein (1973, 1981); Berger (1985); Claeskens & Hjort (2003); Hansen (2010)
- Asymptotics for shrinkage: Leeb & Pötscher (2005); Hansen (2016)
- Uniformity: Andrews, Cheng & Guggenberger (2011); McCloskey (2015)
- Post-regularization inference: Chernozhukov, Hansen & Spindler (2015)
- Conditional inference: Andrews & Mikusheva (2016)
- Adaptive confidence sets: Pratt (1961); Brown, Casella & Hwang (1995); Wasserman (2006); Armstrong & Kolesár (2016)

7 Outline
1. Shrinkage estimators and unbiased risk estimate
2. Testing
3. Confidence sets
4. Simulation study
5. Uniform asymptotic validity
6. Applications
7. Summary

8 Gaussian location model
For now, consider the finite-sample Gaussian location model
$$\hat\beta \sim N_n(\beta, \Sigma).$$
- $\beta \in \mathbb{R}^n$ unknown.
- $\Sigma$ symmetric p.d. and known.
Will later consider an asymptotic framework for which the Gaussian model is the right limit experiment. Plug in a consistent estimator $\hat\Sigma$.

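9 General shrinkage estimator class
$$\hat\beta_{M,W}(\lambda) = \operatorname{argmin}_{\beta \in \mathbb{R}^n} \{ \|\beta - \hat\beta\|_W^2 + \lambda \|M\beta\|^2 \} = \Theta_{M,W}(\lambda)\hat\beta,$$
$$\Theta_{M,W}(\lambda) = (I_n + \lambda W^{-1} M'M)^{-1}.$$
$M \in \mathbb{R}^{m \times n}$, $W \in \mathbb{R}^{n \times n}$ symmetric p.d.
Example: a matrix $M \in \mathbb{R}^{(n-2) \times n}$ that penalizes jaggedness. Whittaker (1923); Shiller (1972); Hodrick & Prescott (1981); Wahba (1990)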
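A minimal numerical sketch of this estimator class, assuming NumPy is available. The second-difference penalty matrix is an assumption chosen to illustrate a jaggedness penalty of dimension $(n-2) \times n$; the function names and the test vector are hypothetical and not taken from the talk.

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x n matrix; row i maps beta to beta[i] - 2*beta[i+1] + beta[i+2]."""
    M = np.zeros((n - 2, n))
    for i in range(n - 2):
        M[i, i:i + 3] = [1.0, -2.0, 1.0]
    return M

def theta(lam, M, W):
    """Shrinkage matrix (I_n + lam * W^{-1} M'M)^{-1}."""
    n = W.shape[0]
    return np.linalg.inv(np.eye(n) + lam * np.linalg.solve(W, M.T @ M))

def generalized_ridge(beta_hat, lam, M, W):
    """Minimizer of ||b - beta_hat||_W^2 + lam * ||M b||^2 over b."""
    return theta(lam, M, W) @ beta_hat

n = 8
rng = np.random.default_rng(0)
beta_hat = np.linspace(0.0, 1.0, n) + 0.1 * rng.standard_normal(n)  # made-up jagged estimate
M = second_difference_matrix(n)   # assumed penalty matrix
W = np.eye(n)
smoothed = generalized_ridge(beta_hat, 5.0, M, W)
```

With this penalty, any vector that is exactly linear in the index is left unchanged by the shrinkage, because its second differences are zero.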

10 [Figure: response (basis points) against horizon (months).] $y_t$: GZ excess bond premium. $x_t$: high-freq. FFF shock. Controls: 2 lags of $y_t$, $x_t$, log(cpi), log(ip), 1yrTreas. Sample:

11 Projection shrinkage
Shrinkage is particularly tractable when $W = I_n$ and $M = P \in \mathbb{R}^{n \times n}$ is an orthogonal projection matrix: $P = P' = P^2$.
Projection shrinkage towards the linear subspace $\operatorname{span}(I_n - P)$. Stein (1956); Oman (1982a,b); Bock (1985); Casella & Hwang (1987)
$$\hat\beta_P(\lambda) = \operatorname{argmin}_{\beta \in \mathbb{R}^n} \{ \|\beta - \hat\beta\|^2 + \lambda \|P\beta\|^2 \} = \frac{1}{1+\lambda} P\hat\beta + (I_n - P)\hat\beta.$$
Example: $I_n - P$ = projection matrix from regression onto basis functions.
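The closed form can be checked numerically against the general ridge formula $(I_n + \lambda P)^{-1}\hat\beta$ (the case $W = I_n$, $M = P$, so $M'M = P$). This is a sketch assuming NumPy; the quadratic basis, the test vector, and the helper names are hypothetical.

```python
import numpy as np

def annihilator(X):
    """P = I - X (X'X)^{-1} X', the orthogonal projection off the column span of X."""
    n = X.shape[0]
    return np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

def projection_shrink(beta_hat, lam, P):
    """Closed form: P beta_hat / (1 + lam) + (I - P) beta_hat."""
    Pb = P @ beta_hat
    return Pb / (1.0 + lam) + (beta_hat - Pb)

n = 10
i = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), i, i ** 2])   # hypothetical quadratic basis
P = annihilator(X)
beta_hat = np.sin(i)                           # made-up estimate
lam = 3.0
closed_form = projection_shrink(beta_hat, lam, P)
direct = np.linalg.solve(np.eye(n) + lam * P, beta_hat)  # general formula with W = I, M = P
```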

12 [Figure: response (basis points) against horizon (months).] $y_t$: GZ excess bond premium. $x_t$: high-freq. FFF shock. Controls: 2 lags of $y_t$, $x_t$, log(cpi), log(ip), 1yrTreas. Sample:

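14 Unbiased risk estimate
MSE risk criterion (see "URE captures bias/variance tradeoff"):
$$R_{M,W}(\lambda; \beta) = E_\beta\left( \|\hat\beta_{M,W}(\lambda) - \beta\|_W^2 \right).$$
Unbiased risk estimate (URE): Mallows (1973); Stein (1973, 1981); Berger (1985); Hansen (2010)
$$\hat R_{M,W}(\lambda) = \|\hat\beta_{M,W}(\lambda) - \hat\beta\|_W^2 + 2 \operatorname{tr}\{ W \Theta_{M,W}(\lambda) \Sigma \}.$$
Define $\hat\lambda_{M,W} = \operatorname{argmin}_{\lambda \ge 0} \hat R_{M,W}(\lambda)$. May equal $\infty$. $\lim_{\lambda \to \infty} \hat\beta_{M,W}(\lambda)$ is well defined if $M$ has full rank or is a projection.
Sufficient condition for a unique minimum: all nonzero eigenvalues of $M W^{-1} M'$ are equal (e.g., projection shrinkage). Assume an a.s. unique minimum for the rest of the talk.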
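As an illustration of selecting $\lambda$ by the unbiased risk estimate, the sketch below evaluates the URE on a grid in $x = \lambda/(1+\lambda)$ and picks the minimizer. It assumes NumPy; the grid search is a stand-in for whatever optimizer is actually used, and the example data (shrinkage toward the average, $\Sigma = 0.1 I_n$) are made up.

```python
import numpy as np

def ure(lam, beta_hat, M, W, Sigma):
    """URE: ||beta(lam) - beta_hat||_W^2 + 2 tr(W Theta(lam) Sigma)."""
    n = len(beta_hat)
    Theta = np.linalg.inv(np.eye(n) + lam * np.linalg.solve(W, M.T @ M))
    d = Theta @ beta_hat - beta_hat
    return d @ W @ d + 2.0 * np.trace(W @ Theta @ Sigma)

def minimize_ure(beta_hat, M, W, Sigma, grid_size=1000):
    """Grid search over x = lam / (1 + lam) in [0, 1); returns the best lam on the grid."""
    xs = np.linspace(0.0, 0.999, grid_size)
    lams = xs / (1.0 - xs)
    values = [ure(l, beta_hat, M, W, Sigma) for l in lams]
    return lams[int(np.argmin(values))]

n = 6
P = np.eye(n) - np.ones((n, n)) / n        # projection that removes the average
beta_hat = np.arange(n, dtype=float)       # made-up estimate
Sigma = 0.1 * np.eye(n)
lam_hat = minimize_ure(beta_hat, P, np.eye(n), Sigma)
x_hat = lam_hat / (1.0 + lam_hat)
```

For projection shrinkage the URE is quadratic in $x$, so the grid minimizer should sit next to $\operatorname{tr}(P\Sigma)/\|P\hat\beta\|^2$ whenever that ratio is below one.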

15 [Figure: estimated MSE (normalized), $\hat R_P\left(\frac{x}{1-x}\right)$ for $x \in [0, 1)$, plotted against $\lambda/(1+\lambda)$.] $y_t$: GZ excess bond premium. $x_t$: high-freq. FFF shock. Controls: 2 lags of $y_t$, $x_t$, log(cpi), log(ip), 1yrTreas. Sample:

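17 Optimal projection shrinkage
For projection shrinkage, the URE can be minimized in closed form:
$$\hat\beta_P(\hat\lambda_P) = \left( 1 - \frac{\operatorname{tr}(\Sigma_P)}{\|P\hat\beta\|^2} \right)_+ P\hat\beta + (I_n - P)\hat\beta, \qquad \Sigma_P = P \Sigma P.$$
James-Stein shrinkage towards a linear subspace. Stein (1956); James & Stein (1961); Oman (1982a,b); Bock (1985)
Proposition (Hansen, 2016): If $\operatorname{tr}(\Sigma_P) > 4\rho(\Sigma_P)$, then
$$E_\beta\left( \|\hat\beta_P(\hat\lambda_P) - \beta\|^2 \right) < E_\beta\left( \|\hat\beta - \beta\|^2 \right) \quad \text{for all } \beta.$$
Necessary condition: $\operatorname{rk}(P) > 4$. E.g., if $I_n - P$ is the projection onto $p$ basis functions, then $\operatorname{rk}(P) = n - p$, so we need $n > p + 4$.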
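A short sketch of the positive-part James-Stein formula for projection shrinkage, assuming NumPy. The function name and the two example vectors are hypothetical; the projection removes the average, so shrinkage is toward the common mean.

```python
import numpy as np

def optimal_projection_shrink(beta_hat, P, Sigma):
    """Positive-part James-Stein shrinkage of P beta_hat; (I - P) beta_hat is left untouched."""
    Pb = P @ beta_hat
    factor = max(0.0, 1.0 - np.trace(P @ Sigma @ P) / (Pb @ Pb))
    return factor * Pb + (beta_hat - Pb)

n = 6
P = np.eye(n) - np.ones((n, n)) / n
Sigma = np.eye(n)
spread_out = np.array([0.0, 10.0, -10.0, 5.0, -5.0, 0.0])   # far from the target subspace
bunched = np.array([1.0, 1.1, 0.9, 1.0, 1.05, 0.95])        # close to the target subspace
far = optimal_projection_shrink(spread_out, P, Sigma)
near = optimal_projection_shrink(bunched, P, Sigma)
```

When the estimate is far from the target subspace the shrinkage factor is close to one; when it is close, the positive part binds and the estimate collapses onto the subspace (here, the common mean).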

19 Hypothesis testing in shrinkage applications
$$H_0: R\beta = b, \qquad H_1: R\beta \ne b,$$
where $R \in \mathbb{R}^{r \times n}$ has full row rank.
- No UMP test exists.
- The Wald test is UMP unbiased ($r = 1$), UMP invariant, and admissible.
- If we're already using a shrinkage point estimator, we might want a hypothesis test tied to this estimator as well. Obtain CS by inversion.
- My proposed test is biased and non-invariant, so it may achieve higher power than the usual Wald test for some DGPs.

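20 Empirical Bayes quasi-likelihood ratio test
Base hypothesis test on (negative) quasi-log-likelihood
$$\hat L_{M,W}(\beta) = \|\beta - \hat\beta\|_W^2 + \hat\lambda_{M,W} \|M\beta\|^2.$$
Empirical Bayes (random effects) interpretation:
$$\beta \mid \text{data} \sim N\left( \hat\beta_{M,W}(\hat\lambda_{M,W}),\ (W + \hat\lambda_{M,W} M'M)^{-1} \right).$$
QLR test statistic of $R\beta = b$:
$$\min_{\beta :\, R\beta = b} \hat L_{M,W}(\beta) - \min_{\beta} \hat L_{M,W}(\beta) = \| R\hat\beta_{M,W}(\hat\lambda_{M,W}) - b \|^2_{(R(W + \hat\lambda_{M,W} M'M)^{-1} R')^{-1}}.$$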
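The identity between the difference of minimized criteria and the weighted quadratic form can be verified numerically for a fixed value of the shrinkage parameter. The sketch below assumes NumPy; all matrices are random placeholders, and `lam` merely stands in for the URE-selected value.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, r = 5, 3, 2
beta_hat = rng.standard_normal(n)
M = rng.standard_normal((m, n))
R = rng.standard_normal((r, n))
b = rng.standard_normal(r)
W = np.eye(n)
lam = 0.7   # stands in for the URE-selected value

def L(beta):
    """Quasi-log-likelihood criterion ||beta - beta_hat||_W^2 + lam ||M beta||^2."""
    d = beta - beta_hat
    return d @ W @ d + lam * np.sum((M @ beta) ** 2)

A = W + lam * M.T @ M
beta_ridge = np.linalg.solve(A, W @ beta_hat)     # unconstrained minimizer
V = R @ np.linalg.solve(A, R.T)                   # R A^{-1} R'
gap = R @ beta_ridge - b
# Minimizer subject to R beta = b (standard result for quadratic criteria)
beta_constrained = beta_ridge - np.linalg.solve(A, R.T) @ np.linalg.solve(V, gap)

lhs = L(beta_constrained) - L(beta_ridge)
rhs = gap @ np.linalg.solve(V, gap)
```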

22 Null distribution impractical
$$LR_{M,W}(b) = \| R\hat\beta_{M,W}(\hat\lambda_{M,W}) - b \|^2_{(R(W + \hat\lambda_{M,W} M'M)^{-1} R')^{-1}}$$
- Assume $\operatorname{Var}(RZ \mid MZ)$ is nonsingular, $Z \sim N_n(0, I_n)$. Then LR is well defined even when $\hat\lambda_{M,W} = \infty$. Holds in many cases.
- If $\operatorname{Var}(RZ \mid MZ)$ is singular, can use the ad hoc LR statistic
$$LR^{\text{ad hoc}}_{M,W}(b) = \| R\hat\beta_{M,W}(\hat\lambda_{M,W}) - b \|^2_{(R W^{-1} R')^{-1}}.$$
- Practical problem: the null distribution of the LR statistic depends on $M\beta$.
- Solution: condition on a sufficient statistic for the $n - r$ nuisance parameters.

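23 Sufficient statistic for nuisance parameters
Define $\zeta = \Sigma R' (R \Sigma R')^{-1} \in \mathbb{R}^{n \times r}$ and $\tilde P = \zeta R \in \mathbb{R}^{n \times n}$.
The statistic $\hat\nu = (I_n - \tilde P)\hat\beta$ is S-ancillary with respect to $R\beta$:
$$\hat\beta \mid \hat\nu \sim F_{R\beta, \Sigma}, \qquad \hat\nu \sim F_{(I_n - \tilde P)\beta, \Sigma}.$$
- It would be uncontroversial to condition on $\hat\nu$ in the absence of prior information linking $R\beta$ and $(I_n - \tilde P)\beta$.
- In practice, the prior information that $\|M\beta\|$ is small may not substantially constrain the relationship between $R\beta$ and $(I_n - \tilde P)\beta$. Then conditioning wastes little information. Severini (1995)
- I condition on $\hat\nu$. Later: connection to Empirical Bayes HPD set.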
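A small numerical sketch of the conditioning statistic, assuming NumPy. It checks three properties that follow from the definitions: $R\hat\nu = 0$, $R\hat\beta$ and $\hat\nu$ are uncorrelated, and $\hat\beta$ is recovered from the pair $(R\hat\beta, \hat\nu)$. The covariance matrix and the choice of $R$ are made up, and the name `P_tilde` for the matrix $\zeta R$ is a labeling assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 4, 1
G = rng.standard_normal((n, n))
Sigma = G @ G.T + n * np.eye(n)          # made-up positive definite covariance
R = np.array([[1.0, 0.0, 0.0, 0.0]])    # parameter of interest: first coordinate
beta_hat = rng.standard_normal(n)

zeta = Sigma @ R.T @ np.linalg.inv(R @ Sigma @ R.T)   # n x r
P_tilde = zeta @ R                                     # n x n
nu_hat = (np.eye(n) - P_tilde) @ beta_hat

# Cov(R beta_hat, nu_hat) = R Sigma (I - P_tilde)'
cov_Rbeta_nu = R @ Sigma @ (np.eye(n) - P_tilde).T
reconstructed = zeta @ (R @ beta_hat) + nu_hat
```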

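24 Critical value by simulation
Conditional QLR test rejects $H_0$ if $LR_{M,W}(b) > q_{1-\alpha,M,W}(b, \hat\nu)$.
Conditional critical value given $\hat\nu = \nu$:
$$q_{1-\alpha,M,W}(b, \nu) = \text{quantile}_{1-\alpha}\left( \| R\tilde\beta(\tilde\lambda(U); U) - b \|^2_{(R(W + \tilde\lambda(U) M'M)^{-1} R')^{-1}} \right),$$
where $U \sim N_r(b, R \Sigma R')$,
$$\tilde\beta(\lambda; U) = \Theta_{M,W}(\lambda)(\zeta U + \nu) \quad \text{for all } \lambda \ge 0,$$
$$\tilde\lambda(U) = \operatorname{argmin}_{\lambda \ge 0} \left\{ \|\tilde\beta(\lambda; U) - (\zeta U + \nu)\|_W^2 + 2 \operatorname{tr}(W \Theta_{M,W}(\lambda) \Sigma) \right\}.$$
By design, the test has conditional (and thus unconditional) size $\alpha$, so the inverted confidence set has coverage $1 - \alpha$.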
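The simulation can be sketched for the special case of projection shrinkage with $W = I_n$, where the URE minimizer has a closed form and $(I_n + \lambda P)^{-1} = I_n - xP$ with $x = \lambda/(1+\lambda)$. This is an illustrative sketch assuming NumPy, not the author's implementation; function names, the number of draws, and the example inputs are hypothetical. A general $M$, $W$ would need a numerical URE minimization inside the loop.

```python
import numpy as np

def lr_stat(beta_star, b, R, P, Sigma):
    """QLR statistic for projection shrinkage (W = I, M = P), with lambda chosen by URE."""
    Pb = P @ beta_star
    x = min(1.0, np.trace(P @ Sigma) / (Pb @ Pb))    # x = lam / (1 + lam); x = 1 means lam = infinity
    shrunk = beta_star - x * Pb                       # shrinkage point estimate
    V = R @ (np.eye(len(beta_star)) - x * P) @ R.T    # R (I + lam P)^{-1} R'
    d = R @ shrunk - b
    return d @ np.linalg.solve(V, d)

def conditional_critical_value(b, nu, R, P, Sigma, alpha=0.1, draws=4000, seed=0):
    """Simulated (1 - alpha) quantile of the statistic given nu_hat = nu, under R beta = b."""
    rng = np.random.default_rng(seed)
    RSR = R @ Sigma @ R.T
    zeta = Sigma @ R.T @ np.linalg.inv(RSR)
    U = rng.multivariate_normal(b, RSR, size=draws)
    stats = [lr_stat(zeta @ u + nu, b, R, P, Sigma) for u in U]
    return float(np.quantile(stats, 1.0 - alpha))

n = 5
P = np.eye(n) - np.ones((n, n)) / n                  # shrink toward the average
R = np.zeros((1, n)); R[0, 0] = 1.0                  # test the first coordinate
Sigma = np.eye(n)
nu = np.array([0.0, 100.0, -100.0, 50.0, -50.0])     # made-up value, far from the shrinkage target
cv = conditional_critical_value(np.array([0.0]), nu, R, P, Sigma)
```

In this example the data are far from the shrinkage target, so almost no shrinkage occurs and the simulated critical value should land near the 90% quantile of a chi-squared distribution with one degree of freedom, up to simulation noise.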

26 Confidence set by test inversion
Invert the CQLR test to obtain a CS for $b = R\beta$:
$$\hat C_{M,W} = \left\{ b \in \mathbb{R}^r : LR_{M,W}(b) \le q_{1-\alpha,M,W}(b, \hat\nu) \right\}.$$
- Do this by grid search. Simulate the quantile at each point.
- Feasible in one or two dimensions (projection shrinkage is fast). (See "Uniform confidence band".)
- If $M$ has full rank or is a projection, can compute a simple, finite upper bound on the critical value. (See "Bound on critical value".)
- $\hat C_{M,W}$ is contained in a bounded ellipsoid centered at $R\hat\beta_{M,W}(\hat\lambda_{M,W})$. Limits grid search.
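The grid-search inversion can be sketched generically: evaluate the statistic and the critical value at each candidate $b$ and keep the points that are not rejected. The example below assumes NumPy and uses a toy Wald-type statistic with a constant critical value as a stand-in; for the shrinkage set, the two callables would be the QLR statistic and its simulated conditional quantile. Because the set need not be convex, the sketch returns the accepted points rather than only two endpoints.

```python
import numpy as np

def invert_test(stat_fn, crit_fn, grid):
    """Return the grid points b that the test does not reject: stat_fn(b) <= crit_fn(b)."""
    return np.array([b for b in grid if stat_fn(b) <= crit_fn(b)])

# Toy stand-in (hypothetical): squared distance from a made-up point estimate,
# compared with a constant critical value.
estimate = 1.0
grid = np.linspace(-3.0, 5.0, 801)
accepted = invert_test(lambda b: (b - estimate) ** 2, lambda b: 2.7055, grid)
interval_hull = (accepted.min(), accepted.max())
```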

27 Properties of shrinkage confidence set
1. $\hat C_{M,W}$ always contains the shrinkage point estimate.
2. Generally not symmetric around the point estimate.
3. Not always convex.
4. Converges a.s. to the usual Wald ellipsoid as $\|M\beta\| \to \infty$, $M$ fixed.
5. Expected volume depends on $\beta$ only through $M\beta$.
Appears difficult to characterize expected volume. Even for projection shrinkage, the conditional power of the CQLR test depends on 6 parameters.

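28 Empirical Bayes HPD set
$$\hat L_{M,W}(\beta) = \|\beta - \hat\beta\|_W^2 + \hat\lambda_{M,W} \|M\beta\|^2,$$
$$\beta \mid \text{data} \sim N\left( \hat\beta_{M,W}(\hat\lambda_{M,W}),\ (W + \hat\lambda_{M,W} M'M)^{-1} \right).$$
Empirical Bayes $1 - \alpha$ Highest Posterior Density set for $R\beta$:
$$\hat C_{EB} = \left\{ b \in \mathbb{R}^r : LR_{M,W}(b) \le \chi^2_{r,1-\alpha} \right\}.$$
- Doesn't control frequentist coverage.
- Like the shrinkage CS, but with a non-random critical value.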

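29 Minimum coverage discrepancy with EB HPD set
Symmetric set difference: $A \triangle B = (A \cup B) \setminus (A \cap B)$.
Proposition (following Andrews & Mikusheva, 2016): Let $C$ be any similar confidence set for $R\beta$ (like $\hat C_{M,W}$):
$$P_\beta(R\beta \in C) = 1 - \alpha \quad \text{for all } \beta \in \mathbb{R}^n.$$
Then
$$P_\beta\left( R\beta \in \hat C_{M,W} \triangle \hat C_{EB} \right) \le P_\beta\left( R\beta \in C \triangle \hat C_{EB} \right) \quad \text{for all } \beta \in \mathbb{R}^n.$$
(See "Coverage discrepancy: proof sketch".)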

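31 Illustration: bivariate shrinkage toward average
Bivariate model, projection shrinkage toward average: Lindley (1962)
$$\hat\beta = \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix} \sim N_2\left( \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right),$$
$$e_1' \hat\beta_P(\hat\lambda_P) = \frac{\hat\beta_1 + \hat\beta_2}{2} + \left( 1 - \frac{2(1-\rho)}{(\hat\beta_1 - \hat\beta_2)^2} \right)_+ \frac{\hat\beta_1 - \hat\beta_2}{2}.$$
Parameter of interest: $\beta_1$.
Both the MSE of the shrinkage estimator and the expected length of the shrinkage CI depend on the DGP only through $\beta_2 - \beta_1$ and $\rho$.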

32 Illustration: bivariate shrinkage toward average [Figure: panels "RMSE" and "avg. length of 90% CI".]

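33 Simulation study of confidence intervals
$$\hat\beta \sim N_n(\beta, \Sigma), \qquad \beta_i = \begin{cases} 1 - \frac{i-1}{n-1} & \text{if } K = 0, \\ \sin\left( \frac{2\pi K (i-1)}{n-1} \right) & \text{if } K > 0, \end{cases}$$
$$\Sigma_{ij} = \sigma_i \sigma_j \kappa^{|i-j|}, \qquad \sigma_i = \sigma_0 \left( 1 + (i-1)\frac{\phi - 1}{n-1} \right).$$
Consider projection shrinkage toward a quadratic polynomial.
Lower bound on expected length relative to Wald CI: Pratt (1961)
$$\frac{(1-\alpha)\Phi^{-1}(1-\alpha) + (2\pi)^{-1/2} e^{-\frac{1}{2}(\Phi^{-1}(1-\alpha))^2}}{\Phi^{-1}(1-\alpha/2)}$$
for α =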
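The lower bound on relative expected length can be evaluated with the Python standard library alone. The sketch below reads the formula on this slide as the ratio $[(1-\alpha)z_{1-\alpha} + \varphi(z_{1-\alpha})]/z_{1-\alpha/2}$, where $z_p = \Phi^{-1}(p)$ and $\varphi$ is the standard normal density; that reading, the function name, and the chosen values of $\alpha$ are assumptions.

```python
from statistics import NormalDist

def pratt_length_bound(alpha):
    """Ratio [(1 - a) z_{1-a} + phi(z_{1-a})] / z_{1-a/2} for the standard normal."""
    std = NormalDist()
    z_one_sided = std.inv_cdf(1.0 - alpha)
    z_two_sided = std.inv_cdf(1.0 - alpha / 2.0)
    return ((1.0 - alpha) * z_one_sided + std.pdf(z_one_sided)) / z_two_sided

bound_90 = pratt_length_bound(0.10)   # 90% confidence level, as in the simulations
bound_95 = pratt_length_bound(0.05)
```

The bound is strictly below one, so a confidence interval can be shorter than the Wald interval on average at favorable parameter values, but only by a limited margin.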

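34 Simulation study of confidence intervals
[Table: design columns $n$, $K$, $\kappa$, $\sigma_0$, $\phi$; results columns MSE of $\hat\beta(\hat\lambda)$ (Tot, 1st, Mid) and Length of $\hat C$ (1st, Mid).]
MSE relative to $\hat\beta$, average length relative to Wald. Conf. level = 90%. 1st = $\beta_1$, Mid = $\beta_{1+[n/2]}$.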

35 Simulation study of 2-D confidence sets
Same design, but now construct a 2-D confidence set for $(\beta_1, \beta_{1+[n/2]})$.
Lower bound on expected area relative to Wald ellipse: Pratt (1961); Brown, Casella & Hwang (1995)
$$\frac{2 \int_0^\infty r\, \Phi\left( \Phi^{-1}(1-\alpha) - r \right) dr}{\chi^2_{2,1-\alpha}}$$
for α =

36 Simulation study of 2-D confidence sets
[Table: design columns $n$, $K$, $\kappa$, $\sigma_0$, $\phi$; results columns Area of $\hat C$ and of $\hat C_{\text{adhoc}}$.]
Average area relative to Wald. Conf. level = 90%.

37 Takeaways from simulations
Shrinkage CS works well when the shrinkage point estimator works well.
Shrinkage may be harmful when...
1. $M\beta$ conveys little info about $R\beta$.
2. $\|M\beta\|$ neither small nor large.
3. Correlations are high.
4. Variance of MLE of nuisance parameters large relative to variance of MLE of parameter of interest (e.g., small $n$).

39 Uniform asymptotic size control
The CQLR test achieves uniform asymptotic size control, provided $\hat\beta$ is uniformly asymptotically normal and $\hat\Sigma$ is uniformly consistent for $\Sigma$.
Uniform frequentist validity contrasts with other approaches:
- Undersmoothing: Pretend $\lambda$ is small, ignore bias of shrinkage estimator as well as variability in $\lambda$.
- Switching rule: Use Wald SE if $\|M\hat\beta\| > c$, otherwise use asymptotics under the assumption $M\beta = 0$.
- Random effects: Treat the random effects assumption as part of the DGP rather than just a prior. Size control with respect to the random effects distribution.

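40 Assumption: Preliminary estimator well-behaved
Assumption. Define $\mathcal{S} = \{ A \in \mathbb{S}^n_+ : \underline{c} \le 1/\rho(A^{-1}) \le \rho(A) \le \bar{c} \}$ for fixed $\underline{c}, \bar{c} > 0$. The distribution of the data $F_T$ for sample size $T$ is indexed by three parameters $\beta \in B \subset \mathbb{R}^n$, $\Sigma \in \mathcal{S}$, and $\gamma \in \Gamma$. The estimators $(\hat\beta, \hat\Sigma) \in \mathbb{R}^n \times \mathbb{S}^n_+$ satisfy the following: For every sequence $\{\beta_T, \Sigma_T, \gamma_T\}_{T \ge 1} \in B \times \mathcal{S} \times \Gamma$ and every subsequence $\{k_T\}_{T \ge 1}$ of $\{T\}_{T \ge 1}$, there exists a further subsequence $\{\tilde k_T\}_{T \ge 1}$ such that, under $F_{\tilde k_T}(\beta_{\tilde k_T}, \Sigma_{\tilde k_T}, \gamma_{\tilde k_T})$ and as $T \to \infty$,
$$\sqrt{\tilde k_T}\, \hat\Sigma^{-1/2} (\hat\beta - \beta_{\tilde k_T}) \overset{d}{\to} N_n(0, I_n), \qquad (\hat\Sigma - \Sigma_{\tilde k_T}) \overset{p}{\to} 0.$$
$\mathbb{S}^n_+$ = set of symmetric positive definite $n \times n$ matrices.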

42 Shrinkage test is uniformly valid
Let $\widehat{LR}$ and $\hat q_{1-\alpha}$ denote the CQLR test statistic and quantile obtained by plugging in $T^{-1}\hat\Sigma$ in place of $\Sigma$. (Suppress $M$, $W$.)
Proposition. Let the previous assumption hold. Assume either $\operatorname{rk}(M) = m$ or $M = P$. Assume also that $\operatorname{Var}(RZ \mid MZ)$ is nonsingular, $Z \sim N_n(0, I_n)$. Then
$$\liminf_{T \to \infty}\ \inf_{(\beta,\Sigma,\gamma) \in B \times \mathcal{S} \times \Gamma} \operatorname{Prob}_{F_T(\beta,\Sigma,\gamma)}\left( \widehat{LR}(R\beta) \le \hat q_{1-\alpha}(R\beta, \hat\nu) \right) = 1 - \alpha.$$
Caveat: I have only written down the full proof for projection shrinkage. I believe I have the arguments worked out for the general case.
Proof idea: Consider drifting parameters $\beta_T$...
1. If $\sqrt{T}\|M\beta_T\| \to \infty$, we converge to the non-shrinkage case.
2. If $\sqrt{T}\|M\beta_T\|$ is bounded, we're in the Gaussian model in the limit.

44 Treatment effect heterogeneity
- NSW job training experiment. Lalonde (1986); Dehejia & Wahba (1999)
- Outcome: earnings (absolute $) 3 years after treatment assignment.
- 297 treated, 425 control.
- Bin subjects by age decile, ... subjects per bin.
- $\hat\beta \in \mathbb{R}^{10}$: ATE estimate by bin.
- Projection shrinkage toward average $\hat\beta$.

45 Treatment effect heterogeneity: confidence intervals [Figure.] Conf. level = 90%. Vertical axis = ATE ($), horizontal axis = age (years).

46 Treatment effect heterogeneity: 2-D confidence set [Figure: confidence set for the ATEs of two age bins.] Conf. level = 90%. Axes = ATE ($). Ad hoc QLR statistic.

47 MIDAS forecasting
Predict monthly PCE inflation using daily commodity prices, 1991:2 to 2017:2.
MIDAS specification (lag lengths chosen by AIC):
$$p_{\text{PCE},t} = \mu + \sum_{l=1}^{6} \gamma_l\, p_{\text{PCE},t-l} + \sum_{j=1}^{25} \beta_j z_{t,j} + \varepsilon_t.$$
- $z_{t,j}$: $j$-th daily observation of log Bloomberg commodity price index (BCOM) on or after the 1st day of month $t$.
- $\hat\beta \in \mathbb{R}^{25}$: least-squares estimator.
- Projection shrinkage toward straight line. Breitung & Roling (2015)

48 MIDAS forecasting: confidence intervals [Figure.] Conf. level = 90%. Vertical axis = inflation (log points), horizontal axis = lags (days).

49 MIDAS forecasting: 2-D confidence set [Figure.] Conf. level = 90%. Axes = inflation (log points).

51 Summary
- Considered setting where a generalized ridge regression point estimator is of interest: smoothing, shrinking toward average, etc.
- Proposed conditional QLR test based on the same quasi-log-likelihood as the shrinkage point estimator.
- Exact conditional size in Gaussian location model.
- Asymptotic uniform size control more generally.
- Shrinkage confidence set by test inversion.
- Contains shrinkage point estimate.
- Minimum coverage discrepancy with EB HPD set among similar CSs.
- Computationally feasible in 1 to 2 dimensions. Projection shrinkage is fast.
- Promising simulation evidence.

52 Thank you

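53 Non-standard asymptotics: example
$$\hat\beta \sim N_n(\beta, T^{-1} I_n)$$
James-Stein estimator of $\beta \in \mathbb{R}^n$:
$$\hat\beta_{JS} = \left( 1 - \frac{n-2}{T \|\hat\beta\|^2} \right) \hat\beta.$$
If $\beta \ne 0$:
$$\sqrt{T}(\hat\beta_{JS} - \beta) \overset{d}{\to} N_n(0, I_n).$$
If $\beta = 0$:
$$\sqrt{T}(\hat\beta_{JS} - \beta) \overset{d}{\to} \left( 1 - \frac{n-2}{\|Z\|^2} \right) Z, \qquad Z \sim N_n(0, I_n).$$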

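54 URE captures bias/variance tradeoff
$W = I_n$ for simplicity.
Risk decomposition: Claeskens & Hjort (2003)
$$R_{M,I_n}(\lambda) = \underbrace{\operatorname{tr}\left\{ [I_n - \Theta_{M,I_n}(\lambda)]^2 \beta\beta' \right\}}_{\text{bias squared}} + \underbrace{\operatorname{tr}\left\{ \Theta_{M,I_n}(\lambda)^2 \Sigma \right\}}_{\text{variance}}.$$
Unbiased estimate: $\beta\beta' = E(\hat\beta\hat\beta') - \Sigma$. Plug in:
$$\operatorname{tr}\left\{ [I_n - \Theta_{M,I_n}(\lambda)]^2 (\hat\beta\hat\beta' - \Sigma) \right\} + \operatorname{tr}\left\{ \Theta_{M,I_n}(\lambda)^2 \Sigma \right\} = \hat R_{M,I_n}(\lambda) - \operatorname{tr}(\Sigma).$$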
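The plug-in identity can be confirmed numerically. The sketch below assumes NumPy and uses random placeholder inputs with $W = I_n$; it computes the plug-in bias and variance terms and compares their sum with the URE minus $\operatorname{tr}(\Sigma)$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, lam = 5, 2.0
M = rng.standard_normal((3, n))
G = rng.standard_normal((n, n))
Sigma = G @ G.T + np.eye(n)            # made-up positive definite covariance
beta_hat = rng.standard_normal(n)

I = np.eye(n)
Theta = np.linalg.inv(I + lam * M.T @ M)   # shrinkage matrix with W = I_n

# Plug-in estimate: replace beta beta' by beta_hat beta_hat' - Sigma
bias_part = np.trace((I - Theta) @ (I - Theta) @ (np.outer(beta_hat, beta_hat) - Sigma))
var_part = np.trace(Theta @ Theta @ Sigma)
plug_in = bias_part + var_part

# URE: ||Theta beta_hat - beta_hat||^2 + 2 tr(Theta Sigma)
d = Theta @ beta_hat - beta_hat
ure = d @ d + 2.0 * np.trace(Theta @ Sigma)
```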

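55 Bound on critical value
Write $V(\lambda) = R(W + \lambda M'M)^{-1} R'$ for the weight matrix in the LR statistic, and let $\beta$ denote the true parameter.
Triangle inequality:
$$\sqrt{LR_{M,W}(R\beta)} \le \| R(\hat\beta_{M,W}(\hat\lambda_{M,W}) - \hat\beta) \|_{V(\hat\lambda)^{-1}} + \| R(\hat\beta - \beta) \|_{V(\hat\lambda)^{-1}}.$$
Let $Z \sim N_n(0, W^{-1})$. For any $\beta \in \mathbb{R}^n$ and $A \in \mathbb{R}^{n \times n}$ symmetric p.d.,
$$\| R(\beta - \hat\beta) \|^2_{V(\hat\lambda)^{-1}} \le \| \beta - \hat\beta \|^2_A\ \rho\left( R A^{-1} R' \operatorname{Var}(RZ \mid MZ)^{-1} \right).$$
Since $\hat R_{M,W}(\hat\lambda_{M,W}) \le \hat R_{M,W}(0)$,
$$\| \hat\beta_{M,W}(\hat\lambda_{M,W}) - \hat\beta \|_W^2 \le 2 \operatorname{tr}\left\{ M \Sigma M' (M W^{-1} M')^{-1} \right\}.$$
Under the null, $\| R(\hat\beta - \beta) \|^2_{(R \Sigma R')^{-1}} \sim \chi^2(r)$.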

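56 Uniform confidence band
Supremum test statistic for the joint null on all coordinates $\beta_i$, $i = 1, \dots, n$:
$$\widehat{SLR}_{M,W}(\beta) = \sup_{i=1,\dots,n} \frac{ | \hat\beta_{i,M,W}(\hat\lambda_{M,W}) - \beta_i | }{ \sqrt{ e_i' (W + \hat\lambda_{M,W} M'M)^{-1} e_i } }.$$
Simulate the null critical value $q_{1-\alpha,M,W}(\beta)$ for any $\beta$.
Simultaneous confidence band: rectangular envelope of the inverted test.
$$\hat C_{M,W} = \prod_{i=1}^{n} \left[ \inf_{\beta :\, \widehat{SLR}(\beta) \le q_{1-\alpha}(\beta)} \beta_i,\ \sup_{\beta :\, \widehat{SLR}(\beta) \le q_{1-\alpha}(\beta)} \beta_i \right].$$
Computationally challenging. Can sample from band. Inoue & Kilian (2016)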

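58 Coverage discrepancy: proof sketch
Proof is a confidence set reinterpretation of the Andrews & Mikusheva (2016) result on conditional testing.
$$P_\beta\left( R\beta \in C \triangle \hat C_{EB} \right) = \underbrace{E_\beta\left[ 1(R\beta \in C) \right]}_{= 1-\alpha} + E_\beta\left[ 1(R\beta \in \hat C_{EB}) \right] - 2 E_\beta\left[ 1(R\beta \in C)\, 1(R\beta \in \hat C_{EB}) \right].$$
The same decomposition holds for $\hat C_{M,W}$, which also has coverage $1 - \alpha$, so
$$P_\beta\left( R\beta \in C \triangle \hat C_{EB} \right) - P_\beta\left( R\beta \in \hat C_{M,W} \triangle \hat C_{EB} \right) = 2 E_\beta\left[ \left\{ 1(R\beta \in \hat C_{M,W}) - 1(R\beta \in C) \right\} 1(R\beta \in \hat C_{EB}) \right]$$
$$= 2 E_\beta\left[ \left\{ 1(R\beta \in \hat C_{M,W}) - 1(R\beta \in C) \right\} 1\left( LR_{M,W}(R\beta) \le \chi^2_{r,1-\alpha} \right) \right].$$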

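61 Similarity of $C$ and completeness of the Gaussian family imply conditional similarity (like $\hat C_{M,W}$):
$$P_\beta\left( R\beta \in C \mid \hat\nu \right) = 1 - \alpha.$$
By the law of iterated expectations,
$$E_\beta\left[ \left\{ 1(R\beta \in \hat C_{M,W}) - 1(R\beta \in C) \right\} 1\left( q_{1-\alpha,M,W}(R\beta, \hat\nu) \le \chi^2_{r,1-\alpha} \right) \right] = 0.$$
Hence
$$P_\beta\left( R\beta \in C \triangle \hat C_{EB} \right) - P_\beta\left( R\beta \in \hat C_{M,W} \triangle \hat C_{EB} \right) = 2 E_\beta\left[ \left\{ 1(R\beta \in \hat C_{M,W}) - 1(R\beta \in C) \right\} \left\{ 1\left( LR_{M,W}(R\beta) \le \chi^2_{r,1-\alpha} \right) - 1\left( q_{1-\alpha,M,W}(R\beta, \hat\nu) \le \chi^2_{r,1-\alpha} \right) \right\} \right].$$
The variable inside the expectation is a.s. nonnegative by definition of $\hat C_{M,W}$.
Crucial: the EB set inverts the same test statistic, but with a non-random critical value.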


More information
