Estimation under Ambiguity

R. Giacomini (UCL), T. Kitagawa (UCL), H. Uhlig (Chicago)
Introduction

Questions:
- How should we perform posterior analysis (inference/decision) when a probabilistic prior is not available or not credible?
- How can we assess the robustness of posterior results to the choice of prior?
Robust Bayes viewpoint

I.J. Good (1965, The Estimation of Probabilities, p. 5): "My own view, following Keynes and Koopman, is that judgements of probability inequalities are possible but not judgements of exact probabilities; therefore a Bayesian should have upper and lower betting probabilities."

- Prior input: a set of priors (an ambiguous belief).
- An updating rule is applied to the ambiguous belief.
- Posterior output: a set of posteriors and a decision under ambiguity.
Example: SVAR

Baumeister & Hamilton (15 Ecta, BH hereafter): two-variable SVAR(8) in $(\Delta n_t, \Delta w_t)$, employment and wage growth:

$$A_0 \begin{pmatrix} \Delta n_t \\ \Delta w_t \end{pmatrix} = c + \sum_{k=1}^{8} A_k \begin{pmatrix} \Delta n_{t-k} \\ \Delta w_{t-k} \end{pmatrix} + \begin{pmatrix} \varepsilon_t^d \\ \varepsilon_t^s \end{pmatrix}, \quad A_0 = \begin{pmatrix} \beta_d & 1 \\ \beta_s & 1 \end{pmatrix}, \quad (\varepsilon_t^d, \varepsilon_t^s)' \sim N(0, \operatorname{diag}(d_1, d_2)).$$

- Parameters of interest: supply/demand elasticities and structural impulse responses.
- Bayesian practice in sign-restricted SVARs:
  - Agnostic prior: Canova & De Nicolo (02 JME), Faust (98 CRSPP), Uhlig (05 JME)
  - Carefully elicited prior: BH
Prior elicitation procedure recommended by BH: joint prior of $(\beta_d, \beta_s, d_1, d_2, A)$ factored as $\pi_{(\beta_d,\beta_s)} \otimes \pi_{(d_1,d_2)|(\beta_d,\beta_s)} \otimes \pi_{A|(\beta_d,\beta_s,d_1,d_2)}$:

- $(\beta_d, \beta_s)$: independent truncated t-distributions such that $\pi_{\beta_s}([0.1, 2.2]) = 0.9$ and $\pi_{\beta_d}([-2.2, -0.1]) = 0.9$.
- $(d_1, d_2) \mid (\beta_d, \beta_s)$: independent inverse-gamma priors with scales determined by $A_0 \hat{\Omega} A_0'$.
- $A \mid (\beta_d, \beta_s, d_1, d_2)$: independent Gaussian priors shrinking the reduced-form coefficients $A_0^{-1} A_k$ toward random walks.

Issues:
- Available prior knowledge fails to pin down the exact shape.
- Why conjugate priors? Why independent priors?
Limited credibility of the prior raises a robustness concern, which is even more severe under set-identification.

Existing robust Bayes procedure, Giacomini & Kitagawa (18):
- Introduce arbitrary priors for the unrevisable part of the prior and draw posterior inference with the resulting set of posteriors; this yields posterior inference for the identified set, as in Kline & Tamer (16 QE), Chen, Christensen & Tamer (18 Ecta), among others.
- Drawback: the set of priors is huge and may well contain unrealistic ones, e.g., a prior assigning (subjective) probability one to an extreme scenario.
This paper

Our proposal: robust Bayesian methods with a refined set of priors.
- Introduce the set of priors as a Kullback-Leibler (KL) neighborhood of a benchmark prior.
- Apply Bayes' rule to each prior and conduct posterior analysis (inference and decision) with the resulting set of posteriors.
- Similar to the robust control methods of Petersen, James & Dupuis (00 IEEE) and Hansen & Sargent (01 AER), here applied to statistical decision problems with concerns about prior misspecification.
Notation

Consider a two-variable SVAR(0) model:

$$A_0 \begin{pmatrix} \Delta n_t \\ \Delta w_t \end{pmatrix} = \begin{pmatrix} \varepsilon_t^d \\ \varepsilon_t^s \end{pmatrix}, \quad A_0 = \begin{pmatrix} \beta_d & 1 \\ \beta_s & 1 \end{pmatrix}, \quad \text{structural shocks } (\varepsilon_t^d, \varepsilon_t^s)' \sim N(0, \operatorname{diag}(d_1, d_2)).$$

- Structural parameters: $\theta = (\beta_d, \beta_s, d_1, d_2) \in \Theta$
- Parameter of interest: $\alpha = \alpha(\theta) \in \mathbb{R}$, e.g., an elasticity, the impulse response of $\Delta n_t$ to a unit $\varepsilon_t^d$, etc.
- Reduced-form parameters: with $\Sigma_{tr}$ the Cholesky factor of $\Sigma$, the variance-covariance matrix of $(\Delta n_t, \Delta w_t)'$, $\phi = (\sigma_{11}, \sigma_{21}, \sigma_{22}) \in \Phi$.
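The mapping from structural to reduced-form parameters can be illustrated numerically. A minimal sketch, assuming the $A_0$ parameterization displayed on the slide and hypothetical parameter values of our choosing:

```python
import numpy as np

def theta_to_phi(beta_d, beta_s, d1, d2):
    """Map structural parameters theta = (beta_d, beta_s, d1, d2) to the
    reduced-form parameters phi = (sigma_11, sigma_21, sigma_22): the non-zero
    entries of the Cholesky factor of Sigma = A0^{-1} diag(d1, d2) A0^{-1}'."""
    A0 = np.array([[beta_d, 1.0],
                   [beta_s, 1.0]])                  # A0 as on the slide
    A0_inv = np.linalg.inv(A0)
    Sigma = A0_inv @ np.diag([d1, d2]) @ A0_inv.T   # var-cov of (dn_t, dw_t)'
    L = np.linalg.cholesky(Sigma)                   # lower-triangular Sigma_tr
    return L[0, 0], L[1, 0], L[1, 1]

# hypothetical values satisfying the sign restrictions beta_d <= 0 <= beta_s
sigma_11, sigma_21, sigma_22 = theta_to_phi(-0.5, 1.0, 1.0, 1.0)
```

Since the map $\theta \mapsto \phi$ is many-to-one, inverting it only recovers a set of structural parameters, which is the source of set-identification below.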
Set-identifying assumptions: downward-sloping demand, $\beta_d \le 0$, and upward-sloping supply, $\beta_s \ge 0$.

The identified set for the labor supply elasticity $\alpha = \beta_s$ (Leamer (81 REStat)):

$$IS_\alpha(\phi) = \begin{cases} [\sigma_{21}/\sigma_{11},\; \sigma_{22}/\sigma_{21}], & \text{for } \sigma_{21} > 0, \\ [0, \infty), & \text{for } \sigma_{21} \le 0. \end{cases}$$

Starting from a prior $\pi_\theta$ for $\theta$, the posterior for $\alpha(\theta)$ is

$$\pi_{\alpha|X}(\cdot) = \int_\Phi \pi_{\theta|\phi}(\alpha(\theta) \in \cdot)\, d\pi_{\phi|X}.$$
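The Leamer bounds above are straightforward to evaluate at a given $\phi$. A short sketch (the function name is ours):

```python
import numpy as np

def identified_set_alpha(sigma_11, sigma_21, sigma_22):
    """Identified set IS_alpha(phi) for alpha = beta_s under the sign
    restrictions beta_d <= 0 <= beta_s (Leamer bounds)."""
    if sigma_21 > 0:
        return (sigma_21 / sigma_11, sigma_22 / sigma_21)
    # the sign restrictions carry no information when sigma_21 <= 0
    return (0.0, np.inf)
```

For example, `identified_set_alpha(1.0, 0.5, 0.8)` returns the interval `(0.5, 1.6)`.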
Before observing data

Innovation: introduce the following set of priors: for $\lambda > 0$,

$$\Pi_\theta^\lambda = \left\{ \pi_\theta = \int_\Phi \pi_{\theta|\phi}\, d\pi_\phi \;:\; \pi_{\theta|\phi} \in \Pi^\lambda(\pi^*_{\theta|\phi})\ \forall \phi \right\},$$

where $\pi^*_{\theta|\phi}$ is a benchmark conditional prior such that $\pi^*_{\theta|\phi}(IS_\theta(\phi)) = 1$ for all $\phi$, and

$$\Pi^\lambda(\pi^*_{\theta|\phi}) = \left\{ \pi_{\theta|\phi} : R(\pi_{\theta|\phi} \,\|\, \pi^*_{\theta|\phi}) \le \lambda \right\}, \qquad R(\pi_{\theta|\phi} \,\|\, \pi^*_{\theta|\phi}) = \int_\Theta \ln\!\left( \frac{d\pi_{\theta|\phi}}{d\pi^*_{\theta|\phi}} \right) d\pi_{\theta|\phi}$$

is the KL distance relative to the benchmark prior.

- Any $\pi_{\theta|\phi} \in \Pi^\lambda(\pi^*_{\theta|\phi})$ is absolutely continuous with respect to $\pi^*_{\theta|\phi}$.
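For intuition, the KL constraint defining the neighborhood is easy to check once the priors are discretized onto a common grid. A sketch with made-up weights (the function names are ours):

```python
import numpy as np

def kl(pi, pi_star):
    """R(pi || pi_star): the integral of ln(d pi / d pi_star) with respect to
    pi, here for discrete distributions on a common grid of theta values."""
    pi, pi_star = np.asarray(pi, float), np.asarray(pi_star, float)
    support = pi > 0
    if np.any(pi_star[support] == 0):
        return np.inf        # pi is not absolutely continuous w.r.t. pi_star
    return float(np.sum(pi[support] * np.log(pi[support] / pi_star[support])))

def in_kl_neighborhood(pi, pi_star, lam):
    """Membership in the set Pi^lambda(pi_star)."""
    return kl(pi, pi_star) <= lam
```

The absolute-continuity requirement shows up naturally: any candidate prior putting mass where the benchmark puts none has infinite KL distance and is excluded for every finite $\lambda$.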
After observing data

The set of posteriors marginalized to $\alpha$:

$$\Pi_{\alpha|X}^\lambda = \left\{ \pi_{\alpha|X}(\cdot) = \int_\Phi \pi_{\theta|\phi}(\alpha(\theta) \in \cdot)\, d\pi_{\phi|X} \;:\; \pi_{\theta|\phi} \in \Pi^\lambda(\pi^*_{\theta|\phi})\ \forall \phi \right\}.$$

Report the set of posterior quantities (means, quantiles, probabilities): with $h(\alpha) = \alpha$, $1\{\alpha \in A\}$, etc.,

$$\max_{\pi_{\alpha|X} \in \Pi^\lambda_{\alpha|X}} \int h(\alpha)\, d\pi_{\alpha|X} = \max_{\{\pi_{\theta|\phi} \in \Pi^\lambda(\pi^*_{\theta|\phi}) : \phi \in \Phi\}} \int_\Phi \left( \int_{IS_\theta(\phi)} h(\alpha(\theta))\, d\pi_{\theta|\phi} \right) d\pi_{\phi|X} = \int_\Phi \left( \max_{\pi_{\theta|\phi} \in \Pi^\lambda(\pi^*_{\theta|\phi})} \int_{IS_\theta(\phi)} h(\alpha(\theta))\, d\pi_{\theta|\phi} \right) d\pi_{\phi|X}.$$
Invariance to marginalization

The KL neighborhood around $\pi^*_{\alpha|\phi}$ or around $\pi^*_{\theta|\phi}$?

Marginalization lemma: let $\Pi^\lambda(\pi^*_{\alpha|\phi})$ be the KL neighborhood formed around the $\alpha$-marginal of $\pi^*_{\theta|\phi}$. Then

$$\int_\Phi \left[ \sup_{\pi_{\theta|\phi} \in \Pi^\lambda(\pi^*_{\theta|\phi})} \int_{IS_\theta(\phi)} h(\alpha(\theta))\, d\pi_{\theta|\phi} \right] d\pi_{\phi|X} = \int_\Phi \left[ \sup_{\pi_{\alpha|\phi} \in \Pi^\lambda(\pi^*_{\alpha|\phi})} \int_{IS_\alpha(\phi)} h(\alpha)\, d\pi_{\alpha|\phi} \right] d\pi_{\phi|X},$$

and similarly with inf in place of sup.

Working with $\Pi^\lambda(\pi^*_{\theta|\phi})$ or $\Pi^\lambda(\pi^*_{\alpha|\phi})$ therefore leads to the same range of posteriors for $\alpha$.
The range of posteriors

Theorem: suppose $h(\cdot)$ is bounded. Then

$$\inf_{\pi_{\alpha|X} \in \Pi^\lambda_{\alpha|X}} E_{\alpha|X}(h(\alpha)) = \int_\Phi \left( \int_{IS_\alpha(\phi)} h(\alpha)\, d\pi^l_{\alpha|\phi} \right) d\pi_{\phi|X}, \qquad \sup_{\pi_{\alpha|X} \in \Pi^\lambda_{\alpha|X}} E_{\alpha|X}(h(\alpha)) = \int_\Phi \left( \int_{IS_\alpha(\phi)} h(\alpha)\, d\pi^u_{\alpha|\phi} \right) d\pi_{\phi|X},$$

where

$$d\pi^l_{\alpha|\phi} = \frac{\exp(-h(\alpha)/\kappa^l_\lambda(\phi))}{\int \exp(-h(\alpha)/\kappa^l_\lambda(\phi))\, d\pi^*_{\alpha|\phi}}\, d\pi^*_{\alpha|\phi}, \qquad d\pi^u_{\alpha|\phi} = \frac{\exp(h(\alpha)/\kappa^u_\lambda(\phi))}{\int \exp(h(\alpha)/\kappa^u_\lambda(\phi))\, d\pi^*_{\alpha|\phi}}\, d\pi^*_{\alpha|\phi},$$

and $\kappa^l_\lambda(\phi)$, $\kappa^u_\lambda(\phi)$ are the Lagrange multipliers of the constrained optimization problems.
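On a discretized benchmark $\pi^*_{\alpha|\phi}$, the worst-case priors above are exponential tiltings of the benchmark weights. A sketch (the grid, weights, and $\kappa$ value are illustrative choices of ours):

```python
import numpy as np

def tilted(q, h, kappa, upper=True):
    """Exponentially tilted distribution: d pi^u (upper=True) or d pi^l
    (upper=False) relative to benchmark weights q on a grid of alpha values."""
    sign = 1.0 if upper else -1.0
    w = q * np.exp(sign * h / kappa)
    return w / w.sum()

alpha = np.linspace(0.5, 1.6, 12)        # grid over IS_alpha(phi), illustrative
q = np.full(alpha.size, 1 / alpha.size)  # uniform benchmark weights
h = alpha                                # h(alpha) = alpha (posterior mean)
pi_u = tilted(q, h, kappa=0.5, upper=True)   # overweights large alpha
pi_l = tilted(q, h, kappa=0.5, upper=False)  # overweights small alpha
```

As $\kappa \to \infty$ the tilting vanishes and both distributions collapse to the benchmark; as $\kappa \to 0$ they concentrate on the maximizer/minimizer of $h$, which is why the multiplier controls how far into the KL neighborhood the worst case moves.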
Computing the range of posteriors

Algorithm: assume that $\pi^*_{\theta|\phi}$ or $\pi^*_{\alpha|\phi}$ can be evaluated up to a multiplicative constant.

1. Draw many $\phi$'s from $\pi_{\phi|X}$.
2. At each $\phi$, compute $(\kappa^l_\lambda(\phi), \kappa^u_\lambda(\phi))$ and approximate $\left[\int h(\alpha)\, d\pi^l_{\alpha|\phi},\ \int h(\alpha)\, d\pi^u_{\alpha|\phi}\right]$ using importance sampling (if necessary). Note that $(\kappa^l_\lambda(\phi), \kappa^u_\lambda(\phi))$ can be obtained from the convex optimizations

$$\min_{\kappa \ge 0} \left\{ \kappa \ln \int_{IS_\alpha(\phi)} \exp\!\left( \pm \frac{h(\alpha)}{\kappa} \right) d\pi^*_{\alpha|\phi} + \lambda \kappa \right\}.$$

3. Take the averages of the intervals obtained in Step 2.
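Step 2 can be sketched on a discretized $\pi^*_{\alpha|\phi}$: by duality, the value of the convex program above is itself the bound on $E[h]$ over the KL ball. A sketch using scipy (the grid, weights, and $\lambda$ values are illustrative choices of ours):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp

def upper_bound(h, q, lam):
    """sup of E_pi[h] over {pi : KL(pi || q) <= lam} via the dual
    min_{kappa >= 0} kappa * ln E_q[exp(h/kappa)] + lam * kappa."""
    obj = lambda k: k * logsumexp(h / k, b=q) + lam * k
    res = minimize_scalar(obj, bounds=(1e-6, 1e4), method="bounded")
    return res.fun

def lower_bound(h, q, lam):
    # inf E_pi[h] = -sup E_pi[-h]
    return -upper_bound(-h, q, lam)

alpha = np.linspace(0.5, 1.6, 23)        # grid over IS_alpha(phi)
q = np.full(alpha.size, 1 / alpha.size)  # benchmark conditional prior weights
bounds_small = (lower_bound(alpha, q, 0.01), upper_bound(alpha, q, 0.01))
bounds_large = (lower_bound(alpha, q, 10.0), upper_bound(alpha, q, 10.0))
```

Averaging these $\phi$-wise intervals over draws from $\pi_{\phi|X}$ (Step 3) gives the reported range of posterior means; as $\lambda$ grows, the interval widens from the benchmark posterior mean toward the endpoints of the identified set.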
Asymptotics

Under mild regularity conditions:
- As $n \to \infty$, integration with respect to $d\pi_{\phi|X}$ can be replaced by plugging in the MLE.
- As $\lambda \to \infty$, if $\pi^*_{\alpha|\phi}$ supports $IS_\alpha(\phi)$, the set of posterior means/quantiles converges to that of Giacomini and Kitagawa (18).
- As $\lambda \to \infty$ and $n \to \infty$, the set of posterior means recovers the true identified set, $IS_\alpha(\phi_0)$.
Robust Bayes decision

Conditional gamma-minimax: decision under multiple posteriors.

$\delta(X)$: decision function; $h(\delta(X), \alpha)$: loss function, e.g., $h(\delta(X), \alpha) = (\delta(X) - \alpha)^2$.

$$\min_{\delta(X)} \int_\Phi \left[ \max_{\pi_{\alpha|\phi} \in \Pi^\lambda(\pi^*_{\alpha|\phi})} \int_{IS_\alpha(\phi)} h(\delta(X), \alpha)\, d\pi_{\alpha|\phi} \right] d\pi_{\phi|X}$$

- In the empirical application, we compute a robust point estimator for $\alpha$ under quadratic and absolute losses.
- As $\lambda \to \infty$ and $n \to \infty$, the gamma-minimax estimator converges to the midpoint of the true identified set.
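The midpoint result can be illustrated with a deliberately simplified grid search. A sketch (the simplification that worst-case posteriors differ only through their means, and all numbers, are ours): under quadratic loss, the worst case over a range of posterior means $[m_l, m_u]$ is attained at an endpoint, so the minimax choice of $\delta$ is the midpoint.

```python
import numpy as np

def gamma_minimax_estimate(mean_low, mean_up, candidates):
    """min over delta of max over posterior means m in [mean_low, mean_up] of
    (delta - m)^2; the inner max is attained at an endpoint of the range."""
    worst = np.maximum((candidates - mean_low) ** 2,
                       (candidates - mean_up) ** 2)
    return float(candidates[np.argmin(worst)])

delta_grid = np.linspace(0.0, 2.0, 201)
est = gamma_minimax_estimate(0.8, 1.4, delta_grid)   # midpoint of [0.8, 1.4]
```

As the range of posterior means widens toward the identified set (large $\lambda$, large $n$), this estimator moves to the midpoint of the identified set, matching the limit stated above.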
Empirical Example

BH two-variable SVAR(8), quarterly data for 1970-2014:

$$A_0 \begin{pmatrix} \Delta n_t \\ \Delta w_t \end{pmatrix} = c + \sum_{k=1}^{8} A_k \begin{pmatrix} \Delta n_{t-k} \\ \Delta w_{t-k} \end{pmatrix} + \begin{pmatrix} \varepsilon_t^d \\ \varepsilon_t^s \end{pmatrix}, \quad A_0 = \begin{pmatrix} 1 & \beta \\ 1 & \alpha \end{pmatrix}, \quad (\varepsilon_t^d, \varepsilon_t^s)' \sim N(0, \operatorname{diag}(d_1, d_2)).$$
- Use the BH prior for $\theta$ to generate the single prior $\pi_\phi$ and the benchmark conditional prior $\pi^*_{\alpha|\phi}$.
- The benchmark conditional prior $\pi^*_{\alpha|\phi}$ is obtained by reparametrizing $\theta \to (\alpha, \phi)$ and transforming $\pi_\theta$ into $\pi_{(\alpha,\phi)}$.
Application to SVAR: BH prior as the benchmark; $\alpha$: labor supply elasticity.

[Figure: benchmark posterior ($\lambda = 0$); plotted: posterior density, prior density, posterior median, identified set.]
[Figure: range of posteriors, $\lambda = 0.1$; plotted: posterior median bounds and minimax estimate (absolute loss).]
[Figure: range of posteriors, $\lambda = 0.5$.]
[Figure: range of posteriors, $\lambda = 1$.]
[Figure: range of posteriors, $\lambda = 2$.]
[Figure: range of posteriors, $\lambda = 5$.]
How to elicit $\lambda$?

Given a benchmark prior:
- For each candidate $\lambda$, we can compute the range of priors for the parameter of interest, or for a parameter about which we have some prior knowledge.
- Compute the ranges of priors for several values of $\lambda$, and pick the one that best fits your vague prior knowledge.
[Figure: ranges of priors (top row) and ranges of posteriors (bottom row) for $\lambda = 0.1, 0.5, 1$; plotted: prior/posterior mean bounds, the benchmark prior/posterior, and their means.]
Discussion

1. Why a neighborhood around $\pi^*_{\alpha|\phi}$ rather than $\pi^*_\alpha$?
2. Why not multiplier minimax (a la, e.g., Hansen & Sargent)?
3. Is conditional gamma-minimax admissible?
Why a neighborhood around $\pi^*_{\alpha|\phi}$?

Ho (2019) considers the KL neighborhood around $\pi_\alpha$.
- A prior for $\alpha$ can be updated through the update of $\phi$.
- The robustness concern should be more serious for the unrevisable part $\pi_{\alpha|\phi}$.
- A KL neighborhood around $\pi_\alpha$ contains multiple priors for $\pi_\phi$, i.e., the marginal likelihood varies over the prior class.
- Non-unique way of updating the set of priors: prior-by-prior or the Type-II ML updating rule (Good (65 book), Gilboa & Schmeidler (93 JET))?
Why not multiplier minimax?

Following Hansen & Sargent (01), we could formulate the optimal decision under ambiguity as a multiplier minimax problem:

$$\min_{\delta(X)} \int_\Phi \left[ \max_{\pi_{\alpha|\phi} \in \Pi(\pi^*_{\alpha|\phi})} \left\{ \int_{IS_\alpha(\phi)} h(\delta(X), \alpha)\, d\pi_{\alpha|\phi} - \kappa R(\pi_{\alpha|\phi} \,\|\, \pi^*_{\alpha|\phi}) \right\} \right] d\pi_{\phi|X}$$

- Given $\kappa \ge 0$ and $\delta$, there exists a KL radius $\lambda_\kappa(\phi)$ that makes the solution of the gamma minimax equivalent to that of the multiplier minimax, but the implied $\lambda_\kappa(\phi)$ depends on the loss function $h(\cdot, \cdot)$.
- We prefer the gamma-minimax formulation with fixed $\lambda$, as we want the set of priors to be invariant to the loss function.
Statistical Admissibility?

- Unlike unconditional gamma-minimax, conditional gamma-minimax is not guaranteed to be admissible, since the worst-case prior becomes data dependent.
- Unconditional gamma-minimax is, on the other hand, difficult to compute, and the difference disappears in large samples.
Related Literature

- Bayesian approaches to partial identification: Poirier (98 ET), Moon & Schorfheide (12 Ecta), Kline & Tamer (16 QE), Chen, Christensen & Tamer (18 Ecta), Giacomini & Kitagawa (18), among others.
- Robust Bayes / sensitivity analysis in econometrics and statistics: Robbins (51 Proc. Berkeley Symp.), Good (65 book), Kudo (67 Proc. Berkeley Symp.), Dempster (68 JRSSB), Huber (73 Bull. ISI), Chamberlain & Leamer (76 JRSSB), Manski (81 Ann. Stat.), Leamer (82 Ecta), Berger (85 book), Berger & Berliner (86 Ann. Stat.), Wasserman (90 Ann. Stat.), Chamberlain (00 JAE), Geweke (05 book), Müller (12 JME), Ho (19).
- Decision under ambiguity: Schmeidler (89 Ecta), Gilboa & Schmeidler (89 JME), Maccheroni, Marinacci & Rustichini (06 Ecta), Hansen & Sargent (01 AER, 08 book), Strzalecki (11 Ecta), among others.
Conclusion

This paper proposes a novel robust Bayes procedure for a class of set-identified models. It simultaneously addresses the robustness concern in standard Bayesian methods and the excess conservatism of set-identification analysis.

Useful for:
- Sensitivity analysis: perturb the prior of a Bayesian model over the KL neighborhood and check how much the posterior results change.
- Adding uncertain assumptions to a set-identified model: tightening the identified set with a partially credible benchmark prior.
- Policy decisions when the available data only set-identify a welfare criterion (e.g., treatment choice with observational data).