Calibration Estimation of Semiparametric Copula Models with Data Missing at Random


1 Calibration Estimation of Semiparametric Copula Models with Data Missing at Random
Shigeyuki Hamori (1), Kaiji Motegi (1), Zheng Zhang (2)
(1) Kobe University, (2) Renmin University of China
Econometrics Workshop, UNC Chapel Hill, November 16, 2018
Hamori, Motegi & Zhang (Kobe & RUC) Copula Models with Data MAR November 16, / 49

2 Introduction

Copula Model
-- Flexible modelling of multivariate distributions
-- Popular in business, economics, finance, etc.

Missing Data
-- Common problem in all fields
-- Deep literature in statistics
-- Binary phenomenon (to observe or not to observe)

Treatment Effect
-- Estimate unobserved outcome
-- Central topic in econometrics

We unite them for the first time in the literature.

3 Introduction

How to fit copula models when there are missing data? A naïve approach is listwise deletion (LD):
1. Keep individuals with all $d$ components observed, and discard all other individuals.
2. Treat the individuals with complete data equally.

[Figure: data matrix with rows $Y_{1i}$ and $Y_{2i}$ and columns $i = 1, 2, 3, 4, \dots, N$.]

4 Introduction

LD leads to a consistent estimator for the copula parameter of interest if the missing mechanism is missing completely at random (MCAR).

LD leads to an inconsistent estimator if the missing mechanism is missing at random (MAR).

Under MAR, target variables $Y_i = [Y_{1i}, \dots, Y_{di}]^\top$ and their missing status are independent of each other given observed covariates $X_i = [X_{1i}, \dots, X_{ri}]^\top$.

LD treats all individuals with complete data equally and does not use the information in $X_i$. That can cause substantial bias under MAR.

5 Introduction

How can we obtain a consistent estimator for the copula parameter when the missing mechanism is MAR?

A key step is the estimation of the propensity score function (i.e. the conditional probability of observing data given covariates).

Direct estimation of the propensity score is notoriously challenging, whether it is performed parametrically or nonparametrically.

Parametric approaches are haunted by misspecification problems, while nonparametric approaches are often unstable.

6 Introduction

We apply Chan, Yam, and Zhang's (2016) calibration estimation to a missing data problem for the first time in the literature.

The calibration estimator for the copula parameter satisfies consistency and asymptotic normality under some assumptions including i.i.d. data and the MAR condition.

We also derive a consistent estimator for the asymptotic covariance matrix.

Our simulation results indicate that the calibration estimator dominates listwise deletion, the parametric approach, and the nonparametric approach.

7 Table of Contents

1) Introduction
2) Review of Copula Models
3) Review of Missing Data
4) Set-up of Main Problem
5) Theory of Calibration Estimation
6) Data-Driven Selection of Tuning Parameter K
7) Monte Carlo Simulations
8) Conclusions

8 Review of Copula Models

Suppose that there are $N$ individuals and $d$ components: $Y_i = [Y_{1i}, Y_{2i}, \dots, Y_{di}]^\top$ $(i = 1, \dots, N)$.

Suppose that we want to estimate the $d$-dimensional joint distribution of $Y_i$, assuming i.i.d. data. What can we do?

There are potential problems with estimating the joint distribution directly.

                 d small                      d large
Parametric       Misspecification             Misspecification, parameter proliferation
Nonparametric    (Curse of dimensionality)    Curse of dimensionality

9 Review of Copula Models

Copula models accomplish flexible specification with a small number of parameters.

Copula models follow a two-step procedure.
Step 1: Model the marginal distribution of each of the $d$ components separately.
Step 2: Combine the $d$ marginal distributions to recover a joint distribution.

Step 1          Step 2          Name of model
Parametric      Parametric      Parametric copula model
Nonparametric   Parametric      Semiparametric copula model (target of our paper)
Nonparametric   Nonparametric   Nonparametric copula model

10 Review of Copula Models

The copula approach is justified by Sklar's (1959) theorem. Sklar's theorem ensures the existence of a unique copula function $C : (0, 1)^d \to (0, 1)$ that recovers a true joint distribution.

Theorem (Sklar, 1959)
Let $\{Y_i\}$ be i.i.d. random vectors with a joint distribution $F : \mathbb{R}^d \to (0, 1)$. Assume that the marginal distribution of $Y_{ji}$, written as $F_j : \mathbb{R} \to (0, 1)$, is continuous for $j \in \{1, \dots, d\}$. Then, there exists a unique function $C : (0, 1)^d \to (0, 1)$ such that
$$F(y_1, \dots, y_d) = C(F_1(y_1), \dots, F_d(y_d)),$$
or in terms of probability density functions,
$$f(y_1, \dots, y_d) = c(F_1(y_1), \dots, F_d(y_d)) \, f_1(y_1) \cdots f_d(y_d).$$

11 Review of Copula Models

A well-known example of a copula function: the bivariate Clayton copula with a scalar parameter $\alpha > 0$.

The cumulative distribution function is
$$C_2(u_1, u_2; \alpha) = (u_1^{-\alpha} + u_2^{-\alpha} - 1)^{-1/\alpha},$$
where $u_1 = F_1(y_1) \in (0, 1)$ and $u_2 = F_2(y_2) \in (0, 1)$ are the marginal distribution functions of $Y_{1i}$ and $Y_{2i}$, respectively.

The probability density function is
$$c_2(u_1, u_2; \alpha) = (1 + \alpha)(u_1 u_2)^{-\alpha - 1} (u_1^{-\alpha} + u_2^{-\alpha} - 1)^{-1/\alpha - 2}.$$

Kendall's rank correlation coefficient is $\tau = \alpha/(\alpha + 2)$. Larger $\alpha$ implies stronger association between $Y_{1i}$ and $Y_{2i}$.
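The Clayton formulas on this slide translate directly into code. The following is a minimal sketch in Python with NumPy; the function names (`clayton_cdf`, `clayton_pdf`, `kendall_tau`, `alpha_from_tau`) are illustrative and do not come from the slides.

```python
import numpy as np

def clayton_cdf(u1, u2, alpha):
    """Bivariate Clayton copula C_2(u1, u2; alpha) for alpha > 0."""
    return (u1 ** -alpha + u2 ** -alpha - 1.0) ** (-1.0 / alpha)

def clayton_pdf(u1, u2, alpha):
    """Bivariate Clayton copula density c_2(u1, u2; alpha) for alpha > 0."""
    s = u1 ** -alpha + u2 ** -alpha - 1.0
    return (1.0 + alpha) * (u1 * u2) ** (-alpha - 1.0) * s ** (-1.0 / alpha - 2.0)

def kendall_tau(alpha):
    """Kendall's tau implied by the Clayton parameter: tau = alpha / (alpha + 2)."""
    return alpha / (alpha + 2.0)

def alpha_from_tau(tau):
    """Inverse of kendall_tau: alpha = 2 * tau / (1 - tau)."""
    return 2.0 * tau / (1.0 - tau)
```

For instance, `kendall_tau(6.0)` gives 0.75, the value used later in the simulation design.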

12 Review of Copula Models

[Figure: Clayton copula p.d.f. over $(u_1, u_2)$ in two panels, one with $\tau = 0.4$ and one with $\tau = 0.7$.]

Association is stronger in the lower tail than in the upper tail. Such an asymmetry matches many economic and financial phenomena (e.g. stock market contagion).

Larger $\alpha$ implies stronger association between $Y_{1i}$ and $Y_{2i}$.

13 Review of Missing Data

Suppose that $\{Y_{1i}, \dots, Y_{di}\}_{i=1}^N$ are target variables. Define the missing indicator: $T_{ji} = 1$ if $Y_{ji}$ is observed, and $T_{ji} = 0$ if $Y_{ji}$ is missing.

Suppose that $\{X_{1i}, \dots, X_{ri}\}_{i=1}^N$ are observable covariates.

Missing Completely at Random (MCAR):
$$\{T_{1i}, \dots, T_{di}\} \perp \{Y_{1i}, \dots, Y_{di}\}.$$

Missing at Random (MAR):
$$\{T_{1i}, \dots, T_{di}\} \perp \{Y_{1i}, \dots, Y_{di}\} \mid \{X_{1i}, \dots, X_{ri}\}.$$

14 Review of Missing Data

An illustrative example on a health survey ($d = r = 1$):
$Y_i$ = weight of individual $i$,
$T_i = I(Y_i \text{ is observed})$,
$X_i = I(\text{Individual } i \text{ is female})$.

MCAR requires that
$$P[T_i = 1] = P[T_i = 1 \mid X_i = x, Y_i = y], \quad \forall (x, y).$$

MAR requires that
$$P[T_i = 1 \mid X_i = 1] = P[T_i = 1 \mid X_i = 1, Y_i = y],$$
$$P[T_i = 1 \mid X_i = 0] = P[T_i = 1 \mid X_i = 0, Y_i = y'], \quad \forall y, y'.$$

15 Review of Missing Data

[Figure: diagram relating MCAR, MAR, and MNAR.]

MCAR: Probably too restrictive, since men and women may have different willingness to report their weights.

MAR: More realistic than MCAR, since MAR controls for gender. MAR may still be restrictive, since men (or women) with different weights may have different willingness to report their weights.

MNAR: Most general, since it controls for both gender and weight, but MNAR is hard to handle technically.

16 Review of Missing Data

It is well known that listwise deletion (LD) leads to consistent inference under MCAR.

It is also well known that LD leads to inconsistent inference under MAR.

Correct inference under MAR has been extensively studied since the seminal work of Rubin (1976).

The present paper assumes MAR and elaborates the estimation of semiparametric copula models, which has not been done in the literature.
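A small simulation can make the inconsistency of LD under MAR concrete, using the health survey example. This is a toy sketch: the mean weights (60 and 75), the noise scale, and the response probabilities (0.3 and 0.9) are made-up numbers for illustration only, and the propensity score is treated as known.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# X_i: gender indicator; Y_i: weight (all numbers below are hypothetical).
female = rng.random(n) < 0.5
weight = np.where(female, 60.0, 75.0) + 5.0 * rng.standard_normal(n)

# MAR: the response probability depends on gender only, not on weight itself.
pi = np.where(female, 0.3, 0.9)
observed = rng.random(n) < pi

true_mean = weight.mean()                        # uses all data (infeasible in practice)
ld_mean = weight[observed].mean()                # listwise deletion
ipw_mean = np.sum(observed * weight / pi) / n    # inverse probability weighting

print(f"full-sample mean {true_mean:.2f}, LD {ld_mean:.2f}, IPW {ipw_mean:.2f}")
```

Because men respond more often in this design, the LD mean is pulled toward the male average, while weighting each respondent by the inverse response probability removes that distortion.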

17 Set-up of Main Problem

Semiparametric copula models are estimated in two steps:

Step 1: Estimate the marginal distributions $\{F_1, \dots, F_d\}$ nonparametrically via
$$F_j(y) = P(Y_{ji} \le y) = E[I(Y_{ji} \le y)].$$

Step 2: Estimate the true copula parameter $\theta_0$ via
$$\theta_0 = \arg\max_{\theta \in \Theta} E[\log c(F_1(Y_{1i}), \dots, F_d(Y_{di}); \theta)].$$

If data were all observed, then we could simply replace the population quantities with sample counterparts.

When data are Missing at Random, we need to assign some weights based on propensity score functions.

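18 Set-up of Main Problem

Define propensity score functions:
$$\pi_j(x) = P(T_{ji} = 1 \mid X_i = x), \quad j \in \{1, \dots, d\}.$$
Define $p_j(x) = 1/\pi_j(x)$. Step 1 is rewritten as
$$
\begin{aligned}
F_j(y) &= E[I(Y_{ji} \le y)] \\
&= E\big[E[I(Y_{ji} \le y) \mid X_i]\big] && (\because \text{LIE}) \\
&= E\Big[E\Big[\tfrac{T_{ji}}{\pi_j(X_i)} \,\Big|\, X_i\Big] \, E[I(Y_{ji} \le y) \mid X_i]\Big] && (\because E[T_{ji} \mid X_i] = \pi_j(X_i)) \\
&= E\Big[E\Big[\tfrac{T_{ji}}{\pi_j(X_i)} I(Y_{ji} \le y) \,\Big|\, X_i\Big]\Big] && (\because \text{MAR}) \\
&= E\Big[\tfrac{T_{ji}}{\pi_j(X_i)} I(Y_{ji} \le y)\Big] && (\because \text{LIE}) \\
&= E[T_{ji} \, p_j(X_i) \, I(Y_{ji} \le y)].
\end{aligned}
$$
Here LIE denotes the law of iterated expectations.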

19 Set-up of Main Problem

We have derived:
$$F_j(y) = E[I(T_{ji} = 1) \, p_j(X_i) \, I(Y_{ji} \le y)].$$

Horvitz and Thompson's (1952) inverse probability weighting (IPW) estimator for $F_j$ is written as
$$\tilde{F}_j(y) = \frac{1}{N} \sum_{i=1}^N I(T_{ji} = 1) \, p_j(X_i) \, I(Y_{ji} \le y).$$

If $p_j(x)$ were known, then it would be straightforward to compute the IPW estimator.

$p_j(x)$ is unknown in reality.
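When the inverse propensity $p_j(X_i)$ is treated as known, the IPW estimator of the marginal distribution is a weighted empirical distribution function. A minimal sketch in Python follows; the function name `ipw_cdf` and its argument layout are assumptions made for illustration.

```python
import numpy as np

def ipw_cdf(y, t, p, grid):
    """IPW estimate (1/N) * sum_i T_i * p_i * I(Y_i <= y) at each grid point.

    y    : (N,) outcomes; entries with t == 0 are ignored (they may be NaN)
    t    : (N,) 0/1 observation indicators
    p    : (N,) inverse propensity scores 1 / pi(X_i)
    grid : (M,) evaluation points
    """
    t = np.asarray(t, dtype=float)
    p = np.asarray(p, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Missing outcomes are pushed to +inf so that they never satisfy Y_i <= y.
    y = np.where(t == 1, np.asarray(y, dtype=float), np.inf)
    indicator = y[:, None] <= grid[None, :]
    return ((t * p)[:, None] * indicator).sum(axis=0) / len(y)

# Tiny worked example: four individuals, the second one is missing.
est = ipw_cdf(y=[1.0, np.nan, 3.0, 4.0], t=[1, 0, 1, 1],
              p=[2.0, 5.0, 1.0, 1.0], grid=[2.5, 10.0])
```

In the worked example the first individual carries weight 2, so the estimate at 2.5 is 2/4 = 0.5, and the estimate at 10 is (2 + 1 + 1)/4 = 1.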

20 Set-up of Main Problem

Define a propensity score function:
$$\eta(x) = P(T_{1i} = 1, \dots, T_{di} = 1 \mid X_i = x).$$
Define $q(x) = 1/\eta(x)$.

Using MAR and LIE, Step 2 is rewritten as
$$\theta_0 = \arg\max_{\theta \in \Theta} E[I(T_{1i} = 1, \dots, T_{di} = 1) \, q(X_i) \log c(F_1(Y_{1i}), \dots, F_d(Y_{di}); \theta)].$$

The IPW estimator for $\theta_0$ is given by
$$\tilde{\theta} = \arg\max_{\theta \in \Theta} \frac{1}{N} \sum_{i=1}^N I(T_{1i} = 1, \dots, T_{di} = 1) \, q(X_i) \log c(\tilde{F}_1(Y_{1i}), \dots, \tilde{F}_d(Y_{di}); \theta).$$

$q(x)$ is unknown in reality.

21 Set-up of Main Problem

Direct estimation of propensity score functions is challenging, whether it is performed parametrically or nonparametrically.

In the treatment effect literature, Chan, Yam, and Zhang (2016) propose an alternative approach that bypasses direct estimation of the propensity score.

They construct calibration weights by balancing the moments of observed covariates among the treatment, control, and whole groups.

The present paper applies their method to a missing data problem for the first time.

22 Theory of Calibration Estimation

Under MAR, the moment matching condition holds:
$$E[I(T_{ji} = 1) \, p_{j,K_j}(X_i) \, u_{K_j}(X_i)] = E[u_{K_j}(X_i)], \quad j \in \{1, \dots, d\},$$
for any integrable function $u_{K_j} : \mathbb{R}^r \to \mathbb{R}^{K_j}$ called an approximation sieve.

A common choice is, say, $u_{K_j}(X_i) = [1, X_i, X_i^2, X_i^3]^\top$ $(r = 1, K_j = 4)$.

A sample counterpart is written as
$$\frac{1}{N} \sum_{i=1}^N I(T_{ji} = 1) \, p_{j,K_j}(X_i) \, u_{K_j}(X_i) = \frac{1}{N} \sum_{i=1}^N u_{K_j}(X_i).$$

Multiple values of $\{p_{j,K_j}(X_1), \dots, p_{j,K_j}(X_N)\}$ satisfy the moment matching condition. Among them, we choose the one closest to a uniform weight given some distance measure.

23 Theory of Calibration Estimation

Why do we want the uniform weight?
1. If there are no missing data, then the uniform weight leads to a natural estimator $\hat{F}_j(y) = (N + 1)^{-1} \sum_{i=1}^N I(Y_{ji} \le y)$.
2. It is well known that volatile weights cause instability in the Horvitz-Thompson IPW estimator.

Let $\rho : \mathbb{R} \to \mathbb{R}$ be any strictly concave function. Define a concave objective function:
$$G_{j,K_j}(\lambda) = \frac{1}{N} \sum_{i=1}^N \left[ I(T_{ji} = 1) \, \rho(\lambda^\top u_{K_j}(X_i)) - \lambda^\top u_{K_j}(X_i) \right], \quad \lambda \in \mathbb{R}^{K_j}.$$

Compute
$$\hat{\lambda}_{j,K_j} = \arg\max_{\lambda} G_{j,K_j}(\lambda).$$

24 Theory of Calibration Estimation

Compute calibration weights for marginal distributions:
$$\hat{p}_{j,K_j}(X_i) = \rho'(\hat{\lambda}_{j,K_j}^\top u_{K_j}(X_i)).$$

Estimate the marginal distribution of the $j$-th component by
$$\hat{F}_j(y) = \frac{1}{N} \sum_{i=1}^N I(T_{ji} = 1) \, \hat{p}_{j,K_j}(X_i) \, I(Y_{ji} \le y).$$
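The weight computation can be sketched numerically. Setting the gradient of the concave objective to zero gives $\frac{1}{N}\sum_i [I(T_{ji}=1)\rho'(\lambda^\top u_{K_j}(X_i)) - 1]\,u_{K_j}(X_i) = 0$, which is exactly the sample moment matching condition, so maximizing the objective delivers balancing weights. The sketch below assumes the exponential tilting choice $\rho(v) = -\exp(-v)$ and a damped Newton iteration; the solver, its tolerances, and the function name `calibration_weights` are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def calibration_weights(t, u, max_iter=100, tol=1e-9):
    """Exponential-tilting calibration weights p_i = exp(-lam'u_i).

    t : (N,) array of 0/1 observation indicators
    u : (N, K) array whose rows are the sieve values u_K(X_i)
    Returns (weights, lam).
    """
    t = np.asarray(t, dtype=float)
    u = np.asarray(u, dtype=float)
    n, k = u.shape
    u_bar = u.mean(axis=0)
    lam = np.zeros(k)

    def objective(l):
        v = u @ l
        return np.mean(-t * np.exp(-v) - v)   # G(lam) with rho(v) = -exp(-v)

    for _ in range(max_iter):
        w = t * np.exp(-(u @ lam))                    # T_i * rho'(lam'u_i)
        grad = (w[:, None] * u).mean(axis=0) - u_bar  # moment imbalance
        if np.max(np.abs(grad)) < tol:
            break
        hess = -(u * w[:, None]).T @ u / n            # negative definite Hessian
        step = np.linalg.solve(hess, grad)
        size, g0 = 1.0, objective(lam)
        # Halve the Newton step until the objective does not decrease.
        while size > 1e-10 and not (objective(lam - size * step) >= g0 - 1e-12):
            size *= 0.5
        lam = lam - size * step
    return np.exp(-(u @ lam)), lam
```

With a sieve such as `u = np.column_stack([np.ones_like(x), x, x ** 2])`, the returned weights make the weighted moments of the observed group match the whole-sample moments up to the solver tolerance.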

25 Theory of Calibration Estimation

Arbitrariness of $\rho$ arises from the arbitrariness of the distance measure.

Functional forms often used in the nonparametric literature include:
-- Exponential tilting: $\rho(v) = -\exp(-v)$.
-- Empirical likelihood: $\rho(v) = \log(1 + v)$.
-- Continuous updating GMM: $\rho(v) = -0.5(1 - v)^2$.
-- Inverse logistic: $\rho(v) = v - \exp(-v)$.
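Since the calibration weights are $\rho'$ evaluated at the fitted index, it may help to write out the derivative of each choice. The following is a worked calculation based on the standard forms of these four functions; each has a negative second derivative on its domain, so each is strictly concave.

```latex
\begin{aligned}
\text{Exponential tilting:} \quad & \rho(v) = -e^{-v}, & \rho'(v) &= e^{-v}, \\
\text{Empirical likelihood:} \quad & \rho(v) = \log(1 + v), & \rho'(v) &= \frac{1}{1 + v}, \\
\text{Continuous updating GMM:} \quad & \rho(v) = -\tfrac{1}{2}(1 - v)^2, & \rho'(v) &= 1 - v, \\
\text{Inverse logistic:} \quad & \rho(v) = v - e^{-v}, & \rho'(v) &= 1 + e^{-v}.
\end{aligned}
```

Exponential tilting always yields positive weights, whereas the continuous updating choice can produce negative ones. For the inverse logistic choice, $\rho'(v) = 1/\pi$ with $\pi = 1/(1 + e^{-v})$, that is, the reciprocal of a logistic propensity score.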

26 Theory of Calibration Estimation

Step 2 (likelihood maximization) can be handled analogously.

Under MAR, the moment matching condition holds:
$$E[I(T_{1i} = 1, \dots, T_{di} = 1) \, q(X_i) \, u_{K_\eta}(X_i)] = E[u_{K_\eta}(X_i)],$$
where
$$q(x) \equiv \frac{1}{\eta(x)} = \frac{1}{P(T_{1i} = 1, \dots, T_{di} = 1 \mid X_i = x)}.$$

Find calibration weights that satisfy the moment matching condition and are closest to the uniform weight given some distance measure.

27 Theory of Calibration Estimation

Define a concave objective function:
$$H_{K_\eta}(\beta) = \frac{1}{N} \sum_{i=1}^N I(T_{1i} = 1, \dots, T_{di} = 1) \, \rho(\beta^\top u_{K_\eta}(X_i)) - \frac{1}{N} \sum_{i=1}^N \beta^\top u_{K_\eta}(X_i).$$

Compute $\hat{\beta}_{K_\eta} = \arg\max_{\beta} H_{K_\eta}(\beta)$.

Compute a calibration weight for the likelihood:
$$\hat{q}_{K_\eta}(X_i) = \rho'(\hat{\beta}_{K_\eta}^\top u_{K_\eta}(X_i)).$$

Compute the maximum likelihood estimator $\hat{\theta}$ via
$$\max_{\theta \in \Theta} \frac{1}{N} \sum_{i=1}^N I(T_{1i} = 1, \dots, T_{di} = 1) \, \hat{q}_{K_\eta}(X_i) \log c_d(\hat{F}_1(Y_{1i}), \dots, \hat{F}_d(Y_{di}); \theta).$$
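The final maximization step can be sketched for the bivariate Clayton case with a scalar parameter. The code below assumes that the estimated margins and the weights (zero for incomplete cases are simply dropped beforehand) are already available as arrays. The golden-section search and the bounds `lo` and `hi` are illustrative choices made here, not the authors' optimizer.

```python
import numpy as np

def clayton_logpdf(u1, u2, alpha):
    """Log density of the bivariate Clayton copula, computed stably."""
    a1, a2 = -alpha * np.log(u1), -alpha * np.log(u2)
    m = np.maximum(a1, a2)
    # log(u1^-alpha + u2^-alpha - 1) without overflowing for tiny u
    log_s = m + np.log(np.exp(a1 - m) + np.exp(a2 - m) - np.exp(-m))
    return (np.log1p(alpha) - (alpha + 1.0) * (np.log(u1) + np.log(u2))
            - (1.0 / alpha + 2.0) * log_s)

def weighted_clayton_mle(u1, u2, w, lo=1e-3, hi=50.0, tol=1e-6):
    """Maximize sum_i w_i * log c_2(u1_i, u2_i; alpha) over alpha in [lo, hi].

    Golden-section search; assumes the objective is unimodal on [lo, hi].
    Pass only complete cases, with w holding their weights.
    """
    def loglik(alpha):
        return np.sum(w * clayton_logpdf(u1, u2, alpha))

    ratio = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    fc, fd = loglik(c), loglik(d)
    while b - a > tol:
        if fc > fd:                 # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = loglik(c)
        else:                       # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = loglik(d)
    return 0.5 * (a + b)
```

A derivative-free search keeps the sketch dependency-free; any scalar optimizer could be substituted for the golden-section loop.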

28 Theory of Calibration Estimation

Theorem (Consistency and Asymptotic Normality)
Impose a set of assumptions including what follows:
-- The missing mechanism is missing at random (MAR).
-- $\{Y_i, T_i, X_i\}$ are i.i.d. across individuals $i \in \{1, \dots, N\}$.
-- $\pi_1(\cdot), \dots, \pi_d(\cdot)$, and $\eta(\cdot)$ are sufficiently smooth.
-- $K_j(N) \to \infty$ as $N \to \infty$, and the rate of divergence is sufficiently slow. (The same assumption applies for $K_\eta(N)$.)

Then, consistency and asymptotic normality follow.
1. $\hat{\theta} \xrightarrow{p} \theta_0$ as $N \to \infty$.
2. $\sqrt{N}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, V)$ as $N \to \infty$.

29 Theory of Calibration Estimation

The asymptotic covariance matrix $V$ is expressed as
$$V = B^{-1} \Sigma B^{-1}.$$

We can construct consistent estimators for $B$ and $\Sigma$, and hence
$$\hat{V} = \hat{B}^{-1} \hat{\Sigma} \hat{B}^{-1} \xrightarrow{p} V.$$

See the main paper for a complete set of assumptions, proofs of the consistency and asymptotic normality, and the construction of $V$ and $\hat{V}$.

30 Data-Driven Selection of K

A practical question is how to select $\{K_1, \dots, K_d, K_\eta\}$, the dimensions of the approximation sieves. Focus on $K_j$ for concreteness.

We select an optimal $K_j^*$ from the viewpoint of covariate balancing.

A natural estimator of the joint distribution of $X$, based on the whole group of individuals, is
$$\hat{F}_X(x) = \frac{1}{N} \sum_{i=1}^N I(X_{1i} \le x_1, \dots, X_{ri} \le x_r).$$

An alternative estimator based on the observed group of individuals is
$$\hat{F}_{X,K_j}(x) = \frac{1}{N} \sum_{i=1}^N T_{ji} \, \hat{p}_{j,K_j}(X_i) \, I(X_{1i} \le x_1, \dots, X_{ri} \le x_r).$$

31 Data-Driven Selection of K

On one hand, we should select $K_j$ that makes $\hat{F}_X$ and $\hat{F}_{X,K_j}$ as close as possible.

On the other hand, we should impose a penalty against raising $K_j$ in order to keep parsimony.

Hence we propose to select $K_j \in \{1, 2, \dots, \bar{K}\}$ that minimizes a penalized $L_2$-distance:
$$d_{K_j}(\hat{F}_X, \hat{F}_{X,K_j}) = \frac{\frac{1}{N} \sum_{i=1}^N \left[ \hat{F}_X(X_i) - \hat{F}_{X,K_j}(X_i) \right]^2}{\left( 1 - K_j^2/N \right)^2}.$$

The proposed approach is analogous to the generalized cross-validation of Li and Racine (2007, Sec. 15.2).
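The selection rule can be sketched for a scalar covariate as follows. The sketch assumes exponential tilting weights computed by a damped Newton iteration, a polynomial sieve $[1, x, \dots, x^{K-1}]$, and a $1/N$ normalization of the observed-group distribution estimator; the function names `et_weights` and `select_k` are hypothetical.

```python
import numpy as np

def et_weights(t, u, max_iter=100, tol=1e-9):
    """Exponential-tilting calibration weights exp(-lam'u_i) via damped Newton."""
    n = len(t)
    lam, u_bar = np.zeros(u.shape[1]), u.mean(axis=0)
    obj = lambda l: np.mean(-t * np.exp(-(u @ l)) - u @ l)
    for _ in range(max_iter):
        w = t * np.exp(-(u @ lam))
        grad = (w[:, None] * u).mean(axis=0) - u_bar
        if np.max(np.abs(grad)) < tol:
            break
        step = np.linalg.solve(-(u * w[:, None]).T @ u / n, grad)
        size, g0 = 1.0, obj(lam)
        # Halve the step until the concave objective does not decrease.
        while size > 1e-10 and not (obj(lam - size * step) >= g0 - 1e-12):
            size *= 0.5
        lam = lam - size * step
    return np.exp(-(u @ lam))

def select_k(x, t, k_max=5):
    """Return (selected K, list of penalized distances) for a scalar covariate."""
    x, t = np.asarray(x, dtype=float), np.asarray(t, dtype=float)
    n = len(x)
    below = (x[None, :] <= x[:, None]).astype(float)   # below[i, l] = I(X_l <= X_i)
    f_full = below.mean(axis=1)                        # whole-group estimate at X_i
    dists = []
    for k in range(1, k_max + 1):
        u = np.vander(x, k, increasing=True)           # columns 1, x, ..., x^(k-1)
        p = et_weights(t, u)
        f_obs = (below * (t * p)[None, :]).mean(axis=1)  # observed-group estimate
        dists.append(np.mean((f_full - f_obs) ** 2) / (1.0 - k ** 2 / n) ** 2)
    return int(np.argmin(dists)) + 1, dists
```

The indicator matrix is N by N, so this direct implementation is only meant for moderate sample sizes such as those in the simulations.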

32 Monte Carlo Simulations: DGP

Target variables are $Y_i = [Y_{1i}, Y_{2i}]^\top$ $(d = 2)$. Consider a scalar covariate $X_i$.

Define $U_i = [U_{1i}, U_{2i}, U_{3i}]^\top = [F_1(Y_{1i}), F_2(Y_{2i}), F_X(X_i)]^\top$.

$F_1(\cdot)$ is the marginal distribution of $Y_{1i}$, and we use $N(0, 1)$.
$F_2(\cdot)$ is the marginal distribution of $Y_{2i}$, and we use $N(0, 1)$.
$F_X(\cdot)$ is the marginal distribution of $X_i$, and we use $N(0, 1)$.

The inverse distribution functions $F_1^{-1}(\cdot)$, $F_2^{-1}(\cdot)$, and $F_X^{-1}(\cdot)$ are known and tractable.

33 Monte Carlo Simulations: DGP

Step 1: Draw $U_i \overset{i.i.d.}{\sim} \text{Clayton}_3(\alpha_0)$ with $\alpha_0 = 6$ (Kendall's $\tau = 0.75$).

Step 2: Recover $Y_{1i} = F_1^{-1}(U_{1i})$, $Y_{2i} = F_2^{-1}(U_{2i})$, and $X_i = F_X^{-1}(U_{3i})$.

Step 3: Assume that $\{Y_{11}, \dots, Y_{1N}\}$ are all observed. Make some of $\{Y_{21}, \dots, Y_{2N}\}$ missing according to
$$P(T_{2i} = 1 \mid X_i = x_i) = \frac{1}{1 + \exp[a + b x_i]}.$$
We use $(a, b) = (-0.420, 0.400)$, which implies MAR with $E[T_{2i}] = 0.6$.

Step 4: Repeat Steps 1-3 $J = 1000$ times with sample size $N \in \{500, 1000\}$.
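One way to generate data from this design is sketched below. Two points are assumptions made for the sketch rather than details confirmed by the slides: the trivariate Clayton draw uses the Marshall-Olkin frailty construction (a gamma frailty with shape $1/\alpha$), and the missing mechanism is taken to be the logistic form $1/(1 + \exp[a + bx])$ with $(a, b) = (-0.420, 0.400)$, which gives an observation rate close to 0.6. The function name `simulate_dgp` is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def simulate_dgp(n, alpha=6.0, a=-0.420, b=0.400, seed=0):
    """Draw (y1, y2 with NaN where missing, x, t2) from the simulation design."""
    rng = np.random.default_rng(seed)
    # Step 1: U_i ~ Clayton_3(alpha) via the Marshall-Olkin frailty construction.
    v = rng.gamma(shape=1.0 / alpha, scale=1.0, size=(n, 1))
    e = rng.exponential(size=(n, 3))
    u = (1.0 + e / v) ** (-1.0 / alpha)
    u = np.clip(u, 1e-12, 1.0 - 1e-12)       # guard the normal quantile function
    # Step 2: standard normal margins for Y_1, Y_2, and X.
    z = np.vectorize(NormalDist().inv_cdf)(u)
    y1, y2, x = z[:, 0], z[:, 1], z[:, 2]
    # Step 3: Y_1 is fully observed; Y_2 is missing at random given X
    # (assumed logistic propensity score).
    pi2 = 1.0 / (1.0 + np.exp(a + b * x))
    t2 = rng.random(n) < pi2
    return y1, np.where(t2, y2, np.nan), x, t2

y1, y2, x, t2 = simulate_dgp(20_000)
```

Step 4 of the design would then call this function repeatedly with different seeds and the two sample sizes.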

34 Monte Carlo Simulations: Estimation

Approach #1: Listwise deletion.

Step 1: Estimate the marginal distribution of the $j$-th component by
$$\hat{F}_j(y) = \frac{1}{\tilde{N} + 1} \sum_{i=1}^N I(T_{1i} = 1, T_{2i} = 1) \, I(Y_{ji} < y),$$
where
$$\tilde{N} = \sum_{i=1}^N I(T_{1i} = 1, T_{2i} = 1)$$
is the number of individuals with complete data.

35 Monte Carlo Simulations: Estimation

Step 2: Compute the maximum likelihood estimator $\hat{\alpha}$ by
$$\max_{\alpha \in (0, \infty)} \frac{1}{N} \sum_{i=1}^N I(T_{1i} = 1, T_{2i} = 1) \log c_2(\hat{F}_1(Y_{1i}), \hat{F}_2(Y_{2i}); \alpha),$$
where
$$c_2(u_1, u_2; \alpha) = (1 + \alpha)(u_1 u_2)^{-\alpha - 1} (u_1^{-\alpha} + u_2^{-\alpha} - 1)^{-1/\alpha - 2}$$
is the probability density function of $\text{Clayton}_2(\alpha)$.

This likelihood function is correctly specified since any bivariate marginal distribution of $\text{Clayton}_3(\alpha)$ is $\text{Clayton}_2(\alpha)$.

The same property holds for any Archimedean copula (e.g. Clayton, Gumbel, Frank).

36 Monte Carlo Simulations: Estimation

Approach #2: Parametric estimation.

Consider a correctly specified model for the propensity score:
$$\pi_2(x; a, b) = \frac{1}{1 + \exp(a + bx)}.$$

We estimate $(a, b)$ via
$$\max_{(a, b)} \sum_{i=1}^N \left[ T_{2i} \log \pi_2(X_i; a, b) + (1 - T_{2i}) \log(1 - \pi_2(X_i; a, b)) \right].$$

37 Monte Carlo Simulations: Estimation

Compute
$$\hat{p}_2(X_i) = \hat{q}(X_i) = \frac{1}{\pi_2(X_i; \hat{a}, \hat{b})}.$$

Estimate marginal distributions by
$$\hat{F}_j(y) = \frac{1}{N} \sum_{i=1}^N I(T_{ji} = 1) \, \hat{p}_j(X_i) \, I(Y_{ji} < y).$$

Estimate the copula parameter $\alpha$ via
$$\max_{\alpha \in (0, \infty)} \frac{1}{N} \sum_{i=1}^N I(T_{1i} = 1, T_{2i} = 1) \, \hat{q}(X_i) \log c_2(\hat{F}_1(Y_{1i}), \hat{F}_2(Y_{2i}); \alpha).$$

38 Monte Carlo Simulations: Estimation

For comparison, consider a misspecified model:
$$\pi_2(x; b) = \frac{1}{1 + \exp(bx)}.$$

This model is misspecified since the true value of $a$ is nonzero.

The remaining procedure is the same.

39 Monte Carlo Simulations: Estimation

Approach #3: Nonparametric estimation of Hirano, Imbens, and Ridder (2003).

Define the approximation sieve: $u_{K_2}(X_i) = [1, X_i, \dots, X_i^{K_2 - 1}]^\top$.

Define
$$\pi_{2,K_2}(X_i; \lambda) = \frac{1}{1 + \exp[-\lambda^\top u_{K_2}(X_i)]}.$$

Estimate $\lambda$ via
$$\max_{\lambda} \sum_{i=1}^N \left[ T_{2i} \log \pi_{2,K_2}(X_i; \lambda) + (1 - T_{2i}) \log(1 - \pi_{2,K_2}(X_i; \lambda)) \right].$$

Compute $\hat{p}_{2,K_2}(X_i) = \hat{q}_{K_2}(X_i) = 1/\pi_{2,K_2}(X_i; \hat{\lambda})$.

We use fixed $K_2 \in \{3, 4\}$ and the data-driven selection of $K_2 \in \{1, \dots, 5\}$ based on the covariate balancing principle.

40 Monte Carlo Simulations: Estimation

The nonparametric approach with $K_2 = 2$ is essentially identical to the parametric approach with the correctly specified model. This is just a coincidence due to the fact that the true missing mechanism is characterized by a logistic function.

In theory, $K_2 \to \infty$ as $N \to \infty$ and the nonparametric approach leads to a consistent estimator for any missing mechanism.

Finite sample performance is another question.

41 Monte Carlo Simulations: Estimation

Approach #4: Calibration estimation.

Calibration weights are computed with the exponential tilting: $\rho(v) = -\exp(-v)$.

We use fixed $K_2 \in \{3, 4\}$ and the data-driven selection of $K_2 \in \{1, \dots, 5\}$ based on the covariate balancing principle.

The procedure is as explained.

42 Monte Carlo Simulations: Results

Truth: $\alpha_0 = 6$. Each cell lists Bias, Stdev, RMSE.

Method                     N = 500              N = 1000
Listwise Deletion          0.453, 0.594, ,      , 0.408, 
Param (Correct)            0.152, 0.454, ,      , 0.322, 
Param (Misspec)            1.273, 0.420, ,      , 0.330, 
Nonparam ($K_2 = 3$)       0.152, 0.434, ,      , 0.322, 
Nonparam ($K_2 = 4$)       0.723, 2.358, ,      , 2.470, 
Nonparam ($K_2^*$)         0.125, 0.452, ,      , 0.333, 
Calibration ($K_2 = 3$)    0.161, 0.450, ,      , 0.322, 
Calibration ($K_2 = 4$)    0.165, 0.456, ,      , 0.327, 
Calibration ($K_2^*$)      0.142, 0.472, ,      , 0.315, 

43 Monte Carlo Simulations: Results

Listwise deletion produces bias under MAR as expected.

The parametric approach produces enormous bias if a propensity score model is misspecified.

The nonparametric approach exhibits considerable instability across $K_2$.

The proposed calibration approach achieves strikingly sharp and stable performance.

The data-driven selection of $K_2$ performs well for both the nonparametric and calibration estimators.

44 Monte Carlo Simulations: Extension

We make a further comparison of the nonparametric estimator and the calibration estimator under a more challenging scenario.

Assume that there are $r = 2$ covariates: $X_i = [X_{1i}, X_{2i}]^\top$.

Assume that a true propensity score is P(T 2i = 1 X i = x i ) = exp[ x 1i + 0.2x 2i ].

This implies MAR with $E[T_{2i}] = 0.6$ as before.

45 Monte Carlo Simulations: Extension

Reconstruct the approximation sieve $u_{K_2}(X)$ as follows:
$$u_{10}(X) = (1, X_1, X_2, X_1^2, X_2^2, X_1 X_2, X_1^3, X_2^3, X_1^2 X_2, X_1 X_2^2)^\top.$$
For $K_2 \in \{1, \dots, 10\}$, let $u_{K_2}(X)$ be the first $K_2$ elements of $u_{10}(X)$.

For each estimator, we use the following fixed values:
-- $K_2 = 3$ (i.e. only the first moments of $X$).
-- $K_2 = 6$ (i.e. the second moments added).
-- $K_2 = 10$ (i.e. the third moments added).

We also use the data-driven selection of $K_2 \in \{1, \dots, 10\}$ based on the covariate balancing principle.

46 Monte Carlo Simulations: Extension

Each cell lists Bias, Stdev, RMSE relative to the true $\alpha_0$.

Method                      N = 500              N = 1000
Nonparam ($K_2 = 3$)        0.150, 0.461, ,      , 0.324, 
Nonparam ($K_2 = 6$)        0.160, 0.531, ,      , 0.476, 
Nonparam ($K_2 = 10$)       1.127, 4.028, ,      , 3.013, 
Nonparam ($K_2^*$)          0.106, 1.153, ,      , 1.265, 
Calibration ($K_2 = 3$)     0.253, 0.458, ,      , 0.330, 
Calibration ($K_2 = 6$)     0.147, 0.446, ,      , 0.331, 
Calibration ($K_2 = 10$)    0.143, 0.455, ,      , 0.331, 
Calibration ($K_2^*$)       0.200, 0.458, ,      , 0.346, 

47 Monte Carlo Simulations: Extension

The nonparametric approach exhibits considerable instability across $K_2$.

The nonparametric estimator assisted by the data-driven selection of $K_2$ has small bias but large variance.

The proposed calibration approach achieves sharp and stable performance.

The calibration estimator assisted by the data-driven selection of $K_2$ has small bias and small variance.

Hence the calibration estimator clearly outperforms the nonparametric estimator.

48 Conclusions

We investigate the estimation of semiparametric copula models under missing data for the first time in the literature.

There is an analogy between missing data and average treatment effects since both of them are binary in nature.

We apply Chan, Yam, and Zhang's (2016) calibration estimation to missing data for the first time in the literature.

The calibration estimator satisfies consistency and asymptotic normality under the i.i.d. assumption and the Missing at Random (MAR) condition.

Our simulation results indicate that the calibration estimator dominates listwise deletion, the parametric approach, and the nonparametric approach.

49 References

Chan, K. C. G., S. C. P. Yam, and Z. Zhang (2016). Globally Efficient Non-Parametric Inference of Average Treatment Effects by Empirical Balancing Calibration Weighting. Journal of the Royal Statistical Society, Series B, 78.

Hirano, K., G. W. Imbens, and G. Ridder (2003). Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score. Econometrica, 71.

Horvitz, D. G. and D. J. Thompson (1952). A Generalization of Sampling without Replacement from a Finite Universe. Journal of the American Statistical Association, 47.

Li, Q. and J. S. Racine (2007). Nonparametric Econometrics: Theory and Practice. Princeton University Press.

Rubin, D. B. (1976). Inference and Missing Data. Biometrika, 63.

Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut de Statistique de L'Université de Paris, 8.
