Calibration Estimation for Semiparametric Copula Models under Missing Data

Similar documents
Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Flexible Estimation of Treatment Effect Parameters

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Robustness to Parametric Assumptions in Missing Data Models

Covariate Balancing Propensity Score for General Treatment Regimes

A Goodness-of-fit Test for Copulas

Robustness of a semiparametric estimator of a copula

LIKELIHOOD RATIO INFERENCE FOR MISSING DATA MODELS

The Value of Knowing the Propensity Score for Estimating Average Treatment Effects

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Financial Econometrics and Volatility Models Copulas

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

What s New in Econometrics. Lecture 1

Combining multiple observational data sources to estimate causal eects

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

Using copulas to model time dependence in stochastic frontier models

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

ECON 3150/4150, Spring term Lecture 6

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Discussion of Identifiability and Estimation of Causal Effects in Randomized. Trials with Noncompliance and Completely Non-ignorable Missing Data

Estimation of the Conditional Variance in Paired Experiments

Recent Advances in the analysis of missing data with non-ignorable missingness

Causal Inference Lecture Notes: Selection Bias in Observational Studies

Data Integration for Big Data Analysis for finite population inference

A measure of radial asymmetry for bivariate copulas based on Sobolev norm

Imputation Algorithm Using Copulas

Copula-Based Random Effects Models

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Copulas. MOU Lili. December, 2014

Marginal Specifications and a Gaussian Copula Estimation

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

A weighted simulation-based estimator for incomplete longitudinal data models

Lecture 6: Discrete Choice: Qualitative Response

The Instability of Correlations: Measurement and the Implications for Market Risk

AN INSTRUMENTAL VARIABLE APPROACH FOR IDENTIFICATION AND ESTIMATION WITH NONIGNORABLE NONRESPONSE

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

MISSING or INCOMPLETE DATA

Research Article Optimal Portfolio Estimation for Dependent Financial Returns with Generalized Empirical Likelihood

Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model

arxiv: v2 [stat.me] 17 Jan 2017

Estimation, Inference, and Hypothesis Testing

arxiv: v1 [stat.me] 15 May 2011

Propensity score weighting for causal inference with multi-stage clustered data

Identification and Estimation of Partially Linear Censored Regression Models with Unknown Heteroscedasticity

Quaderni di Dipartimento. Small Sample Properties of Copula-GARCH Modelling: A Monte Carlo Study. Carluccio Bianchi (Università di Pavia)

Dependence Patterns across Financial Markets: a Mixed Copula Approach

High Dimensional Propensity Score Estimation via Covariate Balancing

Generated Covariates in Nonparametric Estimation: A Short Review.

A Measure of Robustness to Misspecification

Markov Switching Regular Vine Copulas

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Multivariate Non-Normally Distributed Random Variables

Chapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70

Gaussian Process Vine Copulas for Multivariate Dependence

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

A Sampling of IMPACT Research:

Harvard University. Harvard University Biostatistics Working Paper Series

Measuring Social Influence Without Bias

Topics and Papers for Spring 14 RIT

A Monte Carlo Comparison of Various Semiparametric Type-3 Tobit Estimators

Probability Distributions and Estimation of Ali-Mikhail-Haq Copula

How to select a good vine

Econometrics of Panel Data

Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications

Semi-parametric predictive inference for bivariate data using copulas

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

On the Choice of Parametric Families of Copulas

Covariate selection and propensity score specification in causal inference

Statistical Methods for Handling Missing Data

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix

MISSING or INCOMPLETE DATA

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Fair Inference Through Semiparametric-Efficient Estimation Over Constraint-Specific Paths

By Bhattacharjee, Das. Published: 26 April 2018

A better way to bootstrap pairs

PARSIMONIOUS MULTIVARIATE COPULA MODEL FOR DENSITY ESTIMATION. Alireza Bayestehtashk and Izhak Shafran

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Analysis of Matched Case Control Data in Presence of Nonignorable Missing Exposure

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

The propensity score with continuous treatments

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

General Doubly Robust Identification and Estimation

arxiv: v2 [stat.me] 7 Oct 2017

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

COMPARISON OF GMM WITH SECOND-ORDER LEAST SQUARES ESTIMATION IN NONLINEAR MODELS. Abstract

Imbens/Wooldridge, Lecture Notes 1, Summer 07 1

Empirical likelihood methods in missing response problems and causal interference

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

A New Generalized Gumbel Copula for Multivariate Distributions

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models

A note on L convergence of Neumann series approximation in missing data problems

Transcription:

Calibration Estimation for Semiparametric Copula Models under Missing Data Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Economics and Economic Growth Centre Seminar Series Nanyang Technological University January 24, 2018 Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 1 / 49

Introduction Copula Model -- Flexible modelling of multivariate distributions -- Popular tool for business and finance Missing Data We bridge them for the first time in the literature. Treatment Effect -- Common problem in all fields -- Deep literature in statistics -- Binary phenomenon (to observe or not to observe) -- Estimate unobserved outcome -- Central topic in econometrics Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 2 / 49

Introduction How to fit copula models when there are missing data? A naïve approach is listwise deletion (LD): 1 Keep individuals with all d components observed and discard all other individuals. 2 Treat the individuals with complete data in an equal way. i= 1 i= 2 i= 3 i= 4 i= N Y 11 Y 12 Y 13 Y 14 Y 1N Y 21 Y 22 Y 23 Y 24 Y 2N Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 3 / 49

Introduction LD leads to a consistent estimator for the copula parameter of interest if the missing mechanism is missing completely at random (MCAR). LD leads to an inconsistent estimator if the missing mechanism is missing at random (MAR). Under MAR, target variables Y i = [Y 1i,..., Y di ] T and their missing status are independent of each other given observed covariates X i = [X 1i,..., X mi ] T. LD treats individuals with complete data all equally, and it does not use the information of X i. That causes a large bias in the estimator under MAR. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 4 / 49

Introduction How to obtain a consistent estimator for the copula parameter when the missing mechanism is MAR? As is well known in the literature of missing data and average treatment effects, a key step is the estimation of propensity score function (i.e. conditional probability of observing data given covariates). Direct estimation of propensity score is notoriously challenging, whether it is done parametrically or nonparametrically. Parametric approaches are haunted by misspecification problems, while nonparametric approaches are haunted by the curse of dimensionality. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 5 / 49

Introduction Chan, Yam, and Zhang (2016) propose an alternative approach called calibration estimation in the literature of average treatment effects. The calibration estimator is derived by balancing covariates among treatment, control, and whole groups. It does not require a direct estimation of propensity score. We apply the calibration estimation to a missing data problem for the first time in the literature. Our calibration estimator for the copula parameter satisfies consistency and asymptotic normality under some assumptions including i.i.d. data and the MAR condition. We also derive a consistent estimator for the asymptotic covariance matrix. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 6 / 49

Introduction We perform Monte Carlo simulations. Our simulation design itself is a contribution to the literature, since we clear two conditions at the same time in a clever way: 1 The unconditional distribution of {Y i } should be characterized by a known and tractable copula. 2 The missing mechanism should be MAR. Our simulation results indicate that the calibration estimator is indeed consistent and sufficiently close to the true value when sample size is N 500. Other estimators that do not use X i suffer from large bias regardless of sample size N. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 7 / 49

Table of Contents 1) Introduction 2) Review of Copula Models and Missing Data 3) Set-up of Main Problem 4) Theory of Calibration Estimation 5) Monte Carlo Simulations 6) Conclusions Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 8 / 49

Review of Copula Models Suppose that there are N individuals and d components: Y i = [Y 1i, Y 2i,..., Y di ] T (i = 1,..., N). Suppose that we want to estimate the d-dimensional joint distribution of Y i. What can we do? There are potential problems about estimating the joint distribution directly. d = small d = large Parametric Misspecification Misspecification Parameter proliferation Nonparametric (Curse of dimensionality) Curse of dimensionality Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 9 / 49

Review of Copula Models Copula models accomplish flexible specification with a small number of parameters. Copula models follow a two-step procedure. Step 1: Model the marginal distribution of each component separately. Step 2: Combine the d marginal distributions to recover a joint distribution. Step 1 Parametric Nonparametric Nonparametric Step 2 Parametric Parametric Nonparametric Name of model Parametric copula model Semiparametric copula model (Target of our paper) Nonparametric copula model Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 10 / 49

Review of Copula Models The copula approach is justified by Sklar s (1959) theorem. Sklar s theorem ensures the existence of a unique copula function C : (0, 1) d (0, 1) that recovers a true joint distribution. Theorem (Sklar, 1959) Let {Y i } be i.i.d. random vectors with a joint distribution F : R d (0, 1). Assume that the marginal distribution of Y ji, written as F j : R (0, 1), is continuous for j {1,..., d}. Then, there exists a unique function C : (0, 1) d (0, 1) such that F (y 1,..., y d ) = C (F 1 (y 1 ),..., F d (y d )), or in terms of probability density functions, f(y 1,..., y d ) = c (f 1 (y 1 ),..., f d (y d )). Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 11 / 49

Review of Copula Models A well-known example of copula function: bivariate Clayton copula with parameter α > 0. Cumulative distribution function is C 2 (u 1, u 2 ; α) = (u α 1 + u α 2 1) 1 α, where u 1 = F 1 (y 1 ) (0, 1) and u 2 = F 2 (y 2 ) (0, 1) are marginal distribution functions of Y 1i and Y 2i, respectively. Probability density function is c 2 (u 1, u 2 ; α) = (1 + α)(u 1 u 2 ) α 1 (u α 1 + u α 2 1) 1 α 2. Kendall s rank correlation coefficient is τ = α/(α + 2). Larger α implies stronger association between Y 1i and Y 2i. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 12 / 49

Review of Copula Models 6 20 4 2 10 0 1 0.5 u2 0 0 0.5 u1 1 0 1 0.5 u2 0 0 0.5 u1 1 PDF (α = 1.333 or τ =.4) PDF (α = 4.667 or τ =.7) Association is stronger in the lower tail than in the upper tail. Such an asymmetry matches many economic and financial phenomena (e.g. stock market contagion). Larger α implies stronger association between Y 1i and Y 2i. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 13 / 49

Review of Missing Data Suppose that {Y 1i,..., Y di } N i=1 are target variables. Define the missing indicator: { 1 if Y ji is observed, T ji = 0 if Y ji is missing. Suppose that {X 1i,..., X mi } N i=1 are observable covariates. There are three well-known layers of missing mechanism. 1 Missing Completely at Random (MCAR). 2 Missing at Random (MAR). 3 Missing Not at Random (MNAR). Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 14 / 49

Review of Missing Data Each concept is defined as follows. 1 Missing Completely at Random (MCAR): {T 1i,..., T di } {Y 1i,..., Y di }. 2 Missing at Random (MAR): {T 1i,..., T di } {Y 1i,..., Y di } {X 1i,..., X mi }. 3 Missing Not at Random (MNAR): {T 1i,..., T di } {Y 1i,..., Y di } {X 1i,..., X mi }. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 15 / 49

Review of Missing Data An illustrative example on health survey (d = m = 1): Y 1i = weight of individual i; X 1i = I(Individual i is female). MCAR requires that P[Weight of individual i is missing] should be independent of both weight and gender of individual i. MAR requires that: P[Weight of a man is missing] should be independent of his weight. P[Weight of a woman is missing] should be independent of her weight. MNAR allows for the following situations: P[Weight of a man is missing] depends on his weight. P[Weight of a woman is missing] depends on her weight. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 16 / 49

Review of Missing Data Probably too restrictive, since men and women may have different willingness to report their weights. MCAR MNAR MAR More plausible than MCAR, since MAR controls for gender. MAR may be still restrictive, since men (or women) with different weights may have different willingness to report their weights. Most general assumption, but hard to handle technically. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 17 / 49

Review of Missing Data It is well known that listwise deletion (LD) leads to consistent inference under MCAR. It is also well known that LD leads to inconsistent inference under MAR. Correct inference under MAR has been extensively studied since the semnal work of Rubin (1976). The present paper assumes MAR and elaborates the estimation of semiparametric copula models, which has not been done in the literature. MNAR is a relatively new and challenging topic. See Ai, Linton, and Zhang (2018) and references therein for recent contributions. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 18 / 49

Set-up of Main Problem Define propensity score functions: π j (x) = P(T ji = 1 X i = x), j {1,..., d}, η(x) = P(T 1i = 1,..., T di = 1 X i = x). The true copula parameter θ 0 is chracterized as follows. θ 0 = arg max θ Θ E [log c(f 1(Y 1i ),..., F d (Y di ); θ)]. Use the MAR condition and the law of iterated expectations: θ 0 = arg max θ Θ E [I(T 1i = 1,..., T di = 1)q(X i ) log c(f 1 (Y 1i ),..., F d (Y di ); θ)], where q(x i ) η(x i ) 1. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 19 / 49

Set-up of Main Problem A natural procedure of estimating θ 0 is as follows. Step 1: Estimate the marginal distributions {F 1,..., F d }. Step 2: Estimate q(x i ) (i.e. the inverse of propensity score) and then compute 1 ˆθ = arg max θ Θ N N I(T 1i = 1,..., T di = 1)ˆq(X i ) log c( ˆF 1 (Y 1i ),..., ˆF d (Y di ); θ). i=1 Calibration estimation shall be used for both { ˆF 1,..., ˆF d } and ˆq(X i ). Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 20 / 49

Set-up of Main Problem Use the MAR condition and LIE to get F j (y) = P(Y ji y) = E[I(Y ji y)] = E[I(T ji = 1)p j (X i )I(Y ji y)], where p j (x) π j (x) 1 = [P(T ji = 1 X i = x)] 1. Horvitz and Thompson s (1952) inverse probability weighting (IPW) estimator for F j is written as F j (y) = 1 N N I(T ji = 1)p j (X i )I(Y ji y). i=1 If p j (x) were known, then the IPW estimator could be computed straightforwardly. p j (x) is unknown in reality. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 21 / 49

Set-up of Main Problem Estimation of propensity score has been a major issue in the literature of missing data and treatment effects. Many papers attempt a direct estimation of p j (x), either parametrically or nonparametrically. Parametric approaches: Zhao and Lipsitz (1992), Robins, Rotnitzky, and Zhao (1994), and Bang and Robins (2005). The parametric approaches are notoriouly sensitive to misspecification (cf. Lawless, Kalbfleisch, and Wild, 1999). Nonparametric approaches: Hahn (1998), Hirano, Imbens, and Ridder (2003), Imbens, Newey, and Ridder (2005), and Chen, Hong, and Tarozzi (2008). The nonparametric approaches have a notoriously poor performance in small sample (the curse of dimensionality). Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 22 / 49

Theory of Calibration Estimation In the treatment effect literature, Chan, Yam, and Zhang (2016) propose an alternative approach that bypasses a direct estimation of propensity score. They construct calibration weights by balancing the moments of observed covariates among treatment, control, and whole groups. The present paper applies their method to a missing data problem for the first time. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 23 / 49

Theory of Calibration Estimation Under MAR, the moment matching condition holds: E [I(T ji = 1)p j (X i )u K (X i )] = E[u K (X i )], j {1,... d} for any integrable function u K : R m R K. A sample counterpart with a generic weight is written as N I(T ji = 1)p ji u K (X i ) = 1 N i=1 N u K (X i ). i=1 There are multiple values of {p j1,..., p jn } that satisfy the moment matching condition. Among them, we choose the one closest to a uniform weight given some distance measure. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 24 / 49

Theory of Calibration Estimation Why do we want the uniform weight? 1 If there are no missing data, then the uniform weight leads to a natural estimator ˆF j (y) = (N + 1) 1 N i=1 I(Y ji y). 2 It is well known that volatile weights cause instability in the Horvitz-Thompson estimator. Primal problem is a constrained optimization problem: minimize the distance s.t. the moment matching condition. Dual problem is written as an unconstrained optimization problem. To express the dual problem, let ρ : R R be any strictly concave function. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 25 / 49

Theory of Calibration Estimation Define a concave objective function: G jk (λ) = 1 N Compute Compute N [ I(Tji = 1)ρ(λ T u K (X i )) λ T u K (X i ) ], λ R K. i=1 ˆλ jk = arg max λ G jk(λ). ˆp jk (X i ) = 1 N ρ (ˆλ T jku K (X i )). Estimate the marginal distribution of the j-th component by N ˆF j (y) = I(T ji = 1)ˆp jk (X i )I(Y ji y). i=1 Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 26 / 49

Theory of Calibration Estimation A common choice of u K (X i ) is a vector of polynomials, say u K (X i ) = [1, X i, X 2 i, X 3 i ] T (m = 1, K = 4). Arbitrariness of ρ arises from the arbitrariness of the distance measure. Functional forms often used in the nonparametric literature include: Exponential Tilting: ρ(v) = exp( v). Empirical Likelihood: ρ(v) = log(1 + v). Quadratic: ρ(v) = 0.5(1 v) 2. Inverse Logistic: ρ(v) = v exp( v) Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 27 / 49

Theory of Calibration Estimation The second step of likelihood maximization can be handled analogously. Under MAR, the moment matching condition holds: where E [I(T 1i = 1,..., T di = 1)q(X i )u K (X i )] = E[u K (X i )], q(x i ) = η(x i ) 1 = [P(T 1i = 1,..., T di = 1 X i = x)] 1. Find calibration weights that satisfy the moment matching condition and are closest to the uniform weight given some distance measure. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 28 / 49

Theory of Calibration Estimation Define a concave objective function: H K (β) = 1 N Compute N I(T 1i = 1,..., T di = 1)ρ ( β T u K (X i ) ) 1 N i=1 ˆβ K = arg max β H K (β). Compute the calibration weight for the copula function: ˆq K (X i ) = 1 N ρ ( ˆβ T Ku K (X i )). Compute the maximum likelihood estimator ˆθ by N β T u K (X i ). i=1 max θ Θ N i=1 ( I(T 1i = 1,..., T di = 1)ˆq K (X i ) log c d ˆF1 (Y 1i ),..., ˆF ) d (Y di ); θ. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 29 / 49

Theory of Calibration Estimation Theorem (Consistency and Asymptotic Normality) Impose a set of assumptions including what follows: The missing mechanism is missing at random (MAR). {Y i, T i, X i } are i.i.d. across individuals i {1,..., N}. π 1 ( ),..., π d ( ), and η( ) are s-times continuously differentiable with sufficiently large s. K(N) as N, and the rate of divergence is sufficiently slow. Then, consistency and asymptotic normality follow. p 1 ˆθ θ0 as N. 2 N( ˆθ θ0 ) d N(0, V ) as N. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 30 / 49

Theory of Calibration Estimation The asymptotic covariance matrix V is expressed as V = B 1 ΣB 1. We can construct consistent estimators for B and Σ, and hence ˆV = ˆB 1 ˆΣ ˆB 1 p V. See the main paper for a complete set of assumptions, proofs of the consistency and asymptotic normality, and the construction of V and ˆV. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 31 / 49

Monte Carlo Simulations: DGP Target variables are Y i = [Y 1i, Y 2i ] T (d = 2). Consider a scalar covariate X i. Define U i = [U 1i, U 2i, U 3i ] T = [F 1 (Y 1i ), F 2 (Y 2i ), F X (X i )] T. F 1 ( ) is the marginal distribution of Y 1i, and we use χ 2 10. F 2 ( ) is the marginal distribution of Y 2i, and we use χ 2 10. F X ( ) is the marginal distribution of X i, and we use N(0, 1). The inverse distribution functions F 1 1 ( ), F 1 2 ( ), and F 1 X ( ) are known and tractable. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 32 / 49

Monte Carlo Simulations: DGP Step 1: Draw U i i.i.d. Clayton 3 (α 0 ) with α 0 = 4.667. Step 2: Recover Y 1i = F1 1 (U 1i ), Y 2i = F2 1 (U 2i ), and X i = F 1 X (U 3i). Step 3: Assume that {Y 11,..., Y 1N } are all observed. Make some of {Y 21,..., Y 2N } missing according to P (T 2i = 1 X i = x i ) = (1 + exp[a + bx i ]) 1, where (a, b) are to be chosen carefully below. Step 4: Repeat Steps 1-3 J = 1000 times with sample size N {250, 500, 1000}. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 33 / 49

Monte Carlo Simulations: DGP 0.75 0.7 0.5 0.5 Consider the unconditional probability of observing Y 2i : µ E[T 2i ] = E[E[T 2i X i = x i ]] = E[P(T 1i = 1 X i = x i )] = E [ (1 + exp[a + bx i ]) 1]. A larger µ implies less missing data on average. We can draw a contour plot of µ given (a, b). 4 3 0.65 0.6 0.55 b 2 0.85 0.8 1 0.75 0.9 0.8 0.7 0.65 0.6 0.55 0.85 0-3 -2-1 0 a Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 34 / 49

Monte Carlo Simulations: DGP Case A: (a, b) = ( 1.385, 0.000) = MCAR with µ = 0.8. Case B: (a, b) = ( 1.430, 0.400) = MAR with µ = 0.8. Case C: (a, b) = ( 0.405, 0.000) = MCAR with µ = 0.6. Case D: (a, b) = ( 0.420, 0.400) = MAR with µ = 0.6. Case D is the most challenging case since the missing mechanism is MAR and there are relatively many missing data. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 35 / 49

Monte Carlo Simulations: Estimation Approach #1: Semiparametric estimation with listwise deletion (LD). Step 1: Estimate the marginal distribution of the j-th component by ˆF j (y) = 1 N + 1 N I(T 1i = 1, T 2i = 1)I(Y ji < y), i=1 where N = N i=1 I(T 1i = 1, T 2i = 1) is the number of individuals with complete data. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 36 / 49

Monte Carlo Simulations: Estimation Step 2: Compute the maximum likelihood estimator ˆα by max α (0, ) N i=1 ( I(T 1i = 1, T 2i = 1) log c 2 ˆF1 (Y 1i ), ˆF ) 2 (Y 2i ); α, where c 2 (u 1, u 2 ; α) = (1 + α)(u 1 u 2 ) α 1 (u α 1 + u α 2 1) 1 α 2 is the probability density function of Clayton 2 (α). Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 37 / 49

Monte Carlo Simulations: Estimation Approach #2: Semiparametric estimation (SP). Step 1: Estimate the marginal distribution of the j-th component by ˆF j (y) = 1 N j + 1 N I(T ji = 1)I(Y ji < y), i=1 where N j = N i=1 I(T ji = 1) is the number of individuals with the j-th component observed. Step 2: Same as the LD approach. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 38 / 49

Monte Carlo Simulations: Estimation Approach #3: Calibration estimation (CE). Step 1: Estimate the marginal distribution of the j-th component by ˆF j (y) = N I(T ji = 1)ˆp jk (X i )I(Y ji < y), i=1 where u K (X i ) = [1, X i, X 2 i, X 3 i ] (i.e. K = 4). Step 2: Compute the maximum likelihood estimator ˆα by max α (0, ) N i=1 ( I(T 1i = 1, T 2i = 1)ˆq K (X i ) log c 2 ˆF1 (Y 1i ), ˆF ) 2 (Y 2i ); α. Calibration weights are computed with Exponential Tilting: ρ(v) = exp( v). Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 39 / 49

Monte Carlo Simulations: Estimation Summary of Three Approaches Use all Y? Use X? Step 1 Step 2 #1. LD No No Nonparametric Parametric #2. SP Yes No Nonparametric Parametric #3. CE Yes Yes Nonparametric Parametric Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 40 / 49

Monte Carlo Simulations: Estimation Our simulation design itself is a contribution to the literature. We draw U i = [F 1 (Y 1i ), F 2 (Y 2i ), F X (X i )] T i.i.d. Clayton 3 (α 0 ), which clears two conditions simultaneously: 1 Y i and X i should be associated with each other since we are interested in MAR. 2 The unconditional bivariate distribution of Y i should be characterized by a tractable copula with a known parameter (Clayton 2 (α 0 ) here). Gumbel copula and Frank copula have the same property as Clayton in terms of marginal distributions. To our knowledge, the present simulation design is the only way of clearing the two conditions. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 41 / 49

Monte Carlo Simulations: Results Bias of #1. Listwise Deletion Case A Case B Case C Case D MCAR MAR MCAR MAR µ = 0.8 µ = 0.8 µ = 0.6 µ = 0.6 N = 250-0.026 0.144-0.009 0.377 N = 500-0.014 0.159-0.042 0.361 N = 1000-0.011 0.160-0.013 0.384 LD estimator is inconsistent under MAR. Bias under MAR gets larger as missing probability gets larger. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 42 / 49

Monte Carlo Simulations: Results Bias of #2. Semiparametric Estimation Case A Case B Case C Case D MCAR MAR MCAR MAR µ = 0.8 µ = 0.8 µ = 0.6 µ = 0.6 N = 250-0.161-0.180-0.269-0.648 N = 500-0.103-0.129-0.183-0.568 N = 1000-0.047-0.116-0.106-0.521 SP estimator is inconsistent under MAR. Bias under MAR gets larger as missing probability gets larger. Removing bias requires exploiting the information of X i. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 43 / 49

Monte Carlo Simulations: Results Bias of #3. Calibration Estimation Case A Case B Case C Case D MCAR MAR MCAR MAR µ = 0.8 µ = 0.8 µ = 0.6 µ = 0.6 N = 250-0.099-0.054-0.148-0.114 N = 500-0.071-0.075-0.114-0.065 N = 1000-0.036-0.029-0.063-0.058 Calibration estimator is consistent under MAR. Finite sample bias is sufficiently small when N 500. The consistency stems from the proper usage of X i. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 44 / 49

Conclusions We investigate the estimation of semiparametric copula models under missing data for the first time in the literature. There is analogy between missing data and average treatment effects since observing or not observing data is a binary phenomenon. Chan, Yam, and Zhang (2016) propose the calibration estimation for average treatment effects. We apply the calibration estimation to missing data for the first time in the literature. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 45 / 49

Conclusions Our calibration estimator for the copula parameter satisfies consistency and asymptotic normality under some assumptions including i.i.d. data and the missing at random (MAR) mechanism. We also derive a consistent estimator for the asymptotic covariance matrix. We design Monte Carlo simulations in a clever way that MAR holds and the true value of parameter is well defined. The simulation results indicate that the calibration estimator is indeed consistent, and it is sufficiently close to the true value when sample size is N 500. Existing estimators which do not exploit the information of covariates suffer from substantial bias for any sample size. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 46 / 49

References Ai, C., O. Linton, and Z. Zhang (2018). A Simple and Efficient Estimation Method for Models with Nonignorable Missing Data. Working paper at the University of Florida. Bang, H. and J. M. Robins (2005). Doubly Robust Estimation in Missing Data and Causal Inference Models. Biometrics, 61, 962-973. Chan, K. C. G., S. C. P. Yam, and Z. Zhang (2016). Globally Efficient Non-Parametric Inference of Average Treatment Effects by Empirical Balancing Calibration Weighting. Journal of the Royal Statistical Society, Series B, 78, 673-700. Chen, X., H. Hong, and A. Tarozzi (2008). Semiparametric Efficiency in GMM Models with Auxiliary Data. The Annals of Statistics, 36, 808-843. Hahn, J. (1998). On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects. Econometrica, 66, 315-331. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 47 / 49

References Hirano, K., G. W. Imbens, and G. Ridder (2003). Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score. Econometrica, 71, 1161-1189. Horvitz, D. G. and D. J. Thompson (1952). A Generalization of Sampling without Replacement from a Finite Universe. Journal of the American Statistical Association, 47, 663-685. Imbens, G. W., W. K. Newey, and G. Ridder (2005). Mean-square-error Calculations for Average Treatment Effects. IEPR Working Paper No. 05.34. Lawless, J. F., J. D. Kalbfleisch, and C. J. Wild (1999). Semiparametric Methods for Response-Selective and Missing Data Problems in Regression. Journal of the Royal Statistical Society, Series B, 61, 413-438. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 48 / 49

References Robins, J. M., A. Rotnitzky, and L. P. Zhao (1994). Estimation of Regression Coefficients When Some Regressors are not Always Observed. Journal of the American Statistical Association, 89, 846-866. Rubin, D. B. (1976). Inference and Missing Data. Biometrika, 63, 581-592. Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publications de l Institut de Statistique de L Université de Paris, 8, 229-231. Zhao, L. P. and S. Lipsitz (1992). Designs and Analysis of Two-Stage Studies. Statistics in Medicine, 11, 769-782. Hamori, Motegi & Zhang (Kobe & RUC) Copula Models under Missing Data January 24, 2018 49 / 49