Marginal Specifications and a Gaussian Copula Estimation


Kazim Azam

Abstract

Multivariate analysis involving random variables of different types, such as count, continuous, or a mixture of both, is frequently required in economics. A copula-based methodology can be adopted for such data, where the association among the random variables is independent of their specific marginal distributions. Estimation then proceeds according to the chosen marginal specifications. A semi-parametric copula estimation, in which the marginals are specified empirically, performs very well for continuous data, but its appropriateness for discrete data has been questioned (see Genest et al. (1995)). Hoff (2007) proposes a methodology in which the marginal distributions are left completely unspecified and the copula parameters are estimated from the order statistics of the observed data. We conduct an analysis to determine the effect of various marginal specifications on the estimates of a Gaussian copula. Employing a Bayesian framework, we find that treating the marginal distributions as unknown outperforms empirically distributed and misspecified margins in terms of bias and mean squared error in small samples.

JEL Classification: C, C, C5.

1 Introduction

Copula-based methods for multivariate analysis are gaining popularity within economics. They provide a framework which is general across different types of data, unlike other joint non-linear models of non-normal data, where problems are dealt with case by case. The copula framework applies equally to finance and to micro data analysis. Embrechts et al. (1999) show how, in a VaR analysis, the assumption of multivariate normality fails to capture joint observations in the tails, and hence apply copula methods. Cherubini et al. () provide further financial applications. Munkin and Trivedi (1999), using discrete micro data, show how joint modelling is generally troublesome, and the problem increases when the

marginal distributions belong to different parametric families. Cameron et al. () analyse a selection model with discrete outcomes in a copula framework; among others, see Cameron et al. (1998) and Chib and Winkelmann () for copula analyses with discretely varying marginal distributions. Hoff (2007) applies a multivariate Gaussian copula to estimate the correlation between an individual's income, degree, number of children, etc. The separation of the joint distribution through a copula allows the marginals to belong to different parametric families, or even to be non-parametric, in which case we have a semi-parametric copula estimation problem. Genest et al. (1995) show, for continuous margins, that maximising the log pseudo-likelihood based on the normalized ranks yields consistent and asymptotically normal estimates of the copula parameters. Such a semi-parametric specification is attractive because, unlike a parametric distribution, it requires no marginal parameters to be estimated. For financial applications this is welcome, as marginals exhibiting high kurtosis and skewness are troublesome for parametric distributions to capture. When the data vary discretely, however, we face difficulties regardless of whether a parametric or non-parametric marginal is employed. Trivedi and Zimmer (2006) note that for discrete margins the copula maximization often runs into computational problems, such as failures of algorithm convergence; they propose applying a continuation transformation to the discrete variables and then basing likelihood estimation on continuous copula families. Genest and Nešlehová (2007) show that with rank-based estimators the ties observed in the ranks must be dealt with first, and that for low-count data the step size of the empirical distribution is large; such problems make the estimator quite biased. Pitt et al.
(2006) propose a Bayesian sampling scheme for continuous and discrete margins in a fully parametric Gaussian copula, which deals with some of the issues regarding discrete margins. Alternatively, for discrete or mixed continuous-discrete data, Hoff (2007) proposes a method in which the marginals are left unspecified. Copula estimation is then based on the order statistics of the observed data, using Bayesian techniques. Inference on the copula parameters is based on a summary statistic which is not a function of the nuisance marginal parameters.

We aim to analyse the performance of the method proposed by Hoff, as compared to employing an empirical, or misspecified (continuation-transformed), distribution for the marginals. We compute the bias and Mean Squared Error (MSE) of these estimators in a Gaussian copula framework with a mixture of continuous and discrete margins. We merge the Bayesian frameworks of Hoff (2007) and Pitt et al. (2006) to estimate the copula parameters. The sampling scheme first draws the unknown quantities related to the marginal distributions conditional upon the copula parameters, and then samples the copula parameters conditional upon the marginals. We find that leaving the marginals unspecified, as in Hoff's method, produces less bias than assuming empirically distributed or misspecified margins. The difference is larger in small samples and for correlations between continuous and discrete margins. The bias approaches zero in large samples, except when misspecified margins are used. The mean squared error exhibits similar patterns: the estimator based on Hoff's method has the smallest value of the three, and matches that of the empirically distributed margins in large samples.

In Section 2 we set up the copula and detail the various marginal specifications which can be employed. Section 3 sets out the Bayesian sampling scheme. The Data Generating Process (DGP) is explained in Section 4, Section 5 details the marginal specifications used to estimate the copula, and Section 6 describes the simulation over the DGP and the quantities we compute. Finally, we discuss the results of the simulation before concluding.
2 Gaussian Copula Setup

The definition of a copula is best given by referring to Sklar's theorem (1959), which states that if H is a multivariate distribution of dimension p, then it can be partitioned into a copula C and the marginal distributions F_1, ..., F_p of the random variables Y_1, ..., Y_p, as

H(y_1, ..., y_p) = C(F_1(y_1), ..., F_p(y_p)),

where C: [0, 1]^p -> [0, 1]. The copula distribution can also be stated as

C(u_1, ..., u_p) = Pr(U_1 ≤ u_1, ..., U_p ≤ u_p),

where U_1, ..., U_p are the Probability Integral Transformations (PIT) of Y_1, ..., Y_p, obtained through the marginal distributions. There is a wide selection of copula families available to capture different patterns of dependence among the random variables; Nelsen (2007) and Joe (1997) provide detailed coverage of copula theory and the various families available. As our question concerns the effect of different marginal specifications on the efficiency of copula estimation, rather than how marginal specifications affect different copulas, we choose the most frequently used copula, namely the Gaussian copula. A Gaussian copula with normal margins is equivalent to a multivariate normal distribution. The Gaussian copula is defined as

C(u_1, ..., u_p) = Φ_p{Φ^{-1}(u_1), ..., Φ^{-1}(u_p)},

where Φ is the standard normal Cumulative Distribution Function (CDF) and Φ_p is the CDF of a multivariate normal vector of dimension p. Denote a standard normal variable, with mean zero and variance one, as z_j = Φ^{-1}(u_j), for j = 1, ..., p. Letting z = (z_1, ..., z_p), we have z ~ N_p(0, Θ), a multivariate normal with zero mean and covariance matrix equal to the correlation matrix Θ. Song () states that the Gaussian copula density equals

|Θ|^{-1/2} exp{-(1/2) z'(Θ^{-1} - I) z}.

So far we have only said that u = (u_1, ..., u_p) is obtained through the PIT of the observed data, which in general copula methodology means applying the marginal distribution. F_j, for the j-th component, could be either a parametric or a non-parametric marginal distribution. If a known parametric distribution is chosen, it will have parameters associated with it, which need to be estimated along with the Gaussian copula parameters (i.e. the correlation matrix). For given values of the marginal parameters, the corresponding standard normals z_j can be computed. If a non-parametric specification is preferred, either

due to the lack of knowledge about y_j or the limitations of a parametric distribution, the corresponding z_j can be obtained through the empirical distribution of the observed data, without having to estimate any marginal parameters. We simplify the problem by not mixing marginal specifications within a given multivariate analysis: if F_j is specified to be parametric, then F_{\j} (all marginal distributions except F_j) will also be parametric, and vice versa for non-parametric specifications. We now present in detail the various marginal specifications used with the Gaussian copula in the simulation.

2.1 Parametric Copula Specification

Let the n observations be given as y_1, ..., y_n, where each y_i is a (p × 1) vector. The fully parametric Gaussian copula estimation problem is then

z_i ~ N_p(0, Θ),
y_ij = F_j^{-1}{Φ(z_ij) | β_j}, for all i and j,

where F_j is the CDF of either a continuous or a discrete random variable, and β_j is the parameter vector associated with the j-th component. For a given component j, the marginal distribution is fixed over all i. As F_j can correspond to either a continuous or a discrete random variable, the mapping from y_ij to u_ij varies. If the j-th component is continuous, F_j is a one-to-one function, and given a value of β_j, z_ij can easily be computed. But if the j-th component is discrete, F_j is a many-to-one function, and given a value of β_j we cannot directly impute the corresponding z_ij; these must be treated as auxiliary variables and simulated along with the copula and marginal parameters. Our estimation problem here is similar to Pitt et al. (2006), but we do not account for the presence of covariates in the marginal specification.
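As a concrete illustration of the setup above, the sketch below (Python with NumPy/SciPy; function names and the Exponential parameter are ours, purely illustrative) draws dependent uniforms from a Gaussian copula, evaluates the copula density given in the text, and maps one component through a continuous parametric margin:

```python
import numpy as np
from scipy.stats import norm, expon

def sample_gaussian_copula(n, theta, rng):
    """Draw U = Phi(Z) with Z ~ N_p(0, Theta); each column of U is uniform
    on (0, 1), but the columns are dependent through Theta."""
    z = rng.multivariate_normal(np.zeros(theta.shape[0]), theta, size=n)
    return norm.cdf(z)

def gaussian_copula_density(u, theta):
    """c(u) = |Theta|^{-1/2} exp(-0.5 z'(Theta^{-1} - I) z), z = Phi^{-1}(u)."""
    z = norm.ppf(u)
    p = theta.shape[0]
    quad = z @ (np.linalg.inv(theta) - np.eye(p)) @ z
    return np.linalg.det(theta) ** (-0.5) * np.exp(-0.5 * quad)

rng = np.random.default_rng(0)
theta = np.array([[1.0, 0.8], [0.8, 1.0]])
u = sample_gaussian_copula(2000, theta, rng)
# a continuous parametric margin: y_i1 = F^{-1}(u_i1 | beta), here Exponential
y1 = expon.ppf(u[:, 0], scale=2.0)
```

At independent margins (Θ = I) the density reduces to one, which is a quick sanity check on the formula.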

2.2 Semi-Parametric Copula Specification

If the z_ij are computed by employing a non-parametric marginal distribution, namely the empirical distribution, then together with a parametric copula the estimation problem is semi-parametric. In such a setup there are no marginal parameters to estimate: by employing rank-based transformations over all i for each component j, the z_ij can be obtained. If all the F_j correspond to continuous random variables, then an estimator based on the normalized ranks is consistent and asymptotically normal (see Genest et al. (1995)). However, these properties are lost and the estimator becomes biased if all or some of the marginals are discrete. The size of the bias depends upon the possible variation in the discrete data, the worst case being a binary component. The underlying problem is the ties observed in the ranks of the data. There are various methods to deal with ties, such as splitting, ignoring or adjusting them; Genest et al. () show through simulation that splitting the ties produces the smallest bias in the estimation of Θ.

Hoff (2007) presents a semi-parametric copula estimation technique which, unlike the method just described, treats all the z_ij as auxiliary variables. No assumption is made regarding F_j; it is treated as completely unknown. This method is applicable to discrete data, continuous data, and mixtures of both. When the margins are continuous and the sample size n is large, the methodology is equivalent to the standard semi-parametric technique above, where the z_ij are known. The benefit of not having to estimate the marginal parameters comes at the cost of having less information from the available data. Let us see the exact specifications for both cases: when z (all the standard normals) is completely known, and when, as in Hoff (2007), we treat them as latent variables in a semi-parametric setup.

2.2.1 Empirical Distribution

Given that empirical distributions are used for all the marginals in a multivariate Gaussian copula, there are no parameters associated with any component. Given that z is completely known, the modelling specification becomes

z_i ~ N_p(0, Θ),
y_ij = F̂_j^{-1}{Φ(z_ij)},
F̂_j(y_mj) = (1/(n+1)) Σ_{i=1}^n 1(y_ij ≤ y_mj), for all i and j.

F̂_j denotes the empirical distribution, used instead of a parametric F_j for all j. The third equation is just the empirical CDF, where we divide the ranks by n + 1 to avoid boundary values. As mentioned, we only need to estimate the correlation matrix Θ and, in case any of the random variables is discrete, decide how to deal with ties. As Genest et al. (1995) show that splitting the ties produces the smallest bias, we split them randomly.

2.2.2 Unknown F_j

Here, instead of employing an empirical CDF and splitting the ties in the ranks of the observed data to obtain z, we treat z as completely unknown. No assumption is made regarding any F_j; the only information we use is that each F_j is a non-decreasing function. We also know the ranks for each i in a given component j: if the rank of y_ij is k, we can write the corresponding order statistic as y_j^(k), such that y_ij = y_j^(k). From this we can infer that the unobserved z_ij corresponding to y_ij has the same rank k. More formally,

y_j^(k-1) ≤ (y_ij = y_j^(k)) ≤ y_j^(k+1),
z_j^(k-1) < (z_ij = z_j^(k)) < z_j^(k+1).

Note that we do not have strict inequalities for the observed data, which accommodates ties in the ranks. From the above we know that z_ij has to lie in the interval dictated by the largest order statistic smaller than z_j^(k) and the smallest order statistic greater than z_j^(k). Based on this information, we set out the same Gaussian copula specification, z_i ~ N_p(0, Θ) for all i. When the interval in which z_ij lies becomes small, because the margins are continuous and n is large, the uncertainty regarding the true value of z_ij is small and the methodology is similar to simply applying the empirical distribution F̂_j to each component.
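To make the two constructions concrete, here is a small sketch (Python; the helper names are ours) of the empirical-CDF transformation with random tie-splitting, and of the order-statistic interval that bounds an unobserved z_ij in the unknown-F case:

```python
import numpy as np
from scipy.stats import norm

def normalized_ranks(y, rng):
    """Empirical-CDF PIT: rank(y_i) / (n + 1), with ties split at random."""
    n = len(y)
    # lexsort: primary key y, secondary key random -> random order within ties
    order = np.lexsort((rng.random(n), y))
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1)
    return ranks / (n + 1)

def rank_interval(y, z, i):
    """(lower, upper) bounds for z_i implied by the order statistics:
    z_i must exceed every z_m with y_m < y_i and lie below every z_m
    with y_m > y_i; ties (y_m == y_i) impose no strict constraint."""
    below = z[y < y[i]]
    above = z[y > y[i]]
    lo = below.max() if below.size else -np.inf
    hi = above.min() if above.size else np.inf
    return lo, hi

rng = np.random.default_rng(1)
y = np.array([0, 1, 1, 2, 3])           # low-count discrete data with a tie
z = norm.ppf(normalized_ranks(y, rng))  # latent normals consistent with ranks
```

With continuous data the intervals shrink as n grows, which is the sense in which the two approaches coincide asymptotically.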

In the next section we describe the Bayesian sampling scheme used to estimate Θ under each of the marginal specifications described above.

3 Bayesian Estimation

As described previously, we aim to estimate the Gaussian copula parameter, namely the correlation matrix Θ. If all the margins are parametrically specified, then each component j has a parameter vector β_j associated with it; let β = (β_1, ..., β_p) collect the marginal parameters of all components. In a parametric setting, if any of the random variables is discrete, we also need to estimate the standard normals z_ij for that component. When we apply the empirical distribution to obtain the corresponding z, the only parameter to be estimated is the correlation matrix Θ; if F is taken to be unknown, however, we do not observe z, and it has to be sampled along with Θ. Given this setting, we can partition the Bayesian sampling scheme into two parts. First, β = (β_1, ..., β_p) and z = (z_1, ..., z_p) are sampled conditional upon Θ, where needed. Second, we sample Θ conditional upon β and z.

3.1 First Stage: p(β, z | Θ)

This stage can be skipped, moving directly to the second stage, if a semi-parametric specification with empirically distributed margins F̂ is employed. But if the margins are specified parametrically, or if the marginal distributions are unknown, we first need to sample β and z.

3.1.1 Parametric Margins

If the marginals are all parametrically specified, we sample in the following order.

1. Sample from p(β_j | y_{·,j}, z_{·,\j}, Θ), where y_{·,j} denotes all n observations for component j, and z_{·,\j} all the observations from the other components.
2. If the j-th margin's distribution F_j is continuous, compute z_ij = Φ^{-1}{F_j(y_ij | β_j)}. If F_j is a discrete distribution, sample z_ij from p(z_ij | β_j, y_ij, z_{i,\j}; Θ), for all i.

The above two steps are repeated for each j in turn. Pitt et al. (2006) provide details of the form of the conditional density of β_j. As it is not possible to sample directly from this conditional density, they propose a Metropolis-Hastings algorithm in which the proposal density is a multivariate t distribution with mean equal to the mode β̂_j of log p(β_j | y_{·,j}, z_{·,\j}, Θ), found by a quasi-Newton-Raphson scheme. The variance of the t distribution is set to the negative inverse of the second derivative of the log conditional density, computed at the mode. The degrees of freedom are chosen such that the proposal density dominates the true density in the tails. This is similar to a Laplace-type proposal (see Chib and Greenberg (1998) and Chib and Winkelmann ()). A proposed value β*_j is then evaluated in a Metropolis-Hastings step. If component j has a discrete distribution, the z_ij are sampled after sampling β_j, from a univariate Gaussian distribution whose mean and variance take into account the standard normals of the other components, z_{i,\j}, and the correlation among them given through Θ. We refer the interested reader to Pitt et al. (2006) for full details of the algorithm outlined above.

3.1.2 Unspecified Marginals

If, on the other hand, a semi-parametric copula approach is adopted in which no assumption regarding F_j is made, then there is no β_j to sample; only z needs to be sampled in this stage. Here we follow the approach set out by Hoff (2007), where we sample z_ij from

p(z_ij | Θ, z_{i,\j}, y_j^(k)), for all i and j,

where the conditional density of z_ij is conditioned on the correlation matrix Θ and the standard normals of the other components. Conditioning on y_j^(k) means knowing the rank of z_ij through y_ij, from which we can determine the interval (z_j^(k-1), z_j^(k+1)) in which z_ij lies.
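A sketch of this Gibbs update (Python; the function name is ours, and the conditional mean and variance come from the textbook partitioned-normal formula, not from code in Hoff (2007) or Pitt et al. (2006)) draws z_ij from its univariate conditional normal, truncated to (lo, hi), by inverse-CDF sampling:

```python
import numpy as np
from scipy.stats import norm

def draw_z_truncated(theta, z_row, j, lo, hi, rng):
    """One Gibbs update of z_{ij}: under z_i ~ N_p(0, Theta), the conditional
    z_j | z_{-j} is univariate normal; sample it truncated to (lo, hi)."""
    p = theta.shape[0]
    rest = [k for k in range(p) if k != j]
    s12 = theta[j, rest]
    s22_inv = np.linalg.inv(theta[np.ix_(rest, rest)])
    mean = s12 @ s22_inv @ z_row[rest]                # conditional mean
    sd = np.sqrt(theta[j, j] - s12 @ s22_inv @ s12)   # conditional s.d.
    # inverse-CDF sampling restricted to the truncation interval
    a = norm.cdf((lo - mean) / sd)
    b = norm.cdf((hi - mean) / sd)
    return mean + sd * norm.ppf(rng.uniform(a, b))

rng = np.random.default_rng(3)
theta = np.array([[1.0, 0.8, 0.2],
                  [0.8, 1.0, 0.6],
                  [0.2, 0.6, 1.0]])
z_row = np.array([0.0, 1.0, 0.5])
draw = draw_z_truncated(theta, z_row, 0, -0.25, 0.75, rng)
```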
The sampling scheme iterates over all i for a given j, and then moves on to the next j. As in the parametric case with a discrete F_j, the z_ij are sampled from a univariate Gaussian distribution, with mean and variance accounting for the correlation among the variables and the other components z_{i,\j}. The major difference in sampling z here is that the truncation is dictated by the order statistics, whereas in the parametric case,

the truncation is given by the CDF of the discrete parametric distribution evaluated at y_ij and y_ij − 1 (see Pitt et al. (2006)). This scheme is adopted regardless of whether the true distribution of the j-th component is continuous or discrete. Whichever marginal specification is chosen, before moving on to the second stage we obtain z, and then proceed to sample Θ.

3.2 Second Stage

In this stage, the assumptions placed on the marginal distributions no longer matter: all we require is z from the previous stage in order to sample Θ. Hence, whether the marginal distributions are defined parametrically or non-parametrically, this scheme for Θ is invariant. We can write the posterior of Θ as

p(Θ | z) ∝ p(Θ) p(z | Θ).

Following Hoff (2007), we assume a semi-conjugate prior for the Gaussian copula. Let the prior of V be an inverse-Wishart distribution (ν_0, ν_0 V_0), parametrized such that E[V] = V_0, where ν_0 is the degrees of freedom and ν_0 V_0 the scale matrix. Θ is then equal in distribution to the correlation matrix given by

Θ[i,j] = V[i,j] / (V[i,i] V[j,j])^{1/2}.

The posterior of V can then be shown to satisfy

V | z ~ inverse-Wishart(ν_0 + n, ν_0 V_0 + z'z),

from which a sample of V can be obtained and Θ computed through the above transformation. We could have followed Pitt et al. (2006) and their sampling scheme for Θ, but choose not to, as our focus is not on an efficient sampling scheme but on the effects of the marginal specifications on copula estimation.
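This second-stage draw can be sketched as follows (Python with SciPy's `invwishart`; a minimal sketch of the update, with our own function name and illustrative hyperparameters ν_0 = 3, V_0 = I):

```python
import numpy as np
from scipy.stats import invwishart

def sample_theta(z, nu0, V0, rng):
    """Draw V | z ~ inverse-Wishart(nu0 + n, nu0*V0 + z'z) and rescale it
    to a correlation matrix: Theta[i,j] = V[i,j] / sqrt(V[i,i] V[j,j])."""
    n = z.shape[0]
    V = invwishart.rvs(df=nu0 + n, scale=nu0 * V0 + z.T @ z, random_state=rng)
    d = np.sqrt(np.diag(V))
    return V / np.outer(d, d)

rng = np.random.default_rng(4)
true_theta = np.array([[1.0, 0.8], [0.8, 1.0]])
z = rng.multivariate_normal(np.zeros(2), true_theta, size=500)
theta_draw = sample_theta(z, nu0=3, V0=np.eye(2), rng=rng)
```

The rescaling step guarantees a unit diagonal, so each draw is a valid correlation matrix.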

4 Data Generating Process

In this section we show how to simulate data from a multivariate Gaussian copula and provide details of the Data Generating Process (DGP). The simulated data will then be used to test the various marginal specifications and their effect on Gaussian copula estimation. For a given correlation matrix Θ and marginal parameters β, a set of y values can be generated as follows.

1. Sample z_i from N_p(0, Θ),
2. Obtain u = Φ(z),
3. Compute y_ij = F_j^{-1}(u_ij | β_j), for all i and j,

where u = (u_1, ..., u_p) and each u_j is an (n × 1) vector. Step 3 implies that we need to be able to compute the inverse CDF of the chosen parametric marginal distributions. Let us then set out the DGP used throughout the rest of the paper. We choose p = 3 and vary n from small to large samples:

z ~ N_3(0, Θ), with Θ[1,2] = .8, Θ[2,3] = .6,
u = Φ(z),
y_{·,1} = F_1^{-1}(u_{·,1} | .5),  F_1(y_{·,1} | .5) = Exponential(y_{·,1} | λ_1),
y_{·,2} = F_2^{-1}(u_{·,2} | 6),   F_2(y_{·,2} | 6) = Poisson(y_{·,2} | λ_2),
y_{·,3} = F_3^{-1}(u_{·,3} | .6),  F_3(y_{·,3} | .6) = Bernoulli(y_{·,3} | λ_3).

So the true DGP is a mixture of continuous and discrete marginals. This DGP stays fixed throughout the simulation. Using the simulated y, we assume various marginal specifications and compare them in terms of estimating the true correlation matrix Θ.
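The DGP can be sketched as below (Python; the parameter values λ_1 = .5, λ_2 = 6, λ_3 = .6 follow our reading of the specification above, and the Θ[1,3] = .2 entry is an illustrative value, not taken from the text):

```python
import numpy as np
from scipy.stats import norm, expon, poisson, bernoulli

def simulate_dgp(n, theta, rng):
    """Gaussian copula DGP with Exponential, Poisson and Bernoulli margins."""
    z = rng.multivariate_normal(np.zeros(3), theta, size=n)
    u = norm.cdf(z)
    y1 = expon.ppf(u[:, 0], scale=1 / 0.5)  # Exponential, rate lambda_1 = .5
    y2 = poisson.ppf(u[:, 1], mu=6)         # Poisson, mean lambda_2 = 6
    y3 = bernoulli.ppf(u[:, 2], p=0.6)      # Bernoulli, p = lambda_3 = .6
    return np.column_stack([y1, y2, y3])

theta = np.array([[1.0, 0.8, 0.2],
                  [0.8, 1.0, 0.6],
                  [0.2, 0.6, 1.0]])
y = simulate_dgp(1000, theta, np.random.default_rng(5))
```

Because the marginal inverse CDFs are monotone, the latent correlation in Θ survives (attenuated) as dependence between the observed margins.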

5 Marginal Specifications

We now state the various marginal specifications employed to estimate the Gaussian copula parameters. For ease of reference we label them Marginal Specifications (MS): MS1, MS2 and MS3. Their details are as follows.

MS1 All three marginals (F_1, F_2 and F_3) are assumed completely unknown. Using the order statistics of the observed data, first z and then the correlation matrix is sampled. This is the method proposed by Hoff (2007), as described previously.

MS2 All the margins are assumed empirically distributed, implying z_ij = Φ^{-1}{F̂_j(y_ij)}, for all i and j.

MS3 Perform a continuation transformation on the two discrete margins, then let z_ij = Φ^{-1}{F_j(y_ij | β_j)} with F_j = logN(y_ij | µ_j, σ_j), for all i and j. Hence all margins are log-normally distributed.

So we specify three different marginal specifications: the first two correspond to semi-parametric copula estimation, the last to fully parametric estimation. We consider misspecified margins because it is interesting to see how well they perform in estimating the copula parameters in very small samples. MS3 takes the discrete marginals and adds an independent random term on [0, 1] to the observed values, to make them continuous. This is an approach stated in Trivedi and Zimmer (2006) to avoid the computational problems generally encountered in likelihood estimation. This transformation, along with the assumed log-normal distribution, induces a misspecification. The first margin (exponential in the DGP) is also misspecified as log-normal. Next, let us look at the MCMC and the simulation over the DGP in more detail.
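The MS2 and MS3 transformations can be sketched side by side (Python; the helper names are ours, and `lognorm.fit` stands in for the maximum-likelihood fit of the misspecified log-normal margin; MS1 instead uses the order-statistic Gibbs machinery of Section 3):

```python
import numpy as np
from scipy.stats import norm, lognorm

def ms2_pit(y, rng):
    """MS2: empirical margins -- z_ij = Phi^{-1}(rank / (n + 1)), with ties
    split at random."""
    n = len(y)
    order = np.lexsort((rng.random(n), y))
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1)
    return norm.ppf(ranks / (n + 1))

def ms3_pit(y, rng, discrete=True):
    """MS3: continuation-transform a discrete margin by adding U[0, 1) jitter,
    fit a log-normal by ML, then map through its CDF to standard normals."""
    yc = y + rng.random(len(y)) if discrete else y
    shape, loc, scale = lognorm.fit(yc, floc=0)
    u = np.clip(lognorm.cdf(yc, shape, loc=loc, scale=scale), 1e-9, 1 - 1e-9)
    return norm.ppf(u)

rng = np.random.default_rng(6)
y = rng.poisson(6, size=400)   # a discrete margin, as in the DGP
z_ms2 = ms2_pit(y, rng)
z_ms3 = ms3_pit(y, rng)
```

The clipping in `ms3_pit` is a numerical guard against boundary values of the fitted CDF, analogous to dividing ranks by n + 1 in MS2.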

6 Simulation

6.1 Setup

We perform, in essence, a Gibbs-type sampling algorithm over the two stages defined in Section 3. We perform 55 iterations of the sampling scheme to obtain the posterior density of Θ, of which the first 5 are discarded as burn-in of the MCMC chain. We do not discuss the posterior densities of the marginal parameters, as our focus is on the correlation matrix. From the posterior density we compute the posterior mean E(Θ | y). To analyse the properties of the various marginal specifications and their effect on the estimation of Θ, we need a distribution for the posterior mean, and hence run a Monte Carlo simulation over the DGP. We choose the size of the Monte Carlo simulation to be 5, which is sufficient to conduct inference on. At each Monte Carlo iteration we obtain a new sample of y from the same DGP, denoted {y}^s, for s = 1, ..., S. The general simulation structure is

for s = 1, ..., S:
    sample {y}^s from the DGP;
    for each of MS1, MS2 and MS3, obtain E[Θ | {y}^s].

After ignoring the burn-in iterations, the autocorrelation of all the parameters dies out after three lags. The trace plots of all the parameters also show that the MCMC chain mixes well and covers the tails of the distribution.

6.2 Biasedness & Variance

After obtaining the distribution of the posterior mean of Θ for all three marginal specifications, we compare them in terms of their bias and variance relative to the true

correlation matrix Θ. We compute two quantities of interest for all the marginal specifications: the bias, the average difference between E[Θ | {y}^s] and Θ, and the MSE, which combines the variance and the bias of the estimator. They are given as

Bias = (1/S) Σ_{s=1}^S (E[Θ | {y}^s] − Θ),
MSE = (1/S) Σ_{s=1}^S (E[Θ | {y}^s] − Θ)².

We compare the bias across all three estimators (specifications) and, as our interest is in the performance of MS1 relative to the other specifications, we compute the MSE ratio of MS1 with respect to MS2 and MS3:

ω_2 = MSE_MS1 / MSE_MS2,
ω_3 = MSE_MS1 / MSE_MS3.

These quantities are computed for all the off-diagonal entries of the correlation matrix Θ (i.e. the lower-triangle correlation parameters). The whole simulation is repeated for different values of n.

7 Results

7.1 Bias

Let us first look at the bias of the three marginal specifications. In Table 1, B_MS1 refers to the bias values from MS1, and similarly for the other marginal specifications; Θ[i,j] represents the entry in row i and column j of the correlation matrix Θ. All the specifications under-predict the true Θ. The bias of MS1 is lower than that of MS2 and MS3 in small samples. The difference between MS1 and MS2 is particularly large for Θ[1,2] (the correlation between a continuous and a discrete margin); Hoff's method thus has a smaller bias for mixtures of distributions, which is more prominent in small samples. It is interesting that the misspecified model MS3 has a smaller bias than MS2 in the smallest sample, but as n increases MS2 becomes less biased than MS3. This is simply due to the bias created by

small samples, common to all marginal specifications. As the sample size increases, the difference between B_MS1 and B_MS2 converges to zero, but the rate of convergence is slower for the mixture of margins (continuous and discrete). The bias of MS3 does not drop as dramatically as that of MS1 and MS2. This is especially true for Θ[2,3], where two discrete margins are misspecified, which shows the continuation transformation not to be a very efficient technique. Misspecifying an exponential margin as log-normal still appears to create a smaller bias than the transformation applied to the discrete margins. Overall, Hoff's method has a lower bias compared to computing z through an empirical distribution.

Table 1: Computed bias for all marginal specifications

             n=      n=5     n=5     n=      n=5     n=5
    B_MS1
    Θ[1,2]  -.397   -.659   -.93    -.55    -.8     -
    Θ[1,3]  -.3     -.888   -.675   -.63    -.53    -
    Θ[2,3]  -.3337  -.37    -.983   -.87    -.657   -
    B_MS2
    Θ[1,2]  -.79    -.9     -.      -.68    -.37    -
    Θ[1,3]  -.63    -.5     -.765   -.65    -.555   -
    Θ[2,3]  -.369   -.567   -.96    -.866   -.69    -
    B_MS3
    Θ[1,2]  -.376   -.366   -.78    -.7     -.35    -
    Θ[1,3]  -.536   -.7     -.95    -.      -.7     -
    Θ[2,3]  -.3675  -.38    -.93    -.87    -.797   -

The difference becomes smaller as n increases, since the step-size of the empirical distribution and the interval in which z_ij lies (Hoff's method) both shrink. This is especially the case when the bias is examined for two continuously distributed margins. In our case, we have either a mixture of continuous and discrete margins or two discrete margins, and in both cases Hoff's method has the smaller bias.
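The interval in which z_ij must lie under Hoff's method follows directly from the order-statistics constraint: the latent z's must respect the ordering of the observed y's. The sketch below illustrates why these intervals shrink with n for a continuous margin but stay wide under heavy ties; the latent draws and sample size are illustrative assumptions, not the paper's sampler state.

```python
import numpy as np

def rank_interval(y_col, z_col):
    # Under the extended rank likelihood, each latent z_i must satisfy
    #   max{ z_k : y_k < y_i }  <  z_i  <  min{ z_k : y_k > y_i },
    # i.e. the latent draws must preserve the ordering of the observed y's.
    lo = np.full(len(y_col), -np.inf)
    hi = np.full(len(y_col), np.inf)
    for i, yi in enumerate(y_col):
        below = z_col[y_col < yi]
        above = z_col[y_col > yi]
        if below.size:
            lo[i] = below.max()
        if above.size:
            hi[i] = above.min()
    return lo, hi

rng = np.random.default_rng(1)
z = rng.standard_normal(200)               # illustrative latent draws
lo_c, hi_c = rank_interval(z, z)           # continuous margin: no ties
lo_b, hi_b = rank_interval((z > 0).astype(int), z)  # binary margin: heavy ties
# A continuous margin pins each z_i between its rank neighbours, so the
# intervals shrink as n grows; the binary margin's ties leave every interval
# unbounded on one side, losing much of the ordering information.
```

With two hundred distinct continuous observations, all but the extreme order statistics get a finite interval; with the binary margin, every interval is half-infinite.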

7.2 MSE

In Table 2 we report the MSE ratio of MS1 with respect to the other marginal specifications, denoted ω_2 and ω_3 for MS2 and MS3 respectively. As with the bias, ω_2 for Θ[1,2] indicates that the MSE advantage of MS1 over MS2 is larger than for the other correlation parameters in Θ. Again, this ratio approaches one as the sample size increases. Over all values of n, Hoff's method produces the smallest MSE, as both ω_2 and ω_3 are less than one. ω_3 actually decreases further as n increases, further pointing out the inappropriateness of using misspecified margins. Another interesting result is how ω_2 differs between Θ[1,3] and Θ[2,3]. MS1 has a smaller MSE than MS2 when a continuous and a high-count margin (Poisson), or a discrete-discrete margin, are considered; but for the combination of a continuous and a low-count margin (binary), the MSE difference is smaller. This is due to the ranking problems a binary variable induces. As with the bias, the MSE ratio of MS1 to MS2 approaches one as the sample size increases.

Table 2: MSE ratio

             n=      n=5     n=5     n=      n=5     n=5
    ω_2
    Θ[1,2]  .7699   .7366   .7639   .795    .8696   -
    Θ[1,3]  .937    .9      .955    .9685   .97     -
    Θ[2,3]  .895    .965    .979    .956    .963    -
    ω_3
    Θ[1,2]  .877    .585    .39     .696    .73     -
    Θ[1,3]  .878    .8553   .779    .6863   .6      -
    Θ[2,3]  .897    .66     .58     .       .36     -
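The bias, MSE and ω ratios above are elementwise computations over the stack of Monte Carlo posterior means. A minimal numpy sketch, with illustrative stand-in values rather than the paper's output:

```python
import numpy as np

def bias_and_mse(post_means, theta_true):
    # post_means: (S, d, d) stack of E[Theta | {y}_s]; theta_true: (d, d).
    diff = np.asarray(post_means) - theta_true   # broadcast over s
    return diff.mean(axis=0), (diff ** 2).mean(axis=0)

# Illustrative (assumed) inputs: S = 4 posterior means scattered around a
# true 2 x 2 correlation matrix.
theta = np.array([[1.0, 0.5], [0.5, 1.0]])
draws = theta + np.array([0.02, -0.01, 0.03, -0.04])[:, None, None]
bias, mse_ms1 = bias_and_mse(draws, theta)

mse_ms2 = 4.0 * mse_ms1     # pretend MS2's errors are twice as large
omega2 = mse_ms1 / mse_ms2  # elementwise MSE ratio; values < 1 favour MS1
```

Here the perturbations average to zero, so the bias vanishes while the MSE (variance plus squared bias) does not, which is exactly why both quantities are reported.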

7.3 MS1 Kernel Densities

We present kernel density plots for the first marginal specification, in which F is assumed completely unknown. Figures 1 to 5 show density plots of the posterior mean over the DGP. The dispersion around the mean (red line) clearly decreases as n increases. Theta[1,2] denotes the correlation parameter between the first and the second random variable. It is interesting that the dispersion for Theta[1,3] and Theta[2,3] is larger than that for Theta[1,2] across all sample sizes. The black line represents the true correlation from the DGP.

[Figure 1: Posterior of E(Θ | y), n = . Density panels for Theta[1,2], Theta[1,3], Theta[2,3].]

[Figure 2: Posterior of E(Θ | y), n = 5. Density panels for Theta[1,2], Theta[1,3], Theta[2,3].]

[Figure 3: Posterior of E(Θ | y), n = 5. Density panels for Theta[1,2], Theta[1,3], Theta[2,3].]

[Figure 4: Posterior of E(Θ | y), n = . Density panels for Theta[1,2], Theta[1,3], Theta[2,3].]

[Figure 5: Posterior of E(Θ | y), n = 5. Density panels for Theta[1,2], Theta[1,3], Theta[2,3].]
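Density plots like those in the figures above can be produced from the stored posterior means with any kernel density estimator. A minimal Gaussian-kernel sketch; the sample values are assumed for illustration, the actual plots use the simulation output:

```python
import numpy as np

def gaussian_kde(samples, grid, bw=None):
    # Plain Gaussian-kernel density estimate on a grid, with a
    # Silverman-style rule-of-thumb bandwidth if none is supplied.
    s = np.asarray(samples, dtype=float)
    if bw is None:
        bw = 1.06 * s.std(ddof=1) * len(s) ** (-0.2)
    u = (grid[:, None] - s[None, :]) / bw
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(s) * bw * np.sqrt(2 * np.pi))

# Assumed example: posterior means of Theta[1,2] across Monte Carlo draws.
draws = np.array([0.48, 0.52, 0.50, 0.47, 0.53])
grid = np.linspace(0.3, 0.7, 401)
dens = gaussian_kde(draws, grid)
# `dens` integrates to roughly one over a wide enough grid; its spread
# narrows as the posterior means concentrate, i.e. as n increases.
```

Plotting `dens` against `grid`, with vertical lines at the sample mean and the true correlation, reproduces the layout of the panels.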

8 Conclusion

Copula-based methods provide flexible multivariate analysis for margins of different types and for dependency patterns that are not well described by elliptical distributions. The marginals can be specified parametrically, and together with a parametric copula the estimation problem is then fully parametric. Alternatively, a non-parametric distribution can be used for the marginals, which leads to a semi-parametric copula estimation problem. For random variables of continuous type, semi-parametric copula estimation is shown to be efficient and asymptotically normal (see Genest et al. (1995)), and hence the two approaches are equivalent. But for multivariate analysis of discrete or mixed continuous-discrete data, empirically computed marginals are not appropriate. Even in the parametric case, exact knowledge of the marginal distribution is not always available, and transformation to a continuous distribution is not fruitful either. Hoff (2007) proposes a method in which the marginal parameters need not be estimated: using only the information contained in the order statistics, we can estimate the copula parameters. Such a method is useful because it accounts for the uncertainty in the mapping from the observed data through the CDF, and for the lack of knowledge about the true marginal distribution.

We specified various marginal specifications to estimate a multivariate Gaussian copula, borrowing the Bayesian estimation framework for a fully parametric copula specification from Pitt et al. (2006) and for semi-parametric copula estimation from Hoff (2007). To assess the performance of the various marginal specifications, we provided a simulation setup and described a DGP with a mixture of margin types. We specified three different marginal specifications to estimate the Gaussian copula. In the first, the marginals were considered unknown (Hoff's method).
Second, the PIT was performed through empirical distributions, and finally, completely misspecified margins were employed. The results showed that Hoff's method outperforms the other two specifications, especially in small samples, where the uncertainty and the inappropriateness of the other methods are large. The bias of Hoff's method is smaller for all sample sizes, although in large samples it becomes equal to that obtained by applying the empirical distribution. The misspecified method retains the largest bias in large samples. In terms of mean squared error, too, Hoff's method has a smaller MSE

for small samples, compared to the other two. Overall, although the approaches are equivalent for continuous margins and large samples, for discrete data Hoff's method performs better than empirically estimated margins, and it should be considered for multivariate analysis of discrete or mixed continuous-discrete data in a copula framework. An initial continuation transformation of discrete data before estimating the copula is also not very efficient.

References

A. C. Cameron, T. Li, P. K. Trivedi, and D. M. Zimmer. Modelling the differences in counted outcomes using bivariate copula models with application to mismeasured counts. Econometrics Journal, 7(2):566-584, December 2004.

S. Chib and E. Greenberg. Analysis of multivariate probit models. Biometrika, 85(2):347-361, 1998.

S. Chib and R. Winkelmann. Markov chain Monte Carlo analysis of correlated count data. Journal of Business & Economic Statistics, 19(4):428-435, 2001.

P. Embrechts, A. McNeil, and D. Straumann. Correlation and dependence in risk management: Properties and pitfalls. In Risk Management: Value at Risk and Beyond, pages 176-223. Cambridge University Press, 1999.

C. Genest and J. Nešlehová. A primer on copulas for count data. Astin Bulletin, 37(2):475-515, 2007.

C. Genest, K. Ghoudi, and L.-P. Rivest. A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika, 82(3):543-552, 1995.

C. Genest, J. Nešlehová, and N. Ben Ghorbal. Estimators based on Kendall's tau in multivariate copula models. Australian & New Zealand Journal of Statistics, 53(2):157-177, 2011.

P. D. Hoff. Extending the rank likelihood for semiparametric copula estimation. Annals of Applied Statistics, 1(1):265-283, 2007.

H. Joe. Multivariate Models and Dependence Concepts. Chapman & Hall/CRC, 1997.

M. K. Munkin and P. K. Trivedi. Simulated maximum likelihood estimation of multivariate mixed-Poisson regression models, with application. Econometrics Journal, 2(1):29-48, 1999.

R. B. Nelsen.
An Introduction to Copulas. Springer, 2007.

M. Pitt, D. Chan, and R. Kohn. Efficient Bayesian inference for Gaussian copula regression models. Biometrika, 93(3):537-554, September 2006.

P. X.-K. Song. Multivariate dispersion models generated from Gaussian copula. Scandinavian Journal of Statistics, 27(2):305-320, 2000.

P. K. Trivedi and D. M. Zimmer. Copula Modeling: An Introduction for Practitioners. Foundations and Trends in Econometrics, 1(1):1-111, 2006.