Bayesian Adaptive Bandwidth Kernel Density Estimation of Irregular Multivariate Distributions


Shuowen Hu, D. S. Poskitt, Xibin Zhang
Department of Econometrics and Business Statistics, Monash University, Australia

Abstract

In this paper, we propose a new methodology for multivariate kernel density estimation in which data are categorized into low- and high-density regions as an underlying mechanism for assigning adaptive bandwidths. We derive the posterior density of the bandwidth parameters via the Kullback-Leibler divergence criterion and use a Markov chain Monte Carlo (MCMC) sampling algorithm to estimate the adaptive bandwidths. The resulting estimator is referred to as the tail-adaptive density estimator. Monte Carlo simulation results show that the tail-adaptive density estimator outperforms the global-bandwidth density estimators implemented using different global bandwidth selection rules. The inferential potential of the tail-adaptive density estimator is demonstrated by employing the estimator to estimate the bivariate density of daily index returns observed from the USA and Australian stock markets.

Keywords: marginal likelihood, Markov chain Monte Carlo, S&P500 index, value-at-risk

Corresponding author: xibin.zhang@monash.edu. Address: 900 Dandenong Road, Caulfield East, Victoria 3145, Australia.

Preprint submitted to Computational Statistics and Data Analysis, July 25, 2011

1. Introduction

Kernel density estimation is one of the most important techniques for understanding the distributional properties of data. It is understood that the effectiveness of such an approach depends on the choice of a kernel function and the choice of a smoothing parameter, or bandwidth (see, for example, Izenman, 1991, for a discussion). Although the two issues cannot be treated separately, it is widely accepted that the performance of a kernel density estimator is mainly determined by the bandwidth (see, for example, Scott, 1992; Wand and Jones, 1995), while the impact of kernel choice on the performance of the resulting density estimator was examined by Marron and Nolan (1988), Vieu (1999) and Horová et al. (2002). Most investigations have aimed at choosing a fixed or global bandwidth for a full sample of data (see Jones et al., 1996, for a survey). Terrell and Scott (1992) and Sain and Scott (1996) proposed the idea of data-driven adaptive bandwidth density estimation, which allows the bandwidth to vary across data points. In this situation, bandwidth selection remains an important issue and has been extensively investigated for univariate data. However, less attention has been paid to data-driven methods for estimating adaptive bandwidths for multivariate data. This motivates the investigation of this paper.

Our investigation is also empirically motivated. Most financial analysts believe that during the course of the global financial crisis caused by the fallout of the USA sub-prime mortgage crisis, the USA stock market has had a leading effect on most other stock markets worldwide. Using a kernel density estimator of bivariate stock-index returns we can derive the conditional distribution of the stock-index return in one market for a given value of the

stock-index return in the USA market, and therefore we can better understand how the former market was associated with the USA market. However, the marginal density of stock-index returns often exhibits leptokurtosis. Consequently, the kernel estimation of the bivariate density of stock-index returns may require different bandwidths for different groups of observed returns.

1.1. Some background

Let X = (X_1, X_2, ..., X_d)' denote a d-dimensional random vector with density function f(x) defined on R^d (see Jácome et al., 2008, for an example of censored data). Let {x_1, x_2, ..., x_n} be a random sample drawn from f(x). The kernel density estimator of f(x) is (Wand and Jones, 1995)

    \hat{f}_H(x) = (1/n) \sum_{i=1}^{n} K_H(x - x_i) = (1/(n |H|^{1/2})) \sum_{i=1}^{n} K(H^{-1/2}(x - x_i)),    (1)

where K(·) is a multivariate kernel, and H is a symmetric and positive definite d × d matrix known as the bandwidth matrix. To choose an optimal H, several methods have been discussed in the literature (see, for example, Devroye and Györfi, 1985; Marron, 1987). Marron (1992) proposed a bootstrapping approach to the approximation of the mean integrated squared error (MISE), and the normal reference rule (NRR), or equivalently the rule-of-thumb, that minimizes the asymptotic MISE was discussed by Scott (1992). Least squares cross-validation was discussed by Sain et al. (1994) and Duong and Hazelton (2005), and a plug-in method was suggested by Wand and Jones (1994) and improved by Duong and Hazelton (2003). Moreover, Zhang et al. (2006) proposed a Bayesian approach to the estimation of the bandwidth matrix based on the Kullback-Leibler information criterion.
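As an illustration of (1) with a diagonal bandwidth matrix and a product Gaussian kernel, the following sketch evaluates the global-bandwidth estimator at a point and computes a normal-reference-rule bandwidth. It is a minimal Python/NumPy illustration under our own naming conventions, not the authors' code.

```python
import numpy as np

def kde_global(x, data, h):
    """Product-Gaussian kernel density estimate at point x with a
    global (diagonal) bandwidth vector h; data is an (n, d) array."""
    n, d = data.shape
    u = (x - data) / h                                   # standardised distances, (n, d)
    kern = np.exp(-0.5 * np.sum(u**2, axis=1)) / (2 * np.pi) ** (d / 2)
    return np.mean(kern) / np.prod(h)

# Example: 500 draws from a bivariate normal and a normal-reference-rule bandwidth
rng = np.random.default_rng(0)
data = rng.standard_normal((500, 2))
n, d = data.shape
h_nrr = data.std(axis=0, ddof=1) * (4 / (n * (d + 2))) ** (1 / (d + 4))
print(kde_global(np.zeros(2), data, h_nrr))
```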

However, a problem with using a global bandwidth is that kernel methods often produce unsatisfactory results for complex or irregular densities. Sain and Scott (1996) presented a classic example, a bimodal mixture with two modes of equal height but different levels of variation, in which an optimal global bandwidth tends to under-smooth the mode with large variation and over-smooth the mode with small variation. Therefore, it is necessary to let the bandwidth adapt to local observations; a relatively small bandwidth is needed where observations are densely distributed, and a large bandwidth is required where observations are sparsely distributed (see Jones, 1990; Wand and Jones, 1995; Sain, 2002, for similar arguments).

Several versions of the adaptive bandwidth kernel density estimator have been studied in the literature. Mielniczuk et al. (1989) proposed to use a weighting function of the data point x in a global bandwidth estimator. A popular approach is to make the bandwidth a function of the data, and Nolan and Marron (1989) discussed this issue from a general delta-sequence estimator perspective. Loftsgaarden and Quesenberry (1965) suggested replacing H in (1) by H(x), the bandwidth at the evaluation point x. This estimator was studied by Cao (2001) and Sain and Scott (2002) and is also called the balloon estimator. However, the balloon estimator does not integrate to one and is therefore not a good choice for density estimation (Terrell and Scott, 1992; Izenman, 1991). The other estimator, proposed by Breiman et al. (1977), is called the sample-point estimator; it employs H(x_i) as the bandwidth associated with sample point x_i. Abramson (1982a,b) suggested choosing the bandwidth as the inverse square root of f(x_i). As the sample-point estimator always integrates to one, we consider this estimator throughout this paper.
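A minimal sketch of the sample-point estimator with Abramson-style bandwidths follows, assuming product Gaussian kernels, a normal-reference-rule pilot and geometric-mean rescaling; the helper names are ours, introduced only for illustration of the general idea rather than of the authors' implementation.

```python
import numpy as np

def product_gauss(u):
    """Product Gaussian kernel evaluated row-wise on an (n, d) array."""
    d = u.shape[1]
    return np.exp(-0.5 * np.sum(u**2, axis=1)) / (2 * np.pi) ** (d / 2)

def kde_sample_point(x, data, H):
    """Sample-point estimator: H is an (n, d) array holding the bandwidth
    vector attached to each observation, so the estimate integrates to one."""
    return np.mean(product_gauss((x - data) / H) / np.prod(H, axis=1))

def abramson_bandwidths(data):
    """Abramson-style rule: local bandwidths proportional to the inverse
    square root of a pilot density estimate (global NRR pilot assumed)."""
    n, d = data.shape
    h0 = data.std(axis=0, ddof=1) * (4 / (n * (d + 2))) ** (1 / (d + 4))
    pilot = np.array([np.mean(product_gauss((xi - data) / h0)) / np.prod(h0)
                      for xi in data])
    lam = np.sqrt(np.exp(np.mean(np.log(pilot))) / pilot)   # h_i = h0 * lambda_i
    return h0 * lam[:, None]

rng = np.random.default_rng(1)
data = rng.standard_normal((300, 2))
H = abramson_bandwidths(data)
print(kde_sample_point(np.zeros(2), data, H))
```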

The difficulty with using the sample-point estimator is that in multivariate data the number of bandwidths exceeds the sample size, which makes estimating the bandwidths difficult. Sain and Scott (1996) and Sain (2002) suggested grouping the data into bins and using a constant bandwidth within each bin. Although this method reduces the number of bandwidths, that number still grows exponentially with the dimension.

1.2. The Tail-Adaptive Density Estimator

A simple way to control the number of bandwidths to be estimated for the sample-point kernel density estimator is to put the data into a small number of groups. In this paper, we propose dividing the observations into two regions, namely the low-density region (LDR) and the high-density region (HDR), and assigning two different bandwidth matrices to these two regions. When the true density is unimodal, the low-density region is the tail area and should be assigned larger bandwidths than the high-density region. We call this type of kernel density estimator the tail-adaptive density estimator.

It is not new to group observations into low- and high-density regions. Hartigan (1975, 1987) proposed clustering data into different regions with different density values, and Hyndman (1996) presented an algorithm for computing and graphing data in different density regions. A comprehensive review of applications related to low- and high-density regions was given by Mason and Polonik (2009). In terms of bandwidth selection, Samworth and Wand (2010) considered a bandwidth selection method for univariate high-density region estimation. To our knowledge, there is as yet no other investigation of bandwidth selection for general multivariate kernel density estimation in which two different bandwidth matrices are assigned to

observations in the low- and high-density regions. In this paper, we treat the elements of the two bandwidth matrices as parameters, whose posterior can be derived approximately through the Kullback-Leibler information. Therefore, bandwidths can be estimated through a posterior simulator.

During the past decade, there have been several investigations of Bayesian approaches to bandwidth estimation for kernel density estimation (see, for example, Brewer, 2000; Gangopadhyay and Cheung, 2002; Kulasekera and Padgett, 2006; de Lima and Atuncar, 2010). In particular, Zhang et al. (2006) derived the posterior of the bandwidths for multivariate kernel density estimation with a global bandwidth matrix. Their Monte Carlo simulation results reveal the advantage of this Bayesian approach over its competitors, including the plug-in method of Duong and Hazelton (2003) and the NRR. However, Hall (1987) showed that the use of a global bandwidth for a long-tailed distribution can mislead the Kullback-Leibler information. Therefore, in this paper, we extend the sampling algorithm proposed by Zhang et al. (2006) by incorporating tail-adaptive bandwidth matrices into multivariate kernel density estimation.

The rest of this paper is organized as follows. In Section 2, we derive the posterior of the elements of the tail-adaptive bandwidth matrices and describe an MCMC sampling algorithm. Sections 3 and 4 present the results of Monte Carlo simulation studies designed to examine the performance of the tail-adaptive density estimator. In Section 5, we apply the tail-adaptive density estimator to the estimation of the bivariate density of two asset returns. Section 6 concludes the paper.

2. Bayesian estimation of bandwidths

2.1. Likelihood cross-validation

The Kullback-Leibler information, which measures the discrepancy between a density estimator and the true density, is defined as

    d_{KL}(f(x), \hat{f}_H(x)) = \int_{R^d} \log\{f(x)\} f(x) dx - \int_{R^d} \log\{\hat{f}_H(x)\} f(x) dx.    (2)

An optimal bandwidth could be derived by minimizing (2), which is equivalent to maximizing \int_{R^d} \log\{\hat{f}_H(x)\} f(x) dx with respect to H. As this integral is the expectation of \log\{\hat{f}_H(x)\} under f(x), it is approximated by n^{-1} \sum_{i=1}^{n} \log \hat{f}_{H,i}(x_i), where

    \hat{f}_{H,i}(x_i) = (1/((n - 1) |H|^{1/2})) \sum_{j=1, j \neq i}^{n} K(H^{-1/2}(x_i - x_j)),    (3)

known as the leave-one-out estimator of f(x_i) (Härdle, 1991). The likelihood cross-validation method maximizes \sum_{i=1}^{n} \log \hat{f}_{H,i}(x_i), which is regarded as the likelihood of {x_1, x_2, ..., x_n} for given H, with respect to H.

The bandwidth matrix can be either a full or a diagonal matrix. A full bandwidth matrix can reveal useful features of the resulting density estimator, but the implementation of some computing algorithms for bandwidth estimation is often very difficult, especially when adaptive bandwidth matrices are used. The numerical results obtained by Sain (2002) show that the density estimator with a full bandwidth matrix is not smooth in low-density regions. In this paper, we use a diagonal bandwidth matrix and let h = (h_1, h_2, ..., h_d)' denote the vector of the square roots of the diagonal elements of the bandwidth matrix; h is also known as the bandwidth vector.
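For a diagonal bandwidth matrix H = diag(h_1^2, ..., h_d^2) and a product Gaussian kernel, the cross-validation criterion built from (3) can be computed as in the following sketch; this is a plain NumPy illustration under those assumptions, not the authors' code.

```python
import numpy as np

def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood sum_i log f_{H,-i}(x_i) of equation (3),
    with H = diag(h^2), so dividing by h gives H^{-1/2}(x_i - x_j) and
    prod(h) equals |H|^{1/2}."""
    n, d = data.shape
    u = (data[:, None, :] - data[None, :, :]) / h        # pairwise scaled differences, (n, n, d)
    kern = np.exp(-0.5 * np.sum(u**2, axis=2)) / (2 * np.pi) ** (d / 2)
    np.fill_diagonal(kern, 0.0)                          # leave one out
    fhat = kern.sum(axis=1) / ((n - 1) * np.prod(h))
    return np.sum(np.log(fhat))
```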

2.2. Tail-adaptive kernel density estimator

The concept of grouping observations into low- and high-density regions has been discussed in the literature (see, for example, Hartigan, 1975, p. 205). We propose to group some observations into the low-density region, inside which each observation has a density no greater than the density of each observation outside the region. Our definition of the low-density region is in the same sense as that of Hyndman (1996), who defined the highest density region.

Let α be a threshold value that determines the proportion of the low-density region relative to the sample space. Let L(f_α) denote the subset of the sample space such that the (100α)% low-density region is

    L(f_α) = {x : f(x) ≤ f_α},

where f_α is the largest constant such that Pr{x ∈ L(f_α)} ≤ α. Let

    I_j = 1 if x_j ∈ L(f_α), and I_j = 0 otherwise,

for j = 1, 2, ..., n. Let h^{(1)} denote the bandwidth vector assigned to the observations in L(f_α) and h^{(0)} the bandwidth vector assigned to the observations outside L(f_α). The kernel density estimator is

    \hat{f}_{h^{(1)},h^{(0)}}(x) = (1/n) \sum_{j=1}^{n} { I_j K((x - x_j)./h^{(1)}) / (h_1^{(1)} h_2^{(1)} \cdots h_d^{(1)}) + (1 - I_j) K((x - x_j)./h^{(0)}) / (h_1^{(0)} h_2^{(0)} \cdots h_d^{(0)}) },    (4)

where ./ denotes element-wise division. The leave-one-out estimator is denoted \hat{f}_{h^{(1)},h^{(0)},i}(x_i) for i = 1, 2, ..., n. As the low-density region becomes the tail area when the underlying density is unimodal, we call (4) the tail-adaptive estimator for simplicity.
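The estimator (4) with product Gaussian kernels can be sketched as follows. The rule used here to flag the low-density region, taking the observations whose pilot density falls at or below its empirical α-quantile, is our own approximation for illustration and not necessarily the paper's exact rule.

```python
import numpy as np

def kde_tail_adaptive(x, data, in_ldr, h1, h0):
    """Tail-adaptive estimator (4): observations flagged by the boolean
    vector in_ldr use bandwidth vector h1, the others use h0."""
    d = data.shape[1]
    H = np.where(in_ldr[:, None], h1, h0)          # (n, d) bandwidth per observation
    u = (x - data) / H
    kern = np.exp(-0.5 * np.sum(u**2, axis=1)) / (2 * np.pi) ** (d / 2)
    return np.mean(kern / np.prod(H, axis=1))

def low_density_flags(pilot_density, alpha):
    """Flag observations whose pilot density is at or below its empirical
    alpha-quantile, approximating the (100*alpha)% low-density region."""
    return pilot_density <= np.quantile(pilot_density, alpha)
```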

The value of α can be chosen as either 5% or 10%. As f(x) is unknown, the initial value of f_α can be approximated through the kernel density estimator of f(x) with a global bandwidth.

2.3. Posterior of bandwidth parameters

Given h^{(1)} and h^{(0)}, the approximate likelihood is

    ℓ(x_1, x_2, ..., x_n | h^{(1)}, h^{(0)}) = \prod_{i=1}^{n} \hat{f}_{h^{(1)},h^{(0)},i}(x_i).    (5)

As suggested by Zhang et al. (2006), we assume that the prior of each bandwidth is the truncated Cauchy density

    p(h_k^{(l)}) ∝ 1 / (1 + (h_k^{(l)})^2), for h_k^{(l)} > 0,

for k = 1, 2, ..., d, and l = 0 and 1. The posterior of h^{(1)} and h^{(0)} given {x_1, x_2, ..., x_n} is

    π(h^{(1)}, h^{(0)} | x_1, x_2, ..., x_n) ∝ { \prod_{i=1}^{n} \hat{f}_{h^{(1)},h^{(0)},i}(x_i) } { \prod_{k=1}^{d} p(h_k^{(1)}) p(h_k^{(0)}) }.    (6)

The posterior given by (6) is of non-standard form, and we cannot derive an analytical expression for the estimate of {h^{(1)}, h^{(0)}}. However, we can use the random-walk Metropolis algorithm to sample {h^{(1)}, h^{(0)}} from (6). The sampling procedure is as follows.

1) Obtain the initial low-density region for a given α based on the kernel density estimator with a global bandwidth vector chosen via the NRR.

2) Assign initial values to h^{(1)} and h^{(0)}, which are, respectively, the bandwidth vectors given to observations within the low- and high-density regions specified in Step 1).

3) Let h denote the vector of all elements of h^{(1)} and h^{(0)}. Use the random-walk Metropolis algorithm to update h, with the acceptance probability computed through the posterior given by (6).

4) Derive the low-density region according to the density estimator with the bandwidth vectors updated in Step 3).

5) Repeat Steps 3) and 4) until the simulated chain of h achieves reasonable mixing performance.

During the above iteration process, we discard the draws from the burn-in period and record the draws of h thereafter. Let {\tilde{h}^{(1)}, \tilde{h}^{(2)}, ..., \tilde{h}^{(M)}} denote the recorded draws. The posterior mean (or ergodic average), M^{-1} \sum_{i=1}^{M} \tilde{h}^{(i)}, is an estimate of h.
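A compact illustration of Steps 2)-3) and the posterior (6) is given below, assuming product Gaussian kernels and the diagonal parameterisation above. For brevity the low-density region is held fixed rather than updated as in Step 4), and the proposal is a random walk on log(h), with the corresponding Jacobian term, so that bandwidths stay positive; this is a common device for positive parameters and not necessarily the authors' exact implementation.

```python
import numpy as np

def loo_tail_adaptive_loglik(data, in_ldr, h1, h0):
    """Log of the leave-one-out likelihood (5) under the tail-adaptive estimator."""
    n, d = data.shape
    H = np.where(in_ldr[:, None], h1, h0)                     # bandwidth attached to each x_j
    u = (data[:, None, :] - data[None, :, :]) / H[None, :, :]
    kern = np.exp(-0.5 * np.sum(u**2, axis=2)) / (2 * np.pi) ** (d / 2)
    kern /= np.prod(H, axis=1)[None, :]                       # scale by each x_j's bandwidth product
    np.fill_diagonal(kern, 0.0)                               # leave one out
    return np.sum(np.log(kern.sum(axis=1) / (n - 1)))

def log_posterior(h, data, in_ldr):
    """Log of (6): leave-one-out log-likelihood plus truncated-Cauchy log-priors."""
    d = data.shape[1]
    if np.any(h <= 0):
        return -np.inf
    log_prior = -np.sum(np.log1p(h**2))                       # log(1/(1+h^2)), up to a constant
    return loo_tail_adaptive_loglik(data, in_ldr, h[:d], h[d:]) + log_prior

def rw_metropolis(data, in_ldr, h_init, n_iter=13000, step=0.1, seed=0):
    """Random-walk Metropolis on log(h); n_iter = 13,000 mirrors the paper's
    3,000 burn-in plus 10,000 recorded iterations."""
    rng = np.random.default_rng(seed)
    h = np.asarray(h_init, dtype=float)
    lp = log_posterior(h, data, in_ldr)
    draws = np.empty((n_iter, h.size))
    for it in range(n_iter):
        prop = h * np.exp(step * rng.standard_normal(h.size))  # symmetric move in log(h)
        lp_prop = log_posterior(prop, data, in_ldr)
        # acceptance ratio includes the Jacobian of the log transform
        if np.log(rng.uniform()) < lp_prop - lp + np.sum(np.log(prop / h)):
            h, lp = prop, lp_prop
        draws[it] = h
    return draws
```

After discarding the burn-in draws, the column means of the remaining draws give the ergodic-average estimates of h^{(1)} and h^{(0)}.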

3. A Monte Carlo simulation study

To investigate the performance of the proposed tail-adaptive kernel density estimator, we approximate the Kullback-Leibler information via Monte Carlo simulation. A large number of random vectors are drawn from f(x) so as to approximate (2) numerically. A bandwidth estimation method is better than its competitor if the former produces a smaller Kullback-Leibler information than the latter.
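In this spirit, the second integral in (2) can be approximated by averaging log-density ratios over draws from the true density, as in the sketch below; the sampler, the true log-density and the fitted estimator are passed in as callables, and the names are illustrative only.

```python
import numpy as np

def kl_monte_carlo(true_logpdf, est_pdf, sampler, n_draws=100_000, seed=0):
    """Monte Carlo approximation of (2): E_f[log f(X) - log fhat(X)],
    with the expectation replaced by an average over draws from f."""
    rng = np.random.default_rng(seed)
    x = sampler(rng, n_draws)                                  # (n_draws, d) from the true density
    log_fhat = np.log(np.array([est_pdf(xi) for xi in x]))
    return np.mean(true_logpdf(x) - log_fhat)

# Example with a bivariate standard normal as the "true" density:
# true_logpdf = lambda x: -0.5 * np.sum(x**2, axis=1) - np.log(2 * np.pi)
# sampler     = lambda rng, m: rng.standard_normal((m, 2))
# est_pdf     = lambda xi: kde_tail_adaptive(xi, data, in_ldr, h1_hat, h0_hat)  # earlier sketch
```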

3.1. True densities

We simulate samples from six target bivariate densities, labeled A, B, C, D, E and F (refer to Hu et al., 2010, for univariate target densities, their contour plots and the corresponding simulation results). These densities have irregular shapes.

Density A is a mixture of two equally weighted normal densities and is bimodal:

    f_A(x | µ_1, Σ_1, µ_2, Σ_2) = (1/2) φ(x | µ_1, Σ_1) + (1/2) φ(x | µ_2, Σ_2),

where φ(x | µ, Σ) is a multivariate normal density with mean µ and variance-covariance matrix

    Σ = [ 1  ρ
          ρ  1 ].    (7)

We used µ_1 = (-1.5, -1.5)', µ_2 = (2, 2)', ρ = 0.3 for Σ_1 and ρ = 0.9 for Σ_2.

Density B is a mixture of two normal densities with different weights but equal heights at the modes:

    f_B(x | µ_1, Σ_1, µ_2, Σ_2) = (3/4) φ(x | µ_1, Σ_1) + (1/4) φ(x | µ_2, Σ_2),

where µ_1 = (-1.5, -1.5)' and µ_2 = (1.5, 1.5)'. Σ is defined in (7) with ρ = 0.5 in Σ_1, and Σ_2 = (1/3) Σ_1.

Density C is a mixture of two skew-normal densities proposed by Azzalini and Valle (1996):

    f_C(x | µ_1, γ_1, µ_2, γ_2, Σ) = (1/2) · 2 φ(x | µ_1, Σ) Φ(γ_1'(x - µ_1)) + (1/2) · 2 φ(x | µ_2, Σ) Φ(γ_2'(x - µ_2)),

where Φ(·) is the cumulative distribution function (CDF) of the standard normal distribution. We chose µ_1 = (-0.5, -0.5)', γ_1 = (-9, -9)', µ_2 = (0, 0)', γ_2 = (9, 9)', and Σ is defined in (7) with ρ = 0.3. Note that γ_1, γ_2 ∈ R^2 are the shape parameters determining the skewness.

Density D is a Student t density, denoted t_d(x | µ, Σ, ν).

Density E is a mixture of two Student t densities with ν = 5:

    f_E(x | µ_1, µ_2, Σ_1, Σ_2, ν) = 0.5 t_d(x | µ_1, Σ_1, ν) + 0.5 t_d(x | µ_2, Σ_2, ν),

where µ_1 = (-2, 0)', µ_2 = (2, 0)', and Σ_1 and Σ_2 are defined in (7) with ρ = 0.5 and -0.5, respectively.

Density F is a skew-t density proposed by Azzalini and Capitanio (2003):

    f_F(x | µ, Σ, γ, ν) = 2 t_d(x | µ, Σ, ν) T_d(x̃; ν + d),

where x̃ = γ'ω^{-1}(x - µ) [ (ν + d) / ((x - µ)'Σ^{-1}(x - µ) + ν) ]^{1/2}, µ = (0, 0)', γ = (-2, 0)', and Σ is an identity matrix. ω is a diagonal matrix with diagonal elements the same as those of Σ, and T_d(·; ν + d) is the CDF of the Student t distribution with ν + d degrees of freedom. The contour plot of each of the six bivariate densities can be found in Figure 2 of Hu et al. (2010).

3.2. Accuracy of our Bayesian bandwidth estimation

We generated samples of size n = 500, 1000 and 2000 from each of the six bivariate densities. The kernel function is the product of univariate Gaussian kernels. We estimated the bandwidth vectors for the proposed tail-adaptive kernel density estimator with α = 0.05 and 0.1. A global bandwidth vector was also estimated via the Bayesian method of Zhang et al. (2006) and the NRR.

We used the random-walk Metropolis algorithm to sample all bandwidths from their posterior. The burn-in period contains 3,000 iterations, and the following 10,000 iterations were recorded. We computed the batch-mean standard deviation discussed by Roberts (1996) and the simulation inefficiency factor (SIF) discussed by Kim et al. (1998) to monitor the mixing performance. Both indicators show that the sampler has achieved reasonable mixing; refer to Hu et al. (2010) for graphs allowing visual inspection of the mixing performance.

Consider a sample generated from f_F(x) with α = 0.05. Table 1 presents the MCMC results, where the SIFs are very small and indicate reasonable mixing of the proposed sampler. The tail-adaptive density estimator clearly captures the fat-tailed feature of f_F. For example, the estimates of both components of h^{(1)} for observations inside the low-density region are, respectively, much larger than the estimates of both components of h^{(0)} for observations outside this region.
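The two mixing diagnostics reported in Table 1 can be approximated from the recorded draws of a single bandwidth parameter along the following lines; this is a standard batch-means construction, and the batch count is an arbitrary choice of ours, not taken from the paper.

```python
import numpy as np

def batch_mean_sd(draws, n_batches=50):
    """Batch-mean standard deviation (standard error) of the ergodic average
    for a 1-D array of recorded draws of one parameter."""
    draws = np.asarray(draws, dtype=float)
    m = len(draws) // n_batches
    batch_means = draws[: m * n_batches].reshape(n_batches, m).mean(axis=1)
    return batch_means.std(ddof=1) / np.sqrt(n_batches)

def sif(draws, n_batches=50):
    """Simulation inefficiency factor, approximated as the squared batch-mean
    standard error divided by the naive i.i.d. squared standard error."""
    draws = np.asarray(draws, dtype=float)
    naive_var = draws.var(ddof=1) / len(draws)
    return batch_mean_sd(draws, n_batches) ** 2 / naive_var
```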

We generated N = 100,000 random vectors from the true density and calculated the estimated Kullback-Leibler information defined by (2). As shown in Table 2, for all six densities considered, the tail-adaptive density estimator clearly performs better than the global-bandwidth density estimators with the bandwidth vector estimated through Bayesian sampling or the NRR. The results also indicate that the performance of the tail-adaptive density estimator is not very sensitive to the value of α.

The mean integrated squared error (MISE) was also used to examine the performance of the tail-adaptive density estimator. We numerically approximated the MISE through 200 data sets for each of the bivariate densities with sample sizes 500, 1000 and 2000. Table 3 shows that the tail-adaptive estimator always outperforms the global-bandwidth estimator.

4. Tail-adaptive density estimation for high dimensions

Our proposed Bayesian sampling algorithm for estimating bandwidths in tail-adaptive kernel density estimation is applicable to data of any dimension. In this section, we examine the performance of the tail-adaptive estimator in comparison with the global-bandwidth estimator.

4.1. True densities

We consider four target densities labeled G, H, I and J. Density G is a mixture of two multivariate normal densities:

    f_G(x | µ_1, µ_2, Σ_1, Σ_2) = (1/2) φ(x | µ_1, Σ_1) + (1/2) φ(x | µ_2, Σ_2),

where µ_1 = (-1.5, -1.5, -1.5, -1.5, -1.5)', µ_2 = (2, 2, 2, 2, 2)', and both variance-covariance matrices have the form

    Σ = [ 1    ρ    ρ^2  ρ^3  ρ^4
          ρ    1    ρ    ρ^2  ρ^3
          ρ^2  ρ    1    ρ    ρ^2
          ρ^3  ρ^2  ρ    1    ρ
          ρ^4  ρ^3  ρ^2  ρ    1 ],    (8)

with ρ = 0.3 for Σ_1 and ρ = 0.9 for Σ_2.

Density H is a multivariate skew-normal density:

    f_H(x | µ, Σ, γ) = 2 φ(x | µ, Σ) Φ(γ'(x - µ)),

where Σ is given by (8) with ρ = 0.9, µ = (-0.5, -0.5, -0.5, -0.5, -0.5)', and γ = (-9, -9, -9, -9, -9)'.

Density I is a mixture of two multivariate Student t densities:

    f_I(x | µ_1, µ_2, Σ_1, Σ_2, ν) = 0.5 t_d(x | µ_1, Σ_1, ν) + 0.5 t_d(x | µ_2, Σ_2, ν),

where µ_1 = (-2, 0, -2, 0, -2)', µ_2 = (2, 0, 2, 0, 2)', ν = 5, and Σ_1 and Σ_2 are defined by (8) with ρ = 0.5 and ρ = -0.5, respectively.

Density J is a multivariate skew-t density:

    f_J(x | µ, Σ, γ, ν) = 2 t_d(x | µ, Σ, ν) T_d(x̃; ν + d),

where µ = 0, ν = 5, Σ is an identity matrix, and x̃ is defined in the same way as in the bivariate f_F(x), with γ = (2, 0, 2, 0, 2)'.

4.2. Accuracy of our Bayesian bandwidth estimation

We generated samples of sizes n = 500, 1000 and 2000 from each of the five-dimensional densities. Table 4 presents the estimated Kullback-Leibler information between the true density and its estimator obtained under each of

the three bandwidth estimation methods. The kernel density estimator with tail-adaptive bandwidth vectors clearly outperforms its counterpart with a global bandwidth vector estimated via either the NRR or Bayesian sampling. This finding is consistent with what we found in the bivariate situation. For all sample sizes and each density, we found that the tail-adaptive kernel density estimator with α = 0.1 slightly outperforms the same estimator with α = 0.05. However, we would be reluctant to conclude that the former performs better than the latter because the difference is marginal. Due to the heavy computational burden required by the numerical computation of the MISE for five-dimensional densities, we have not computed the MISE in this section.

5. An application of the tail-adaptive density estimator

In this section, we apply the proposed tail-adaptive kernel density estimator to the estimation of the bivariate density of stock-index returns. We downloaded the daily closing values of the S&P 500 index in the USA stock market and the All Ordinaries (AORD) index in the Australian stock market, from the 2nd January 2006 to the 16th September 2010, excluding non-trading days. The AORD return was matched to the overnight S&P 500 return. Let P_t denote the closing index at date t. The daily continuously compounded return in percentage form was computed as (ln P_t - ln P_{t-1}) × 100. The sample period covers the current global financial crisis, during which there were some extreme observations.
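The return transformation is straightforward to reproduce; the sketch below assumes two closing-price arrays already aligned on common trading days (the alignment and overnight matching themselves are data handling that we do not reproduce, and the variable names are hypothetical).

```python
import numpy as np

def pct_log_returns(prices):
    """Daily continuously compounded returns in percent: (ln P_t - ln P_{t-1}) * 100."""
    prices = np.asarray(prices, dtype=float)
    return 100.0 * np.diff(np.log(prices))

# Hypothetical aligned closing-price series:
# sp500_close, aord_close = ...
# returns = np.column_stack([pct_log_returns(sp500_close),
#                            pct_log_returns(aord_close)])
```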

We used the random-walk Metropolis algorithm to estimate bandwidths for the tail-adaptive kernel density estimator of the bivariate returns, with α = 0.05. We also estimated a global bandwidth via the NRR and via Bayesian sampling. The results are given in Table 5, where the SIFs indicate reasonable mixing performance for both sampling algorithms. Moreover, the use of tail-adaptive bandwidths leads to a much larger marginal likelihood than the use of a global bandwidth in this bivariate kernel density estimator, where the marginal likelihoods were calculated through the method of Newton and Raftery (1994). Thus, the use of tail-adaptive bandwidths is strongly favored against the use of a global bandwidth. The density surface and contour plot for each of the three density estimators are presented in Hu et al. (2010), where the tail-adaptive estimator is shown to capture richer dynamics than the other two.

Let x_t and y_t denote the S&P 500 and AORD daily returns, respectively. We used the estimated tail-adaptive bandwidths to compute the conditional density of y_t at a given value of x_t through f(y | x_t = x) = f(y, x)/f_x(x), where f(y, x) is the joint density of (y_t, x_t) and f_x(x) is the marginal density of x_t. Bandwidths estimated for a joint density can be used to compute the conditional density (Holmes et al., 2010; Polak et al., 2010). We computed the conditional density of y_t given that x_t is at its 10th, 5th and 1st percentiles, which correspond to percentage return values of -0.75, -1.13 and -2.24, respectively.

We are also able to compute the conditional probability Pr{y_t ≤ y | x_t ≤ x} = Pr{y_t ≤ y, x_t ≤ x} / Pr{x_t ≤ x}. For example, Pr{y_t ≤ 0 | x_t ≤ 0} = 0.67. This means that when the USA stock market finished daily trading with a negative return, there was a 67% chance that the Australian market would also drop. Such a percentage suggests that the Australian market followed the USA market during this period.

However, this result should not be interpreted as blaming the problems in the Australian stock market on the USA stock market; it is simply evidence of market dependence.

In addition, we can estimate the conditional CDF of y_t for given x_t = x:

    F(y | x_t = x) = Pr{y_t ≤ y | x_t = x} = (1/f_x(x)) \int_{-∞}^{y} f(z, x) dz.    (9)

Given that the USA market finished daily trading with the S&P 500 return at x%, the probability that the Australian market dropped beyond the same level is obtained by setting y = x in (9). At the above-mentioned percentiles of the S&P 500 return, the probabilities that the AORD return was below the same levels are 0.27, 0.24 and 0.12, respectively. This indicates that the probability that the Australian market had a larger drop than the USA market was no more than 27%. See Hu et al. (2010) for graphs of the conditional density and its CDF.

Moreover, we evaluated the performance of our value-at-risk (VaR) forecasts in comparison with the VaR obtained through the IGARCH model (or RiskMetrics). We followed the steps described in Bao et al. (2006) and calculated the check function of Koenker and Bassett Jr (1978). The existing sample was used as the learning set, and we forecasted the daily 5% VaR of the AORD return from the 17th September 2010 to the 5th May 2011. The check function computed under our proposed method is smaller than that computed under the IGARCH model, indicating that our approach to VaR estimation is more effective than RiskMetrics.
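The check (pinball) function of Koenker and Bassett Jr (1978) used for this comparison can be computed as follows; the VaR forecasts themselves come from whichever model is being evaluated, so they enter only as an input array, and the variable names are illustrative.

```python
import numpy as np

def check_loss(returns, var_forecasts, alpha=0.05):
    """Average check (pinball) loss for alpha-quantile (VaR) forecasts:
    rho_alpha(u) = u * (alpha - 1{u < 0}); smaller is better."""
    returns = np.asarray(returns, dtype=float)
    u = returns - np.asarray(var_forecasts, dtype=float)      # forecast errors
    return np.mean(u * (alpha - (u < 0)))

# Example: compare two sets of out-of-sample 5% VaR forecasts for the AORD return
# loss_kde    = check_loss(aord_oos, var_kde)      # tail-adaptive KDE based VaR
# loss_igarch = check_loss(aord_oos, var_igarch)   # IGARCH / RiskMetrics VaR
```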

6. Conclusion

This paper proposes a kernel density estimator with tail-adaptive bandwidths, which are assigned to the low- and high-density regions, respectively. We have derived the posterior of the bandwidth parameters based on the Kullback-Leibler information and presented an MCMC sampling algorithm to estimate the bandwidths. The Monte Carlo simulation study shows that the tail-adaptive kernel density estimator outperforms its competitor, the global-bandwidth kernel density estimator. The simulation results also show that the improvement made by the tail-adaptive kernel density estimator is especially obvious when the underlying density is fat-tailed. Although the probability of the low-density region has to be chosen before sampling is carried out, we have found that the performance of the tail-adaptive kernel estimator is not sensitive to different values of this probability. Future research could include this probability as a parameter to be estimated via the same sampling procedure.

We applied the tail-adaptive kernel density estimator to the estimation of the bivariate density of the paired daily returns of the Australian All Ordinaries and S&P 500 indices during a period that covers the global financial crisis. The tail-adaptive density estimator is strongly favored against the global-bandwidth density estimator. We derived the estimated conditional density and distribution of the Australian index return given that the USA market finished daily trading with different return values. Although the Australian stock market followed the USA market during the crisis, there was no more than a 27% chance that the former market had a larger drop than the latter.

Our approach can be viewed as mode estimation in clustering analysis. Even though our algorithm can be implemented for data of any dimension, the curse of dimensionality is a major concern for kernel density estimation. Ferraty and Vieu (2006) suggested attacking the dimensionality problem from a functional setting, even for data of infinite dimensions. It might be possible to explore whether the proposed Bayesian approach is applicable in this situation. We leave this for future research.

Acknowledgements

We extend our sincere thanks to the Co-Editor Stanley Azen, an associate editor and two reviewers for their very insightful comments that have led to a substantially improved paper. Thanks also go to the Victorian Partnership for Advanced Computing (VPAC) for its quality computing facility.

References

Abramson, I., 1982a. Arbitrariness of the pilot estimator in adaptive kernel methods. Journal of Multivariate Analysis 12.

Abramson, I., 1982b. On bandwidth variation in kernel estimates - a square root law. The Annals of Statistics 10.

Azzalini, A., Capitanio, A., 2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. Journal of the Royal Statistical Society, Series B 65.

Azzalini, A., Valle, A., 1996. The multivariate skew-normal distribution. Biometrika 83.

Bao, Y., Lee, T., Saltoglu, B., 2006. Evaluating predictive performance of value-at-risk models in emerging markets: a reality check. Journal of Forecasting 25.

Breiman, L., Meisel, W., Purcell, E., 1977. Variable kernel estimates of multivariate densities. Technometrics 19.

Brewer, M., 2000. A Bayesian model for local smoothing in kernel density estimation. Statistics and Computing 10.

Cao, R., 2001. Relative efficiency of local bandwidths in kernel density estimation. Statistics 35.

Devroye, L., Györfi, L., 1985. Nonparametric Density Estimation: The L1 View. John Wiley & Sons, New York.

Duong, T., Hazelton, M., 2003. Plug-in bandwidth matrices for bivariate kernel density estimation. Journal of Nonparametric Statistics 15.

Duong, T., Hazelton, M., 2005. Cross-validation bandwidth matrices for multivariate kernel density estimation. Scandinavian Journal of Statistics 32.

Ferraty, F., Vieu, P., 2006. Nonparametric Functional Data Analysis: Theory and Practice. Springer, New York.

Gangopadhyay, A., Cheung, K., 2002. Bayesian approach to the choice of smoothing parameter in kernel density estimation. Journal of Nonparametric Statistics 14.

Hall, P., 1987. On Kullback-Leibler loss and density estimation. The Annals of Statistics 15.

Härdle, W., 1991. Smoothing Techniques: with Implementation in S. Springer, New York.

Hartigan, J.A., 1975. Clustering Algorithms. John Wiley & Sons, New York.

Hartigan, J.A., 1987. Estimation of a convex density contour in two dimensions. Journal of the American Statistical Association 82.

Holmes, M.P., Gray, A.G., Isbell Jr, C.L., 2010. Fast kernel conditional density estimation: A dual-tree Monte Carlo approach. Computational Statistics & Data Analysis 54.

Horová, I., Vieu, P., Zelinka, H., 2002. Optimal choice of nonparametric estimates of a density and of its derivatives. Statistics & Decisions 20.

Hu, S., Poskitt, D., Zhang, X., 2010. Bayesian Adaptive Bandwidth Kernel Density Estimation of Irregular Multivariate Distributions. Monash Econometrics and Business Statistics Working Papers.

Hyndman, R.J., 1996. Computing and graphing highest density regions. The American Statistician 50.

Izenman, A.J., 1991. Recent developments in nonparametric density estimation. Journal of the American Statistical Association 86.

Jácome, M., Gijbels, I., Cao, R., 2008. Comparison of presmoothing methods in kernel density estimation under censoring. Computational Statistics 23.

Jones, M.C., 1990. Variable kernel density estimates and variable kernel density estimates. Australian & New Zealand Journal of Statistics 32.

Jones, M.C., Marron, J.S., Sheather, S.J., 1996. A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association 91.

Kim, S., Shephard, N., Chib, S., 1998. Stochastic volatility: Likelihood inference and comparison with ARCH models. Review of Economic Studies 65.

Koenker, R., Bassett Jr, G., 1978. Regression quantiles. Econometrica 46.

Kulasekera, K.B., Padgett, W.J., 2006. Bayes bandwidth selection in kernel density estimation with censored data. Journal of Nonparametric Statistics 18.

de Lima, M., Atuncar, G., 2010. A Bayesian method to estimate the optimal bandwidth for multivariate kernel estimator. Journal of Nonparametric Statistics 23.

Loftsgaarden, D.O., Quesenberry, C.P., 1965. A nonparametric estimate of a multivariate density function. The Annals of Mathematical Statistics 36.

Marron, J., 1992. Bootstrap bandwidth selection, in: LePage, R., Billard, L. (Eds.), Exploring the Limits of Bootstrap. John Wiley & Sons, New York.

Marron, J.S., 1987. A comparison of cross-validation techniques in density estimation. The Annals of Statistics 15.

Marron, J.S., Nolan, D., 1988. Canonical kernels for density estimation. Statistics & Probability Letters 7.

Mason, D.M., Polonik, W., 2009. Asymptotic normality of plug-in level set estimates. The Annals of Applied Probability 19.

Mielniczuk, J., Sarda, P., Vieu, P., 1989. Local data-driven bandwidth choice for density estimation. Journal of Statistical Planning and Inference 23.

Newton, M.A., Raftery, A.E., 1994. Approximate Bayesian inference with the weighted likelihood bootstrap. Journal of the Royal Statistical Society, Series B 56.

Nolan, D., Marron, J., 1989. Uniform consistency of automatic and location-adaptive delta-sequence estimators. Probability Theory and Related Fields 80.

Polak, J., Zhang, X., King, M.L., 2010. Bandwidth selection for kernel conditional density estimation using the MCMC method. Manuscript presented at the Australian Statistical Conference, 6-10 December, Western Australia.

Roberts, G.O., 1996. Markov chain concepts related to sampling algorithms, in: Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (Eds.), Markov Chain Monte Carlo in Practice. Chapman & Hall, London.

Sain, S.R., 2002. Multivariate locally adaptive density estimation. Computational Statistics & Data Analysis 39.

Sain, S.R., Baggerly, K.A., Scott, D.W., 1994. Cross-validation of multivariate densities. Journal of the American Statistical Association 89.

Sain, S.R., Scott, D.W., 1996. On locally adaptive density estimation. Journal of the American Statistical Association 91.

Sain, S.R., Scott, D.W., 2002. Zero-bias locally adaptive density estimators. Scandinavian Journal of Statistics 29.

Samworth, R.J., Wand, M.P., 2010. Asymptotics and optimal bandwidth selection for highest density region estimation. The Annals of Statistics 38.

Scott, D.W., 1992. Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley & Sons, New York.

Terrell, G.R., Scott, D.W., 1992. Variable kernel density estimation. The Annals of Statistics 20.

Vieu, P., 1999. Multiple kernel procedure: An asymptotic support. Scandinavian Journal of Statistics 26.

Wand, M.P., Jones, M.C., 1994. Multivariate plug-in bandwidth selection. Computational Statistics 9.

Wand, M.P., Jones, M.C., 1995. Kernel Smoothing. Chapman & Hall, New York.

Zhang, X., King, M.L., Hyndman, R.J., 2006. A Bayesian approach to bandwidth selection for multivariate kernel density estimation. Computational Statistics & Data Analysis 50.

Table 1: MCMC results obtained based on a sample generated from density F. Rows: the elements of h^{(1)} and h^{(0)}; columns: posterior mean, standard deviation, batch-mean standard deviation, SIF and acceptance rate. (Entries not reproduced.)

Table 2: Estimated Kullback-Leibler information for the bivariate densities f_A to f_F, by sample size n, for the global-bandwidth estimators (NRR and Bayesian) and the tail-adaptive estimator with α = 0.05 and α = 0.10. (Entries not reproduced.)

Table 3: Estimated MISE (× 100) for the bivariate densities f_A to f_F, by sample size n, for the global-bandwidth estimators (NRR and Bayesian) and the tail-adaptive estimator with α = 0.05 and α = 0.10. (Entries not reproduced.)

Table 4: Estimated Kullback-Leibler information for the 5-dimensional densities f_G to f_J, by sample size n, for the global-bandwidth estimators (NRR and Bayesian) and the tail-adaptive estimator with α = 0.05 and α = 0.10. (Entries not reproduced.)

Table 5: A summary of MCMC results obtained by applying our proposed Bayesian sampling algorithm to the tail-adaptive kernel density estimator of the S&P 500 and AORD returns. Rows: the NRR bandwidths, the Bayesian global bandwidths, and the tail-adaptive bandwidths h^{(1)} and h^{(0)} with α = 0.05; columns: mean, standard deviation, SIF, acceptance rate and log marginal likelihood. (Entries not reproduced.)


Bootstrap tests of multiple inequality restrictions on variance ratios Economics Letters 91 (2006) 343 348 www.elsevier.com/locate/econbase Bootstrap tests of multiple inequality restrictions on variance ratios Jeff Fleming a, Chris Kirby b, *, Barbara Ostdiek a a Jones Graduate

More information

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo Han Liu John Lafferty Larry Wasserman Statistics Department Computer Science Department Machine Learning Department Carnegie Mellon

More information

Expectation Propagation for Approximate Bayesian Inference

Expectation Propagation for Approximate Bayesian Inference Expectation Propagation for Approximate Bayesian Inference José Miguel Hernández Lobato Universidad Autónoma de Madrid, Computer Science Department February 5, 2007 1/ 24 Bayesian Inference Inference Given

More information

Modeling Ultra-High-Frequency Multivariate Financial Data by Monte Carlo Simulation Methods

Modeling Ultra-High-Frequency Multivariate Financial Data by Monte Carlo Simulation Methods Outline Modeling Ultra-High-Frequency Multivariate Financial Data by Monte Carlo Simulation Methods Ph.D. Student: Supervisor: Marco Minozzo Dipartimento di Scienze Economiche Università degli Studi di

More information

Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability

More information

Doing Bayesian Integrals

Doing Bayesian Integrals ASTR509-13 Doing Bayesian Integrals The Reverend Thomas Bayes (c.1702 1761) Philosopher, theologian, mathematician Presbyterian (non-conformist) minister Tunbridge Wells, UK Elected FRS, perhaps due to

More information

Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US

Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US Gerdie Everaert 1, Lorenzo Pozzi 2, and Ruben Schoonackers 3 1 Ghent University & SHERPPA 2 Erasmus

More information

Curve Fitting Re-visited, Bishop1.2.5

Curve Fitting Re-visited, Bishop1.2.5 Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

Index. Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables.

Index. Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables. Index Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables. Adaptive rejection metropolis sampling (ARMS), 98 Adaptive shrinkage, 132 Advanced Photo System (APS), 255 Aggregation

More information

On Bayesian Computation

On Bayesian Computation On Bayesian Computation Michael I. Jordan with Elaine Angelino, Maxim Rabinovich, Martin Wainwright and Yun Yang Previous Work: Information Constraints on Inference Minimize the minimax risk under constraints

More information

Bayesian inference for multivariate skew-normal and skew-t distributions

Bayesian inference for multivariate skew-normal and skew-t distributions Bayesian inference for multivariate skew-normal and skew-t distributions Brunero Liseo Sapienza Università di Roma Banff, May 2013 Outline Joint research with Antonio Parisi (Roma Tor Vergata) 1. Inferential

More information

POSTERIOR ANALYSIS OF THE MULTIPLICATIVE HETEROSCEDASTICITY MODEL

POSTERIOR ANALYSIS OF THE MULTIPLICATIVE HETEROSCEDASTICITY MODEL COMMUN. STATIST. THEORY METH., 30(5), 855 874 (2001) POSTERIOR ANALYSIS OF THE MULTIPLICATIVE HETEROSCEDASTICITY MODEL Hisashi Tanizaki and Xingyuan Zhang Faculty of Economics, Kobe University, Kobe 657-8501,

More information

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix Labor-Supply Shifts and Economic Fluctuations Technical Appendix Yongsung Chang Department of Economics University of Pennsylvania Frank Schorfheide Department of Economics University of Pennsylvania January

More information

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham NC 778-5 - Revised April,

More information

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error

More information

A review of some semiparametric regression models with application to scoring

A review of some semiparametric regression models with application to scoring A review of some semiparametric regression models with application to scoring Jean-Loïc Berthet 1 and Valentin Patilea 2 1 ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France

More information

Rank Regression with Normal Residuals using the Gibbs Sampler

Rank Regression with Normal Residuals using the Gibbs Sampler Rank Regression with Normal Residuals using the Gibbs Sampler Stephen P Smith email: hucklebird@aol.com, 2018 Abstract Yu (2000) described the use of the Gibbs sampler to estimate regression parameters

More information

Stat 516, Homework 1

Stat 516, Homework 1 Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

More information

Preface. 1 Nonparametric Density Estimation and Testing. 1.1 Introduction. 1.2 Univariate Density Estimation

Preface. 1 Nonparametric Density Estimation and Testing. 1.1 Introduction. 1.2 Univariate Density Estimation Preface Nonparametric econometrics has become one of the most important sub-fields in modern econometrics. The primary goal of this lecture note is to introduce various nonparametric and semiparametric

More information

Generalized Autoregressive Score Models

Generalized Autoregressive Score Models Generalized Autoregressive Score Models by: Drew Creal, Siem Jan Koopman, André Lucas To capture the dynamic behavior of univariate and multivariate time series processes, we can allow parameters to be

More information

Financial Econometrics and Volatility Models Copulas

Financial Econometrics and Volatility Models Copulas Financial Econometrics and Volatility Models Copulas Eric Zivot Updated: May 10, 2010 Reading MFTS, chapter 19 FMUND, chapters 6 and 7 Introduction Capturing co-movement between financial asset returns

More information

Kullback-Leibler Designs

Kullback-Leibler Designs Kullback-Leibler Designs Astrid JOURDAN Jessica FRANCO Contents Contents Introduction Kullback-Leibler divergence Estimation by a Monte-Carlo method Design comparison Conclusion 2 Introduction Computer

More information

Local Polynomial Modelling and Its Applications

Local Polynomial Modelling and Its Applications Local Polynomial Modelling and Its Applications J. Fan Department of Statistics University of North Carolina Chapel Hill, USA and I. Gijbels Institute of Statistics Catholic University oflouvain Louvain-la-Neuve,

More information

Markov Chain Monte Carlo in Practice

Markov Chain Monte Carlo in Practice Markov Chain Monte Carlo in Practice Edited by W.R. Gilks Medical Research Council Biostatistics Unit Cambridge UK S. Richardson French National Institute for Health and Medical Research Vilejuif France

More information

Data-Adaptive Multivariate Density Estimation Using Regular Pavings, With Applications to Simulation-Intensive Inference

Data-Adaptive Multivariate Density Estimation Using Regular Pavings, With Applications to Simulation-Intensive Inference Data-Adaptive Multivariate Density Estimation Using Regular Pavings, With Applications to Simulation-Intensive Inference A thesis submitted in partial fulfilment of the requirements for the Degree of Master

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our

More information

Probabilities & Statistics Revision

Probabilities & Statistics Revision Probabilities & Statistics Revision Christopher Ting Christopher Ting http://www.mysmu.edu/faculty/christophert/ : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 January 6, 2017 Christopher Ting QF

More information

Markov Chain Monte Carlo Methods

Markov Chain Monte Carlo Methods Markov Chain Monte Carlo Methods John Geweke University of Iowa, USA 2005 Institute on Computational Economics University of Chicago - Argonne National Laboaratories July 22, 2005 The problem p (θ, ω I)

More information

Hakone Seminar Recent Developments in Statistics

Hakone Seminar Recent Developments in Statistics Hakone Seminar Recent Developments in Statistics November 12-14, 2015 Hotel Green Plaza Hakone: http://www.hgp.co.jp/language/english/sp/ Organizer: Masanobu TANIGUCHI (Research Institute for Science &

More information

13: Variational inference II

13: Variational inference II 10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

A quick introduction to Markov chains and Markov chain Monte Carlo (revised version)

A quick introduction to Markov chains and Markov chain Monte Carlo (revised version) A quick introduction to Markov chains and Markov chain Monte Carlo (revised version) Rasmus Waagepetersen Institute of Mathematical Sciences Aalborg University 1 Introduction These notes are intended to

More information