Very preliminary, please do not cite without permission.

ESTIMATION AND FORECASTING LARGE REALIZED COVARIANCE MATRICES

LAURENT A. F. CALLOT
VU University Amsterdam, The Netherlands, CREATES, and the Tinbergen Institute.

ANDERS B. KOCK
CREATES, Aarhus University, Denmark.

MARCELO C. MEDEIROS
Department of Economics, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, RJ, Brazil.

Abstract. We consider forecasting large realized covariance matrices by penalized vector autoregressive models.

Keywords: Realized covariance; vector autoregression; shrinkage; Lasso; forecasting; portfolio allocation.
JEL codes: C22.

1. Introduction

This paper deals with modeling and forecasting large time-varying covariance matrices of daily returns on financial assets. Modern portfolio selection, risk management, and empirical asset pricing rely strongly on precise forecasts of the covariance matrix of the assets involved. For instance, the traditional mean-variance approach of Markowitz requires the estimation or modeling of all variances and covariances, leading to unstable results when applied to a large set of assets. As financial markets evolve, the number of assets grows, making the traditional approach less suitable for practitioners. Typical multivariate ARCH-type models fail to deliver reliable estimates due to the curse of dimensionality and a large computational burden. Solutions frequently used in practice are (1) a weighted average of past squared returns, as in the RiskMetrics methodology, or (2) the construction of factor models. In this paper we take a different route and consider the estimation of a vast vector autoregressive model for realized covariance matrices. To avoid the curse of dimensionality we advocate the use of the Least Absolute Shrinkage and Selection Operator (Lasso). The contributions of this paper are as follows.
First, we put forward a methodology to model and forecast large time-varying realized covariance matrices with a

E-mail addresses: l.callot@vu.nl, akock@creates.au.dk, mcm@econ.puc-rio.br. Parts of the research for this paper were done while the first and second authors were visiting the Department of Economics at the Pontifical Catholic University of Rio de Janeiro, Brazil. Its hospitality is gratefully appreciated. MCM's research is partially supported by CNPq/Brazil.
minimum number of restrictions. Our method can also shed some light on the drivers of the dynamics of these realized covariance matrices, as the Lasso also performs variable selection. Second, we derive an upper bound for the forecast error which is valid even in finite samples. Third, we show how this bound translates into a bound for the forecast error of the time-varying variance of a portfolio constructed from this large number of assets. Finally, we apply our methodology to the selection of a portfolio with mean-variance preferences.

The rest of the paper is organized as follows. Section 2 describes the problem setup, defines notation, and briefly presents the Lasso and some key assumptions. In Section 3 we present some theoretical results. The dataset and computational issues are discussed in Section 4. The empirical results are presented in Section 5. Finally, Section 6 concludes the paper.

2. Setup

In this section we put forward our methodology and present a finite sample upper bound on the forecast error of our procedure. Let $\Sigma_t$ denote the $n_T \times n_T$ population conditional covariance matrix as of time $t$, when conditioning on the $\sigma$-field $\sigma(\{\Sigma_s : s < t\})$. Note that the dimension $n_T$ of $\Sigma_t$ is indexed by the sample size $T$. This reflects the fact that $n_T$ may be large compared to $T$, and hence standard asymptotics, which take the dimension $n_T$ as fixed, may not accurately reflect the actual performance in finite samples. Since $\Sigma_t$ is allowed to depend on its past, it is a function of many variables. Defining $y_t = \operatorname{vech} \Sigma_t$, we shall assume that it follows a vector autoregression of order $p_T$, i.e.,

(1)    $y_t = \sum_{i=1}^{p_T} \Phi_i y_{t-i} + \epsilon_t, \qquad t = 1, \ldots, T$

where $\Phi_i$, $i = 1, \ldots, p_T$, are the $k_T \times k_T$ dimensional parameter matrices with $k_T = n_T(n_T+1)/2$ and $\epsilon_t \sim N_{k_T}(0, \Omega)$. Note that the dimension $k_T$ of the parameter matrices increases quadratically in $n_T$.
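To fix ideas, the half-vectorisation and the dimension count can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code (the paper's R implementation is described in Section 4); the helper name `vech` is ours.

```python
import numpy as np

def vech(S):
    # Stack the lower-triangular entries (including the diagonal) of a
    # symmetric matrix into a vector of length n(n+1)/2.
    rows, cols = np.tril_indices(S.shape[0])
    return S[rows, cols]

n = 30                    # number of assets
k = n * (n + 1) // 2      # dimension of y_t = vech(Sigma_t): 465 series
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
y = vech(Sigma)           # array([2. , 0.5, 1. ])
```

Even for a moderate cross-section of 30 assets, $y_t$ already stacks 465 variance and covariance series into one vector.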
So even for conditional covariance matrices $\Sigma_t$ of moderate dimension, the number of parameters in (1) may be very large. Hence, standard estimation techniques such as least squares may provide very imprecise parameter estimates, or even be infeasible if the number of variables is greater than the number of observations. To circumvent this problem we use the Least Absolute Shrinkage and Selection Operator (Lasso) of Tibshirani (1996), which is feasible even when the number of parameters to be estimated is (much) larger than the sample size. We suppress the dependence of $n_T$, $k_T$ and $p_T$ on $T$ to simplify notation. As mentioned in the introduction, we are concerned with stationary VARs, meaning that the roots of $\det(I_k - \sum_{j=1}^{p} \Phi_j z^j)$ lie outside the unit circle. Equivalently, all roots of the companion matrix $F$ must lie inside the unit disc. Let $\rho$ (the dependence on $T$ is suppressed) denote this largest root.

It is convenient to write the model in stacked form. To do so, let $Z_t = (y_{t-1}', \ldots, y_{t-p}')'$ be the $kp \times 1$ vector of explanatory variables at time $t$ in each equation and $X = (Z_T, \ldots, Z_1)'$ the $T \times kp$ matrix of covariates for each equation. Let $y_i = (y_{T,i}, \ldots, y_{1,i})'$ be the $T \times 1$ vector of observations on the $i$th variable ($i = 1, \ldots, k$) and $\epsilon_i = (\epsilon_{T,i}, \ldots, \epsilon_{1,i})'$ the corresponding vector of error terms. The fact that $y_i$ inherits the gaussianity of the error terms is particularly useful, since this means that $y_i$ has slim tails. Finally, $\beta_i^*$ is the $kp$-dimensional vector of true parameters for equation $i$, which also implicitly depends on $T$. Hence, we may write (1) equivalently
as

(2)    $y_i = X\beta_i^* + \epsilon_i, \qquad i = 1, \ldots, k$

such that each equation in (1) may be modeled separately. Or, taking one step back, each element in $\Sigma_t$ is modeled as in (2). The length of $\beta_i^*$, namely $kp$, may be much greater than the sample size if the original conditional covariance matrix $\Sigma_t$ is large. If, for example, $n = 30$, one has $k = 465$, which amounts to a total of 2325 parameters per equation if $p = 5$. As a consequence, traditional methods such as least squares will be inadequate in such situations, and we turn to the Lasso instead.

2.1. Notation. Let $J_i = \{j : \beta_{i,j}^* \neq 0\} \subseteq \{1, \ldots, kp\}$ denote the set of non-zero parameters in equation $i$ and $s_i = |J_i|$ its cardinality. Let $\bar{s} = \max\{s_1, \ldots, s_k\}$ and let $\Psi_T = \frac{1}{T} X'X$ be the $kp \times kp$ scaled Gramian of $X$. For any $x \in \mathbb{R}^m$, $\|x\| = \sqrt{\sum_{i=1}^{m} x_i^2}$, $\|x\|_{\ell_1} = \sum_{i=1}^{m} |x_i|$ and $\|x\|_{\ell_\infty} = \max_{1 \le i \le m} |x_i|$ denote the $\ell_2$, $\ell_1$ and $\ell_\infty$ norms, respectively (most often $m = kp$ or $m = s_i$ in the sequel). When regarding the $m \times m$ matrix $A$ as a linear operator from $\mathbb{R}^m$ to $\mathbb{R}^m$ equipped with either the $\ell_1$- or the $\ell_2$-norm, $\|A\|$ and $\|A\|_{\ell_1}$ denote the induced operator norms. $\|A\|_\infty$ shall denote the maximum absolute entry of $A$; note that it is not induced by the $\ell_\infty$-norm. For any vector $\delta \in \mathbb{R}^n$ and a subset $J \subseteq \{1, \ldots, n\}$ we let $\delta_J$ denote the vector consisting only of those elements of $\delta$ indexed by $J$. For any two real numbers $a$ and $b$, $a \vee b = \max(a, b)$ and $a \wedge b = \min(a, b)$. Let $\sigma_{i,y}^2$ denote the variance of $y_{t,i}$ and $\sigma_{i,\epsilon}^2$ the variance of $\epsilon_{t,i}$, $1 \le i \le k$. Then define $\sigma_T = \max_{1 \le i \le k} (\sigma_{i,y} \vee \sigma_{i,\epsilon})$.

2.2. The Lasso. The Lasso was proposed by Tibshirani (1996). Its theoretical properties have been studied intensively since then; see e.g. Zhao and Yu (2006), Meinshausen and Bühlmann (2006), Bickel et al. (2009), and Bühlmann and Van De Geer (2011), to mention just a few. It is known that it only selects the correct model asymptotically under rather restrictive conditions on the dependence structure of the covariates.
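The stacked form (2) amounts to building one lagged design matrix shared by all $k$ equations. A minimal sketch (the helper `build_design` is hypothetical, introduced here for illustration only):

```python
import numpy as np

def build_design(Y, p):
    # Y: (T_total x k) array of observations y_1', ..., y_{T_total}' (oldest first).
    # Returns the response block Y[p:] and the design matrix X whose row for
    # period t is (y_{t-1}', ..., y_{t-p}'), so X has kp columns.
    T_total, k = Y.shape
    X = np.hstack([Y[p - j - 1 : T_total - j - 1] for j in range(p)])
    return Y[p:], X

Y = np.arange(12.0).reshape(6, 2)   # toy data: 6 periods, k = 2 series
y_resp, X = build_design(Y, p=2)    # X is 4 x 4: each equation has kp = 4 regressors
```

With $k = 465$ and $p = 5$, the same construction yields a design matrix with 2325 columns, so each of the 465 regressions in (2) has far more regressors than a typical sample has observations.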
However, we shall see that it can still serve as an effective screening device in these situations. Put differently, it can remove many irrelevant covariates while retaining the relevant ones and estimating their coefficients with high precision. We investigate the properties of the Lasso when applied to each equation $i = 1, \ldots, k$ separately. The Lasso estimates $\beta_i^*$ in (2) by minimizing the following objective function

(3)    $L(\beta_i) = \frac{1}{T}\|y_i - X\beta_i\|^2 + 2\lambda_T \|\beta_i\|_{\ell_1}$

where $\lambda_T$ is a sequence to be defined exactly below. (3) is basically the least squares objective function plus an extra term penalizing parameters that are different from zero.

2.3. The restricted eigenvalue condition. If $kp > T$ the Gram matrix $\Psi_T$ is singular, or equivalently,

(4)    $\min_{\delta \in \mathbb{R}^{kp} \setminus \{0\}} \frac{\delta' \Psi_T \delta}{\|\delta\|^2} = 0.$

In that case ordinary least squares is infeasible. However, for the Lasso, Bickel et al. (2009) observed that the minimum in (4) can be replaced by a minimum over a much
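A toy coordinate descent solver for the objective (3) can be sketched as follows. This is an illustrative implementation of the generic soft-thresholding update, not the paper's estimator (the empirical work uses glmnet via the lassovar R package):

```python
import numpy as np

def soft_threshold(z, g):
    # Soft-thresholding operator, the building block of coordinate descent.
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Minimize (1/T)||y - X b||^2 + 2*lam*||b||_1 by cyclic coordinate descent.
    T, m = X.shape
    b = np.zeros(m)
    col_sq = (X ** 2).sum(axis=0) / T   # (1/T) X_j'X_j for each column j
    r = y - X @ b                       # current residual
    for _ in range(n_iter):
        for j in range(m):
            r += X[:, j] * b[j]                             # remove coordinate j
            b[j] = soft_threshold(X[:, j] @ r / T, lam) / col_sq[j]
            r -= X[:, j] * b[j]                             # put updated coordinate back
    return b

# Toy check: a sparse signal is recovered with few nonzero coefficients.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = X @ np.array([2.0, 0.0, 0.0, 0.0, 0.0])
b = lasso_cd(X, y, lam=0.1)
```

The penalty both shrinks the active coefficient toward zero and sets the irrelevant ones exactly to zero, which is the variable-selection property exploited throughout the paper.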
smaller set. The same is the case for the Lasso in the VAR, since we have written the VAR as a regression model. In particular we shall make use of the restricted eigenvalue condition

(5)    $\kappa^2_{\Psi_T}(s_i) = \min\left\{ \frac{\delta' \Psi_T \delta}{\|\delta_R\|^2} : |R| \le s_i,\ \delta \in \mathbb{R}^{kp} \setminus \{0\},\ \|\delta_{R^c}\|_{\ell_1} \le 3 \|\delta_R\|_{\ell_1} \right\} > 0$

where $R \subseteq \{1, \ldots, kp\}$ and $|R|$ is its cardinality. Note that instead of minimizing over all of $\delta \in \mathbb{R}^{kp} \setminus \{0\}$ as in (4), the minimum in (5) is restricted to those vectors satisfying $\|\delta_{R^c}\|_{\ell_1} \le 3 \|\delta_R\|_{\ell_1}$. As a result, $\kappa^2_{\Psi_T}(s_i)$ can be positive even when the Rayleigh-Ritz ratio in (4) is zero. Note that whenever $\Psi_T$ has full rank, $\kappa^2_{\Psi_T}(s_i)$ will be positive. Letting $\Gamma = E(\Psi_T) = E(Z_t Z_t')$ denote the population covariance matrix, we similarly define

(6)    $\kappa_i^2 = \kappa^2_{\Gamma}(s_i) = \min\left\{ \frac{\delta' \Gamma \delta}{\|\delta_R\|^2} : |R| \le s_i,\ \delta \in \mathbb{R}^{kp} \setminus \{0\},\ \|\delta_{R^c}\|_{\ell_1} \le 3 \|\delta_R\|_{\ell_1} \right\} > 0.$

We shall assume throughout that $\Gamma$ has full rank, which implies that $\kappa_i^2 > 0$. This is a rather standard assumption which is independent of whether $T > kp$ or not. For more details on the restricted eigenvalue condition we refer to Kock and Callot (2012).

3. Theoretical Results

Letting $w \in \mathbb{R}^n$ denote a set of portfolio weights, the true conditional variance of the portfolio is given by

$\sigma_t^2 = w' \Sigma_t w$

while the estimated variance is

$\hat{\sigma}_t^2 = w' \hat{\Sigma}_t w.$

As a consequence, one might be interested in measuring the precision of $\hat{\Sigma}_t$ by considering how much $\hat{\sigma}_t^2$ and $\sigma_t^2$ deviate from each other. In the presence of an upper bound on the positions one may take, this can be done by bounding $\|\hat{\Sigma}_t - \Sigma_t\|_\infty$. The following theorem makes this claim precise.

Theorem 1. Assume that $\|w\|_{\ell_1} \le c$ for some $c > 0$. Then,

$|\hat{\sigma}_t^2 - \sigma_t^2| \le \|\hat{\Sigma}_t - \Sigma_t\|_\infty \, c^2.$

Hence, in the presence of a restriction on the positions one can take, an upper bound on $\|\hat{\Sigma}_t - \Sigma_t\|_\infty$ implies an upper bound on the distance between $\hat{\sigma}_t^2$ and $\sigma_t^2$. We shall next give an upper bound on $\|\hat{\Sigma}_t - \Sigma_t\|_\infty$.
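The inequality in Theorem 1 is elementary and easy to verify numerically. The following sketch, with an arbitrary simulated covariance matrix and long-only weights (so that $\|w\|_{\ell_1} = 1$), illustrates it:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
A = rng.standard_normal((n, n))
Sigma = A @ A.T                                   # a "true" covariance matrix
E = 0.01 * rng.standard_normal((n, n))
Sigma_hat = Sigma + (E + E.T) / 2                 # a symmetric, slightly perturbed estimate
w = rng.dirichlet(np.ones(n))                     # long-only weights, ||w||_1 = 1 <= c

lhs = abs(w @ Sigma_hat @ w - w @ Sigma @ w)      # |sigma_hat^2 - sigma^2|
rhs = np.max(np.abs(Sigma_hat - Sigma)) * np.sum(np.abs(w)) ** 2
assert lhs <= rhs + 1e-12                         # the bound of Theorem 1 with c = ||w||_1
```

The bound only requires controlling the largest entrywise error of the covariance forecast, which is exactly what Theorem 2 below delivers.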
To this end we define

$\pi_q(s) = 4 k^2 p^2 \exp(-\zeta_T) + 2 (k^2 p^2)^{1 - \log(T)}, \quad \text{where} \quad \zeta_T = \frac{(1-q)^2 \kappa_i^4 \, T}{s_i^2 \log(T)\big(\log(k^2 p^2) + 1\big) \big( \|\Gamma\|_\infty \sum_{i=0}^{T} \|F^i\| \big)^2}.$

With this notation in place we have the following theorem.

Theorem 2. Let $\lambda_T = 8 \sqrt{\ln(1+T)^5 \ln(1+k)^4 \ln(1+p)^2 \ln(k^2 p) \, \sigma_T^4 / T}$ and $0 < q < 1$. Then, with probability at least $1 - 2(k^2 p)^{1 - \ln(1+T)} - 2(1+T)^{-1/A} - \pi_q(\bar{s}) - 2[k(p+1)]^{1 - \ln(T)}$, one has

$\|\hat{\Sigma}_{T+1} - \Sigma_{T+1}\|_\infty \le \sqrt{2 \sigma_T^2 \ln(k(p+1)) \ln(T)} \left( \max_{1 \le i \le k} \frac{16 s_i}{q \kappa_i^2} \lambda_T + 1 \right).$
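For concreteness, the penalty sequence of Theorem 2 can be evaluated numerically. The sketch below simply transcribes the displayed formula for $\lambda_T$ (the function name is ours); note that it shrinks like $1/\sqrt{T}$ up to polylogarithmic factors, so the penalty vanishes as the sample grows even when the cross-section is large.

```python
import math

def lasso_penalty(T, k, p, sigma_T):
    # lambda_T of Theorem 2: polylogarithmic in (T, k, p), shrinking like 1/sqrt(T).
    return 8 * math.sqrt(
        math.log(1 + T) ** 5 * math.log(1 + k) ** 4
        * math.log(1 + p) ** 2 * math.log(k ** 2 * p)
        * sigma_T ** 4 / T
    )

# The penalty shrinks as the sample grows, even with k = 465 equations.
lam_small_T = lasso_penalty(1000, 465, 1, sigma_T=1.0)
lam_large_T = lasso_penalty(10**6, 465, 1, sigma_T=1.0)
```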
Theorem 2 gives an upper bound on the forecast error of $\hat{\Sigma}_{T+1}$ which is valid even in finite samples. Note that even if we knew the true parameter vector $\beta_i^*$ we could never expect a forecast error which tends to zero, since the error terms $\epsilon_{T+1}$ are unforecastable. By combining Theorems 1 and 2 one may obtain the following upper bound on the forecast error of $\hat{\sigma}_{T+1}^2$.

Corollary 1. Under the assumptions of Theorems 1 and 2, one has

$|\hat{\sigma}_{T+1}^2 - \sigma_{T+1}^2| \le \sqrt{2 \sigma_T^2 \ln(k(p+1)) \ln(T)} \left( \max_{1 \le i \le k} \frac{16 s_i}{q \kappa_i^2} \lambda_T + 1 \right) c^2.$

Corollary 1 provides a finite sample upper bound on the error of the forecast of the portfolio variance under a short selling constraint. Note that this short selling constraint is the only restriction we place on the portfolio weights.

4. Computations

A first section describes the practical implementation of the forecasts, the following section discusses variable selection using the Dow Jones data, and the final section presents forecast results.

4.1. Data. The data consist of 437 stocks of the S&P 500, with a total of 1465 daily observations beginning in 2006. The realized covariances are constructed from 5-minute returns. [Thank and cite Asger.] We consider two transformations of the data, both aimed at ensuring that the fitted and forecasted covariance matrices have a positive diagonal after reversing the transformation:

- Log-covariance transformation (lcov): take the logarithm of the variances and do not transform the covariances. This transformation has the effect of smoothing the variance series relative to the covariance series.
- Log-matrix transformation (lmat): compute the matrix logarithm of the covariance matrix. The reverse transformation, the matrix exponential, ensures that the resulting matrix is positive semi-definite, and smooths both diagonal and off-diagonal elements.
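The log-matrix round trip can be sketched as follows. This is a minimal illustration using an eigendecomposition (valid because realized covariance matrices are symmetric), not the paper's R code; the helper names are ours.

```python
import numpy as np

def sym_logm(S):
    # Matrix logarithm of a symmetric positive definite matrix.
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(np.log(vals)) @ vecs.T

def sym_expm(L):
    # Matrix exponential of a symmetric matrix; the result is always
    # symmetric positive definite, whatever the forecast in log space was.
    vals, vecs = np.linalg.eigh(L)
    return vecs @ np.diag(np.exp(vals)) @ vecs.T

S = np.array([[2.0, 0.5],
              [0.5, 1.0]])
L = sym_logm(S)                 # model and forecast the entries of L freely...
S_back = sym_expm(L)            # ...the reverse transform recovers the matrix
S_forecast = sym_expm(0.9 * L)  # any symmetric forecast maps back to a PD matrix
```

This is why the lmat transformation guarantees valid covariance forecasts: positive definiteness holds by construction, regardless of what the VAR predicts in log space.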
The drawback is that, under this transformation, the diagonal and off-diagonal parameters cannot be interpreted as pertaining to variances and covariances, respectively.

4.2. Censoring. The sample we consider covers the financial crisis of 2008 as well as the flash crashes of 2010 and 2011. These events lead to highly correlated returns, which produce many extreme values in the stock return correlation series. The Lasso is fragile to this kind of outlier, since it works under normality assumptions. In practice, we flag for censoring every day on which more than 25% of the upper-diagonal entries of the covariance matrix are more than 4 standard deviations away from their sample average. These observations are replaced by an average of the nearest 5 preceding and 5 following non-flagged observations. Using this censoring, the flagged observations are concentrated in October 2008; the flash crashes of 2010 and 2011 are also flagged.

4.3. Implementation. All the computations are carried out using R and the lassovar package, which is a wrapper for glmnet, an implementation of the coordinate descent algorithm of Friedman et al. (2010). The VARs are estimated equation by equation, using the Bayesian Information Criterion to select the penalty parameter.
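The censoring rule described above can be sketched as follows, operating on the $T \times k$ array of vech'd covariance series. This is an illustrative reading of the rule, not the authors' implementation, and the helper name `censor` is ours.

```python
import numpy as np

def censor(Y, frac=0.25, n_sd=4.0, m=5):
    # Y: (T x k) array of vech'd covariance series, one row per day.
    # Flag days on which more than `frac` of the entries are over `n_sd`
    # sample standard deviations from their mean, then replace each flagged
    # day by the average of the m nearest non-flagged days on each side.
    mu, sd = Y.mean(axis=0), Y.std(axis=0)
    extreme = np.abs(Y - mu) > n_sd * sd          # T x k boolean
    flagged = extreme.mean(axis=1) > frac         # one flag per day
    clean_idx = np.flatnonzero(~flagged)
    Z = Y.copy()
    for t in np.flatnonzero(flagged):
        before = clean_idx[clean_idx < t][-m:]
        after = clean_idx[clean_idx > t][:m]
        Z[t] = Y[np.concatenate([before, after])].mean(axis=0)
    return Z, flagged

rng = np.random.default_rng(0)
Y = rng.standard_normal((100, 10))
Y[50] = 100.0                     # one extreme "crisis" day
Z, flagged = censor(Y)
```

In the toy run, only the extreme day is flagged and its entries are replaced by a local average of ordinary days.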
5. Empirical results

This section reports our empirical results. The first part focuses on the variable selection patterns of the different versions of the Lasso; the second part reports forecast results.

5.1. Variable selection. In this section and the next, we focus on the 30 stocks belonging to the Dow Jones. These stocks can be classified into broad categories, highlighted in Table 1.

Table 1. Number of stocks per category, 30 Dow Jones stocks. Categories: Basic Materials, Communications, Consumer Cyclical, Consumer Non-cyclical, Energy, Financial, Industrial, Technology.

We estimate over 400 models using a training sample with a rolling window of 1000 observations. In the tables below we report the average (across data samples) number of variables from a given category (in rows) selected in equations for stocks belonging to a given category (in columns). The sums are also divided between diagonal (D, the variances) and off-diagonal (O, the covariances) equations and covariates. The five tables below all report results for a VAR(1) estimated by the Lasso or the adaptive Lasso, using OLS or the Lasso as initial estimator. These models are in Tables 2, 4, and 5, using the log-variance transformation on censored data. Table 3 considers the Lasso estimator on log-matrix transformed censored data. Finally, Table 6 considers the Lasso on un-censored data.

Let us consider Table 2 as our benchmark model. The selected model for the diagonal equations is very sparse for most categories. It is striking that the model selected for off-diagonal equations contains many off-diagonal covariates; this is partly due to the large number of potential off-diagonal covariates. When considering the log-matrix transformation in Table 3, the number of selected variables is similar to the benchmark model, except for off-diagonal covariates of off-diagonal equations, where fewer covariates are selected and a clear diagonal pattern emerges.
Using the adaptive Lasso with OLS as first-step estimator (Table 4) results in a model that is sparser than the benchmark model, except again for the off-diagonal covariates of off-diagonal equations, where many more covariates are selected. This is in sharp contrast with the results obtained using the Lasso as initial estimator (Table 5), where the model is overall sparser than the benchmark model. Finally, when considering uncensored data in Table 6, notice that the models for diagonal equations are comparable to those of the benchmark model, whereas the off-diagonal equation models present a very different pattern: very few diagonal covariates are selected, while a very large number of off-diagonal covariates (larger than with any of the other models considered) is selected.

The wide fluctuations in the selection pattern of off-diagonal covariates in off-diagonal equations across models can be explained by the large number of very noisy off-diagonal covariates the Lasso has to select from. Furthermore, large market-wide shocks lead to sharp increases in the covariances of stock returns that are broadly correlated across covariances. These large correlated shocks to the covariances are partially eliminated by censoring, or smoothed by the log-matrix transform, which implies relatively fewer variables selected. Using the Lasso instead
Table 2. Number of variables selected by category. VAR(1) estimated by the Lasso, censored data, the log-variance transformation.

of the OLS leads to a first-step screening, which seems to help the second-step Lasso perform variable selection.
Table 3. Number of variables selected by category. VAR(1) estimated by the Lasso, censored data, the log-matrix transformation.
Table 4. Number of variables selected by category. VAR(1) estimated by the adaptive Lasso using OLS as initial estimator, censored data, the log-variance transformation.
Table 5. Number of variables selected by category. VAR(1) estimated by the adaptive Lasso using the Lasso as initial estimator, censored data, the log-variance transformation.
Table 6. Number of variables selected by category. VAR(1) estimated by the Lasso, un-censored data, the log-variance transformation.

5.2. Forecasts. The forecasts are computed recursively for horizons greater than 1. All forecast errors are computed based on the de-transformed forecasts.
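Recursive (iterated) h-step forecasting simply feeds each one-step forecast back in as data. A sketch for a generic VAR(p) (an illustrative helper of ours, not the lassovar implementation):

```python
import numpy as np

def recursive_forecast(Phi, y_hist, H):
    # Phi: list of p coefficient matrices (k x k); y_hist: observed vectors,
    # most recent last. Returns the H iterated forecasts y_{T+1|T}, ..., y_{T+H|T}.
    p = len(Phi)
    buf = [y_hist[-j] for j in range(1, p + 1)]   # y_T, y_{T-1}, ..., y_{T-p+1}
    out = []
    for _ in range(H):
        y_next = sum(Phi[i] @ buf[i] for i in range(p))
        out.append(y_next)
        buf = [y_next] + buf[:-1]                 # forecasts replace data
    return np.array(out)

Phi = [np.array([[0.5, 0.0],
                 [0.0, 0.5]])]                    # a stable VAR(1)
y_hist = [np.array([1.0, 1.0])]
fc = recursive_forecast(Phi, y_hist, H=3)
```

For a stable VAR the iterated forecasts decay geometrically toward the unconditional mean; conversely, an explosive estimated VAR produces forecasts that diverge with the horizon, which is exactly the instability reported for some OLS-estimated models below.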
We consider 3 levels of aggregation of the data: daily, weekly, and monthly. The daily forecasts are computed using a rolling window of 1000 observations, leading to 437 forecasts. The weekly models are estimated using 263 observations and the monthly models using 60 observations, which results in 52 and 12 forecasts respectively. We forecast with vector autoregressive (VAR) models, autoregressive (AR) models, and random walk (No Change) models on both transformations (lcov, lmat) of the data. The estimators for the VARs are the Lasso, the adaptive Lasso (with OLS or the Lasso as initial estimator), and OLS. The AR models are estimated using OLS only.

Key to the tables. The forecast tables below report a number of statistics for forecasts computed with several models; we detail these statistics below. Let $t := t_0 + h$, where $t_0$ is the last observation and $h$ the horizon, and let $\hat{\epsilon}^h_{t_0}$ be the vector of forecast errors at horizon $h$ forecasted from time $t_0$.

Primary column header.
- beat bmk: frequency at which the absolute forecast error of a given model is lower than the corresponding absolute forecast error of the benchmark. The benchmark model is the one for which this statistic is reported as NA.
- ptf risk: the difference between the forecasted and realized risk of an equal-weight portfolio: $\mathrm{rsk}^h_{t_0} := w' \Sigma_{t_0+h} w - w' \hat{\Sigma}_{t_0+h} w$.
- Buy&Hold: the cumulative ptf risk, $\mathrm{bh}^H_{t_0} := \sum_{h=1}^{H} \mathrm{rsk}^h_{t_0}$.
- Frobenius: the Frobenius norm of the forecast error, $\sqrt{\sum_{i,j=1}^{N} (\sigma_{ij,t_0+h} - \hat{\sigma}_{ij,t_0+h})^2}$.
- Med SFE: median square forecast error, $\mathrm{MedSFE}^h = \frac{1}{T}\sum_{t_0=1}^{T} \mathrm{med}\big((\hat{\epsilon}^h_{t_0})^2\big)$.
- RMSFE: $\mathrm{RMSFE}^h = \frac{1}{T}\sum_{t_0=1}^{T} \sqrt{\mathrm{mean}\big((\hat{\epsilon}^h_{t_0})^2\big)}$.
- MaxSFE: $\mathrm{MaxSFE}^h = \frac{1}{T}\sum_{t_0=1}^{T} \max\big((\hat{\epsilon}^h_{t_0})^2\big)$.

Secondary column header. h: the forecast horizon. A: the full matrix. O: off-diagonal. D: diagonal.

Colors. Green: No Change forecasts. Blue: censored data. Red: autoregressive models estimated by OLS.

Interpretation of the tables.
Tables 7 and 8 report results for No Change forecasts, forecasts from VAR(1) models estimated using the Lasso, the adaptive Lasso, and OLS, and AR forecasts estimated by OLS. The models are evaluated on censored and un-censored data, using either the lmat or lcov transformations. Note that for No Change forecasts the results are identical for both transformations, since the errors are based on de-transformed forecasts.

The first striking result is that models estimated on uncensored data tend to be explosive with the lcov transformation but not with the lmat transformation. VARs estimated by OLS tend to be explosive even using censored data with the lcov transformation. This is further evidence that both OLS and the Lasso are sensitive to extreme observations. When the data is smoother, as is the case with
Table 7. Summary statistics, daily forecasts, 1000-observation training sample, h-step ahead recursive forecasts. All statistics are averaged across forecast iterations.

censoring and with the lmat transformation, the models are stable and often outperform the No Change benchmark.

Table 9 reports results for weekly aggregated data. At this frequency, stability of the VARs is no longer an issue, and in this setting the Lasso and its variants consistently outperform the No Change forecasts.
In particular, models estimated using the lcov transformation provide the most accurate forecasts of the covariance matrix, resulting in less risky portfolios even over longer horizons. At a monthly level of aggregation (results in Table 10), both transformations are equivalent and dominate (though not uniformly) the No Change forecasts. Note that at both levels of aggregation, the small number of observations available relative to the number of parameters of the unrestricted model renders OLS infeasible.
Table 8. Summary statistics, daily forecasts, 1000 training observations, h-step ahead recursive forecasts. All statistics are averaged across forecast iterations.

6. Conclusions and Further Work

In this paper we considered modeling and forecasting large realized covariance matrices. Our approach was based on the estimation of a large vector autoregressive model by the Least Absolute Shrinkage and Selection Operator, which simultaneously shrinks irrelevant parameters towards zero and conducts variable selection. We thereby avoided problems related to the curse of dimensionality which abound in the related literature. We also derived upper bounds for the forecast error.
In an empirical application focused on 30 stocks of the Dow Jones Industrial Average, we evaluated the performance of the Lasso and the adaptive Lasso at different levels of aggregation. Compared to random walk forecasts and OLS forecasts (when feasible), our empirical application shows that our methodology is promising, in that it provides better forecasts than the benchmarks even at long horizons.
Table 9. Summary statistics for weekly aggregated data, 263 training observations, h-step ahead recursive forecasts. All statistics are averaged across forecast iterations.

7. Appendix

Proof of Theorem 1. By the definitions of $\sigma_t^2$ and $\hat{\sigma}_t^2$ one has

$|\hat{\sigma}_t^2 - \sigma_t^2| = |w'(\hat{\Sigma}_t - \Sigma_t)w| \le \|(\hat{\Sigma}_t - \Sigma_t)w\|_{\ell_\infty} \|w\|_{\ell_1} \le \|\hat{\Sigma}_t - \Sigma_t\|_\infty \|w\|_{\ell_1}^2 \le \|\hat{\Sigma}_t - \Sigma_t\|_\infty c^2.$

Before proving Theorem 2 we recall the following result, which is an extract of Theorem 2 in Kock and Callot (2012).

Lemma 1 (Theorem 2 in Kock and Callot (2012)). Let $\lambda_T = 8 \sqrt{\ln(1+T)^5 \ln(1+k)^4 \ln(1+p)^2 \ln(k^2 p) \, \sigma_T^4 / T}$ and $0 < q < 1$. Then, with
Table 10. Summary statistics for monthly aggregated data, 60 training observations, h-step ahead recursive forecasts. All statistics are averaged across forecast iterations.

probability at least $1 - 2(k^2 p)^{1 - \ln(1+T)} - 2(1+T)^{-1/A} - \pi_q(s_i)$, the following inequality holds for all $i = 1, \ldots, k$, for some positive constant $A$:

(7)    $\|\hat{\beta}_i - \beta_i^*\|_{\ell_1} \le \frac{16 s_i}{q \kappa_i^2} \lambda_T.$

Furthermore, all the above statements hold on one and the same set, which has probability at least $1 - 2(k^2 p)^{1 - \ln(1+T)} - 2(1+T)^{-1/A} - \pi_q(\bar{s})$.

Proof of Theorem 2. Since

$\|\hat{\Sigma}_{T+1} - \Sigma_{T+1}\|_\infty = \|\operatorname{vech} \hat{\Sigma}_{T+1} - \operatorname{vech} \Sigma_{T+1}\|_{\ell_\infty} = \|\hat{y}_{T+1|T} - y_{T+1}\|_{\ell_\infty},$

we shall bound each entry of $\hat{y}_{T+1|T} - y_{T+1}$. By assumption

$y_{T+1,i} = Z_{T+1}' \beta_i^* + \epsilon_{T+1,i}$

while

$\hat{y}_{T+1|T,i} = Z_{T+1}' \hat{\beta}_i$

such that

$|y_{T+1,i} - \hat{y}_{T+1|T,i}| = |Z_{T+1}'(\beta_i^* - \hat{\beta}_i) + \epsilon_{T+1,i}| \le \|Z_{T+1}\|_{\ell_\infty} \|\hat{\beta}_i - \beta_i^*\|_{\ell_1} + |\epsilon_{T+1,i}|.$

Using Lemma 1 this yields

$|y_{T+1,i} - \hat{y}_{T+1|T,i}| \le \frac{16 s_i}{q \kappa_i^2} \|Z_{T+1}\|_{\ell_\infty} \lambda_T + |\epsilon_{T+1,i}| \quad \text{for all } i = 1, \ldots, k$
17 ESTIMATION AND FORECASTING LARGE REALIZED COVARIANCE MATRICES 17 with probability at least 1 2(k 2 p) 1 ln(1+t ) 2(1 + T ) 1/A π q ( s). Next, note that by the gaussianity of the covariates and error terms P ( y T l,i x) 2e x2 /2σT 2 for all 1 i k and 1 l p and P ( ɛ T +1,i x) 2e x2 /2σT 2 for all 1 i k. This implies P ( Z T +1 l max 1 i k ɛ T +1,i L) 2kpe L2 /2σ 2 T + 2ke L 2 /2σ 2 T = 2k(p + 1)e L 2 /2σ 2 T Choosing L 2 = 2σT 2 ln(k(p + 1)) ln(t ) yields (8) P ( Z T +1 l max 1 i k ɛ T +1,i L) 2[k(p + 1)] 1 ln(t ) and so ( yt +1,i ŷ T +1 T,i 16 ) 2σT 2 ln(k(p + 1)) ln(t ) s qκ 2 i λ T + 1 for all i = 1,..., k i with probability at least 1 2(k 2 p) 1 ln(1+t ) 2(1+T ) 1/A π q ( s) 2[k(p+1)] 1 ln(t ). References Bickel, P. J., Y. Ritov, and A. B. Tsybakov (2009). Simultaneous analysis of lasso and dantzig selector. The Annals of Statistics 37 (4), Bühlmann, P. and S. Van De Geer (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer-Verlag, New York. Friedman, J., T. Hastie, and R. Tibshirani (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33 (1), Kock, A. and L. Callot (2012). Oracle inequalities for high dimensional vector autoregressions. Aarhus University, CREATES Research Paper 16. Meinshausen, N. and P. Bühlmann (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics 34, Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), Zhao, P. and B. Yu (2006). On model selection consistency of lasso. The Journal of Machine Learning Research 7,
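The chain of inequalities in the proof of Theorem 1 is easy to verify numerically. The sketch below is illustrative only and not part of the paper: the dimension, perturbation size, and random seed are arbitrary choices. It draws a weight vector $w$, sets $c = \|w\|_{\ell_2}$ so the norm constraint holds by construction, and checks each step of the bound on $|\hat{\sigma}_t^2 - \sigma_t^2|$, with $\|\cdot\|_{\ell_\infty}$ taken entrywise as in the proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10  # number of assets (arbitrary for illustration)

# A "true" covariance matrix Sigma and a perturbed estimate Sigma_hat.
A = rng.standard_normal((n, n))
Sigma = A @ A.T / n
Sigma_hat = Sigma + 0.01 * rng.standard_normal((n, n))
Sigma_hat = (Sigma_hat + Sigma_hat.T) / 2  # keep the estimate symmetric

w = rng.standard_normal(n)     # portfolio weights
c = np.linalg.norm(w)          # so that ||w||_2 <= c holds by construction
D = Sigma_hat - Sigma

# |sigma_hat^2 - sigma^2| = |w' D w|, then each successive upper bound:
err = abs(w @ D @ w)
step1 = np.max(np.abs(D @ w)) * np.sum(np.abs(w))   # ||Dw||_inf * ||w||_1
step2 = np.max(np.abs(D)) * np.sum(np.abs(w)) ** 2  # ||D||_inf * ||w||_1^2
step3 = n * np.max(np.abs(D)) * np.linalg.norm(w) ** 2  # n ||D||_inf ||w||_2^2
step4 = n * c ** 2 * np.max(np.abs(D))                  # n c^2 ||D||_inf

assert err <= step1 <= step2 <= step3
assert np.isclose(step3, step4)  # equal here because c = ||w||_2 exactly
print(f"error {err:.3e} <= final bound {step4:.3e}")
```

Each intermediate bound is looser than the previous one, which is the point of the theorem: the final quantity depends on the estimation error only through the entrywise maximum norm of $\hat{\Sigma}_t - \Sigma_t$, the object the Lasso oracle inequalities control even when the number of assets is large.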
More information