Constructed Analogs and Linear Regression
JULY 2013 · TIPPETT AND DELSOLE · 2519

MICHAEL K. TIPPETT
International Research Institute for Climate and Society, Columbia University, Palisades, New York, and Center of Excellence for Climate Change Research, Department of Meteorology, King Abdulaziz University, Jeddah, Saudi Arabia

TIMOTHY DELSOLE
George Mason University, Fairfax, Virginia, and Center for Ocean-Land-Atmosphere Studies, Calverton, Maryland

(Manuscript received 6 August 2012, in final form 25 November 2012)

ABSTRACT

The constructed analog procedure produces a statistical forecast that is a linear combination of past predictand values. The weights used to form the linear combination depend on the current predictor value and are chosen so that the linear combination of past predictor values approximates the current predictor value. The properties of the constructed analog method have previously been described as being distinct from those of linear regression. However, here the authors show that standard implementations of the constructed analog method give forecasts that are identical to linear regression forecasts. A consequence of this equivalence is that constructed analog forecasts based on many predictors tend to suffer from overfitting, just as in linear regression. Differences between linear regression and constructed analog forecasts result only from implementation choices, especially ones related to the preparation and truncation of data. Two particular constructed analog implementations are shown to correspond to principal component regression and ridge regression. The equality of linear regression and constructed analog forecasts is illustrated in a Niño-3.4 prediction example, which also shows that increasing the number of predictors results in low-skill, high-variance forecasts, even at long leads, behavior typical of overfitting.
Alternative definitions of the analog weights lead naturally to nonlinear extensions of linear regression, such as local linear regression.

1. Introduction

A general prediction problem is to find the best estimate of a quantity y given a related quantity x. We refer to the vectors y and x as the predictand and predictor, respectively. Examples of typical earth science prediction problems are as follows: x is the current sea surface temperature and y is its future state (Penland and Magorian 1993); x is a prescribed CO2 concentration and y is global surface temperature (Krueger and Von Storch 2011); x is a large-scale climate feature and y is an associated small-scale climate feature (Robertson et al. 2012). In principle, the probability distribution of y for a particular value of the predictor x = x_0 (the conditional distribution) can be computed from physical laws or estimated from data. In either case, the mean of that distribution (the conditional mean) is the best forecast in the sense of minimizing the expected squared error. When x and y have a joint Gaussian distribution, the best forecast, as well as its uncertainty, is given by linear regression (LR).

The idea of conditional averaging is also found in the constructed analog (CA) method (Van den Dool 1994, 2006), a statistical forecast method that has been applied to a variety of geophysical problems (e.g., Van den Dool et al. 2003; Maurer and Hidalgo 2008; Hawkins et al. 2011). A prediction y_CA is made for a particular value of the predictor x = x_0 by searching through historical data for values of y corresponding to values of x that are close to x_0, so-called analogs.

Corresponding author address: M. K. Tippett, International Research Institute for Climate and Society, The Earth Institute of Columbia University, Lamont Campus, 61 Route 9W, Palisades, NY. E-mail: tippett@iri.columbia.edu
The CA method expresses the current predictor state x_0 as a weighted linear combination of past states and makes a prediction by applying those same weights to the corresponding values of y, an averaging procedure reminiscent of the conditional mean.

© 2013 American Meteorological Society

The CA has previously been described as differing from LR in two fundamental ways. First, it has been
MONTHLY WEATHER REVIEW · VOLUME 141 · 2520

claimed that, by making no assumption of a linear relation between predictor and predictand, CA captures nonlinearity. Second, it has been claimed that, since CA is not based on minimizing the mean squared error of the predictions, there is no danger of overfitting. Here we show that typical implementations of CA do not have these properties and that, in fact, CA forecasts are identical to LR forecasts.

The paper is organized as follows. In section 2 we review the least squares problems that arise in the formulations of LR and CA, and use the matrix pseudoinverse to show that simple (without predictor truncation or regularization) implementations of the two methods give identical forecasts. In section 3, we identify situations where the simple implementation overfits the data and show that a recommended CA implementation is the same as principal component regression. In section 4, we show that another common CA implementation corresponds to ridge regression. In section 5, we show that LR and CA predictions of the Niño-3.4 index are identical and may have large variance even at long leads. In section 6, we present and illustrate some nonlinear regression methods that follow naturally from modifications to CA. A summary and discussion are given in section 7.

2. Linear regression, constructed analogs, and pseudoinverses

We use the following matrix notation for the training data. Let X be the N_x × N_t matrix of predictor data; N_x is the number of predictor variables and N_t is the number of time samples. Each column of X contains the predictor variables (x) at a particular time; each row of X contains the time series of a particular predictor variable. Likewise, let Y be the N_y × N_t matrix of predictand data; N_y is the number of predictand variables. Let x_0 be the N_x × 1 column vector of predictor variables to be used in a forecast. We assume that predictors and predictands are expressed as anomalies.
More generally, a row of ones can be included in X to account for an intercept term.

Linear regression finds the N_y × N_x matrix A of regression coefficients such that the norm of the residuals

||Y − AX||^2    (1)

is minimized. The notation ||·||^2 denotes the square of the Frobenius norm, which is the sum of the squares of the entries of the matrix or vector to which it is applied. The linear regression forecast y_LR is

y_LR = A x_0.    (2)

Practically, computing the matrix of regression coefficients by direct minimization of (1) may be ill posed (there is no unique solution when N_x > N_t) or ill advised (overfitting can lead to poor performance on independent data when N_x is comparable to N_t).

The CA method also involves a linear least squares minimization problem. In the CA method, x_0 is expressed as a weighted sum of past states (columns of X), and a prediction is formed by applying those same weights to the columns of Y. Specifically, CA finds the N_t × 1 column vector a of weights that minimizes

||x_0 − Xa||^2,    (3)

and then makes a prediction y_CA by applying those weights to the columns of Y:

y_CA = Ya.    (4)

The linear least squares problems appearing in the formulations of LR and CA appear quite different. For instance, the matrix A of LR coefficients multiplies the data on the left to combine different predictors, while the vector a of CA weights multiplies the data on the right to combine different times. Also, CA involves fitting x_0, while LR fits Y. One of the least squares problems is always underdetermined and the other overdetermined, unless N_x = N_t. We will use the pseudoinverse of the data matrix X to solve both linear least squares problems and show that the resulting LR and CA forecasts are identical. In particular, (1) is minimized by

A = YX^+,    (5)

where X^+ is the pseudoinverse of X, a quantity that we will define and discuss later (Hansen 1998). When N_x >
N_t, (1) is underdetermined, its minimizer is not unique, and A = YX^+ is the minimizer with the minimum value of ||A||^2. The pseudoinverse commutes with the transpose in that (X^+)^T = (X^T)^+. For this reason, the minimizer of (3) can also be expressed using the pseudoinverse:

a = X^+ x_0.    (6)

When N_x < N_t, (3) is underdetermined, and a = X^+ x_0 is the minimizer with minimum norm. We refer to these direct minimizing solutions as providing simple implementations of LR and CA. Substituting the simple minimizers of (5) and (6) into the definitions of the LR and CA predictions, (2) and (4), respectively, we see that

y_LR = A x_0 = Y X^+ x_0 = Y a = y_CA.    (7)
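The identity in (7) is easy to verify numerically. The following sketch (not from the paper; the dimensions and random data are purely illustrative) builds a training set and compares the simple LR and CA forecasts, both computed with the pseudoinverse:

```python
# Compare simple LR (A = Y X^+, Eq. 5) and simple CA (a = X^+ x0, Eq. 6)
# forecasts on illustrative random data.
import numpy as np

rng = np.random.default_rng(0)
Nx, Ny, Nt = 8, 3, 20               # predictors, predictands, time samples
X = rng.standard_normal((Nx, Nt))   # columns are past predictor states
Y = rng.standard_normal((Ny, Nt))   # columns are past predictand states
x0 = rng.standard_normal(Nx)        # current predictor state

A = Y @ np.linalg.pinv(X)           # LR coefficients, Eq. (5)
y_lr = A @ x0                       # LR forecast, Eq. (2)

a = np.linalg.pinv(X) @ x0          # CA weights, Eq. (6)
y_ca = Y @ a                        # CA forecast, Eq. (4)

print(np.allclose(y_lr, y_ca))      # True: the two forecasts coincide, Eq. (7)
```

The equality follows simply from the associativity of the matrix products in (7); only the order in which Y, X^+, and x_0 are combined differs between the two methods.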
Remarkably, the simple linear regression and constructed analog predictions are identical.

3. Connection to principal component regression

While the simple LR implementation does solve the least squares problem and find the best fit to the data, it does so using all of the predictors. Such an approach is ill advised when the number of predictors is comparable to the number of samples, since overfitting may result in poor predictions on independent data. To see this point more clearly, let us return to the matter of actually defining the pseudoinverse. The pseudoinverse of X is defined using its singular value decomposition (SVD):

X = USV^T,    (8)

where U and V are orthogonal square matrices of size N_x × N_x and N_t × N_t, respectively, and S is a diagonal N_x × N_t matrix with nonnegative entries (Golub and Van Loan 1996). The so-called economical SVD is

X = U_r S_r V_r^T,    (9)

where U_r and V_r retain the columns of U and V, respectively, corresponding to the nonzero diagonal elements of S, and the elements of the square diagonal matrix S_r are strictly positive; the number r of positive diagonal entries of S is at most min(N_x, N_t − 1) for anomaly data. The pseudoinverse of X is defined to be

X^+ = V_r S_r^{-1} U_r^T.    (10)

The matrix S_r is square with positive diagonal entries and is thus invertible. Therefore, the simple LR and CA forecasts are

y_LR = y_CA = Y X^+ x_0 = Y V_r S_r^{-1} U_r^T x_0.    (11)

In the language of principal component analysis (PCA), the columns of U_r S_r/√N_t and √N_t V_r are the empirical orthogonal functions (EOFs) and principal components (PCs), respectively, of the anomaly data X. The factors of √N_t serve to normalize the PCs to have unit variance, since the columns of V_r are unit vectors with zero mean. Principal component regression (PCR) arises from taking the PCs as predictors rather than the original data in X.
If we were to use all of the PCs as predictors [simple PCR (SPCR)], we would find the matrix A_SPCR of regression coefficients that minimizes

||Y − A_SPCR √N_t V_r^T||^2.    (12)

This linear least squares problem can be solved by finding the pseudoinverse of √N_t V_r^T, which is V_r/√N_t. Therefore,

A_SPCR = Y V_r / √N_t.    (13)

The simple PCR forecast y_SPCR is obtained by applying A_SPCR to the PC amplitudes of x_0, which are √N_t S_r^{-1} U_r^T x_0. Therefore,

y_SPCR = A_SPCR √N_t S_r^{-1} U_r^T x_0 = Y V_r S_r^{-1} U_r^T x_0 = Y X^+ x_0 = y_LR = y_CA.    (14)

Therefore, the LR and CA forecasts with the simple minimizers are the same as the simple PCR forecast, which uses all of the PCs as predictors. Such an approach overfits the data and has poor prediction skill on independent data unless the number of samples is substantially larger than the number of predictors.

To obtain more robust CA weights in the case where the number N_x of predictors is comparable to or exceeds the number N_t of samples, Van den Dool (2006) proposed projecting x_0 and X onto a truncated set of EOFs. We use the tilde notation to denote such a truncation and the subscript k to denote the SVD truncated to the k leading EOFs:

X̃ = U_k S_k V_k^T  and  x̃_0 = U_k U_k^T x_0.

Computing the CA weights ã with the truncated data gives

ã = X̃^+ x̃_0 = V_k S_k^{-1} U_k^T x̃_0,    (15)

and applying them to Y gives as prediction

ỹ_CA = Y ã = Y V_k S_k^{-1} U_k^T x̃_0 = Ã_PCR √N_t S_k^{-1} U_k^T x̃_0,    (16)

where Ã_PCR = Y V_k/√N_t, and √N_t S_k^{-1} U_k^T x̃_0 are the (truncated) PC amplitudes of x_0. From the previous discussion leading to (14), we recognize (16) as the PCR forecast based on the truncated set of PCs. Computing CA weights with data projected onto a truncated set of EOFs gives the same forecast as PCR using the same truncated set of PCs.
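The equivalence in (16) can also be checked numerically. In this sketch (illustrative random data, not the paper's code; k denotes the number of retained EOFs), the CA weights computed from EOF-truncated data reproduce the PCR forecast built from the same k PCs:

```python
# CA with EOF-truncated data (Eq. 15) vs. PCR on the same k PCs (Eq. 16).
import numpy as np

rng = np.random.default_rng(1)
Nx, Ny, Nt, k = 10, 2, 15, 4
X = rng.standard_normal((Nx, Nt))
Y = rng.standard_normal((Ny, Nt))
x0 = rng.standard_normal(Nx)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
Uk, sk, Vkt = U[:, :k], s[:k], Vt[:k, :]        # truncated SVD factors

# CA on truncated data: X~ = Uk Sk Vk^T and x0~ = Uk Uk^T x0
a = Vkt.T @ np.diag(1.0 / sk) @ Uk.T @ (Uk @ Uk.T @ x0)
y_ca = Y @ a

# PCR with the same k PCs: regress Y on the unit-variance PCs sqrt(Nt) Vk^T
pcs = np.sqrt(Nt) * Vkt                         # k x Nt
A_pcr = Y @ pcs.T / Nt                          # least squares: pcs @ pcs.T = Nt * I
y_pcr = A_pcr @ (np.sqrt(Nt) / sk * (Uk.T @ x0))  # apply to PC amplitudes of x0

print(np.allclose(y_ca, y_pcr))                 # True
```

Since U_k^T U_k = I, the projection U_k U_k^T applied to x_0 drops out of the CA weight calculation, which is exactly why the two forecasts agree.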
The choice of the number of PCs to use in the calculation of the CA weights has exactly the same effect on the forecast as the choice of the number of PCs to use in PCR. In both cases, using too many PCs leads to overfitting.
4. Connection to ridge regression

Another approach to the linear least squares problems (1) and (3) that appear in the formulations of LR and CA is ridge regression, also known as Tikhonov regularization. The regularized solutions of (1) and (3) are

A_δ = Y X^T (X X^T + δI)^{-1}    (17)

and

a_δ = (X^T X + δI)^{-1} X^T x_0,    (18)

respectively, where I is the appropriately sized identity matrix and the ridge parameter δ is a positive scalar (Hansen 1998). The regularized solutions are well defined irrespective of the parameters N_x and N_t. The matrix A_δ is precisely that used in ridge regression, and Van den Dool (2006) suggested using a_δ in CA. Remarkably, the resulting forecasts y_LR,δ and y_CA,δ are identical:

y_CA,δ = Y a_δ = Y (X^T X + δI)^{-1} X^T x_0 = Y X^T (X X^T + δI)^{-1} x_0 = A_δ x_0 = y_LR,δ,    (19)

where we have used the push-through matrix identity (X^T X + δI)^{-1} X^T = X^T (X X^T + δI)^{-1}. Use of ridging in computing the CA weights or in computing the LR coefficients results in identical forecasts.

The ridge regression solution is directly related to the pseudoinverse-based solution, since an equivalent definition of the pseudoinverse is

X^+ = lim_{δ→0} (X^T X + δI)^{-1} X^T = lim_{δ→0} X^T (X X^T + δI)^{-1}.    (20)

Consequently,

lim_{δ→0} y_CA,δ = lim_{δ→0} y_LR,δ = y_CA = y_LR.    (21)

The ridge regression forecast in the limit of δ going to zero is the same as the LR or CA forecast with a simple minimizer. This result is consistent with the interpretation of ridge regression as solving the least squares problems subject to a constraint on the size of the solution (DelSole 2007).

5. Example: Niño-3.4 prediction

A typical application of CA and LR is the prediction of the Niño-3.4 index (Van den Dool 2006).
We consider forecasts made at the beginning of July and take as predictors the gridded April-June sea surface temperature (SST) anomaly in the region from 40°S to 40°N from the extended reconstructed SST (ERSST) dataset, version 3b (Smith and Reynolds 2004). The historical data used to form X come from a 49-yr period, and the anomalies are computed with respect to the same period. The predictand y is the 3-month average of the Niño-3.4 anomaly (computed with respect to a fixed base period), taken from the extended Kaplan dataset (Kaplan et al. 1998), at leads extending to lead 22; denoting July-September 2005 as the zero-month lead forecast, lead 22 is April-June 2007. Here the initial condition x_0 is the April-June 2005 SST anomaly, and y consists of the Niño-3.4 index from April-June 2005 to April-June 2007, 25 leads. Forecasts are made based on varying numbers of area-weighted EOFs; no ridging is used.

Figure 1 shows that CA and PCR forecasts based on the same number of EOFs are identical. On the other hand, forecasts based on different numbers of EOFs can vary greatly. Forecasts using 10 EOFs show little variability, while those with 25 or more show considerable variability. This particular set of forecasts verifies well against observations out to a lead of nearly two years. The skill of forecasts made in July for the following March-May (lead 8) was computed using the entire dataset and using leave-one-out cross validation (CV) applied to the LR coefficients and CA weights; the PCs were computed using the full dataset. The CV skill of the 10-EOF forecasts is the highest, and as the number of EOFs increases, the resulting forecasts have lower CV skill and greater variance (Table 1). On the other hand, the in-sample correlation increases as the number of EOFs increases, and the in-sample ratio of forecast to climatological variance is equal to the square of the in-sample correlation.
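As a sketch of the cross-validation procedure used here (with synthetic data standing in for the SST predictors and the Niño-3.4 predictand; all names are illustrative), leave-one-out CV re-estimates the LR coefficients with each time sample withheld and then forecasts the withheld sample:

```python
# Leave-one-out cross validation of LR coefficients on synthetic data.
import numpy as np

rng = np.random.default_rng(4)
Nx, Nt = 5, 40
X = rng.standard_normal((Nx, Nt))              # predictors (e.g., PC time series)
beta = rng.standard_normal(Nx)
y = beta @ X + 0.5 * rng.standard_normal(Nt)   # scalar predictand with noise

y_cv = np.empty(Nt)
for t in range(Nt):
    keep = np.delete(np.arange(Nt), t)         # withhold time sample t
    A = y[keep] @ np.linalg.pinv(X[:, keep])   # re-estimate LR coefficients
    y_cv[t] = A @ X[:, t]                      # forecast the withheld sample

# CV correlation skill and forecast-to-climatological variance ratio,
# the two diagnostics reported in Table 1
r_cv = np.corrcoef(y_cv, y)[0, 1]
var_ratio = y_cv.var() / y.var()
print(round(r_cv, 2), round(var_ratio, 2))
```

With many predictors relative to samples, the same loop exhibits the overfitting symptoms described in the text: the in-sample fit stays strong while r_cv drops and var_ratio grows.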
The variance of the cross-validated forecasts is greater than the climatological variance when 25 or more EOFs are used. The reason for this behavior is that the in-sample explained variance and the variance of the regression coefficient estimates, both of which are increasing functions of the number of predictors, contribute to the variance of the cross-validated forecasts. The behavior of the CV forecasts, especially those with more than 10 EOFs, is consistent with overfitting: the in-sample skill is substantially greater than the CV skill, and the CV skill is inconsistent with the variance of the forecasts.

6. Nonlinear CA

Our demonstration that CA forecasts are identical to LR forecasts depends on the weights being defined as the solution of the least squares problem in (3), and a particular solution being chosen in the underdetermined
TABLE 1. Skill and ratio of forecast to climatological variance of in-sample and leave-one-out cross-validated (CV) forecasts made at the beginning of July for the following March-May average (lead 8) of the Niño-3.4 index. Columns: number of EOFs; correlation (in sample); correlation (CV); ratio of forecast to climatological variance (in sample); ratio of forecast to climatological variance (CV). (Table entries not recovered in this transcription.)

FIG. 1. Constructed analog (CA) and principal component regression (PCR) forecasts along with observations (obs) of the three-month-average Niño-3.4 index. Forecasts are made at the beginning of July and extend through April-June 2007. The numbers in the legend indicate the number of EOFs retained.

case. Other characterizations of the weights lead to quite different methods. Before considering other methods of computing weights, we examine the properties of the CA weights in more detail, focusing on the case when N_x < N_t. In this case, (3) does not have a unique solution, and using the pseudoinverse or ridge regression selects a particular solution for the weights. The simple minimizer weights are

a = V_r S_r^{-1} U_r^T x_0.    (22)

The form of (22) means that, for any x_0, the vector a of weights is a linear combination of the columns of V_r. Since the columns of V_r span the same linear space as the rows of X, the weights are a linear combination of the rows of X. In other words, for some N_x × 1 vector b,

a = X^T b;    (23)

in particular, b = U_r S_r^{-2} U_r^T x_0, and in the case that the predictors are PCs, b = x_0. Equation (23) means that the weights, viewed as a function of the data, lie on a hyperplane perpendicular to the (N_x + 1) × 1 vector [b^T, −1]^T. Because the weights are linear functions of the data, data with values near x_0 do not receive the largest weights, nor do data far from x_0 receive the smallest weights. The CA weights do not measure the distance of x_0 to the training data values.
In particular, if x_0 is a natural analog and has the same values as a column of X, the weights are not concentrated on that column of X. Modifying the definition of the CA weights so that they are a function of the distance between the data and x_0 results in nonlinear statistical prediction algorithms with weights that depend nonlinearly on the data. Importantly, in the case when N_x < N_t, such a modification requires neither changing the least squares problem in (3) nor the forecast equation in (4), but rather involves constructing alternative solutions to (3), that is, ones without the constraint that ||a||^2 be minimized.

For instance, in the k-nearest neighbors (KNN) algorithm, the elements of the weight vector are all zero except for those corresponding to the k columns of X that are closest to x_0, which have value 1/k (Hastie et al. 2009). Explicitly, the ith KNN weight is

a_KNN,i = 1/k  if x_i ∈ C_k(x_0),  and  0 otherwise,    (24)

where C_k(x_0) is the set of k columns of X nearest to x_0. The KNN prediction is the average of the columns of Y corresponding to the k columns of X nearest to x_0. Kernel methods generalize KNN by using weights that are a smoothly decreasing function of the distance between the columns of X and x_0. In particular, the ith kernel smoother (KS) weight is

a_KS,i = K(x_i, x_0, λ) / Σ_{j=1}^{N_t} K(x_j, x_0, λ),    (25)

where the kernel function K(x, x_0, λ) is a smoothly decreasing, positive function of the distance between x and x_0, and λ is a parameter that determines how quickly the kernel function decreases to zero. Local linear regression (LLR) is another kernel method and computes the weights using generalized least squares, with data close to x_0 receiving more emphasis. Specifically,

a_LLR = W^{1/2} (X W^{1/2})^+ x_0,    (26)

where W is an N_t × N_t diagonal matrix that depends on x_0 and whose ith diagonal entry is

W_ii = K(x_i, x_0, λ).    (27)
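A minimal sketch of the alternative weight definitions in (24) and (25), using a Gaussian kernel and illustrative data (the function names and parameter values are ours, not from the paper):

```python
# KNN weights (Eq. 24) and Gaussian kernel smoother weights (Eq. 25).
import numpy as np

def knn_weights(X, x0, k):
    """Eq. (24): weight 1/k on the k columns of X nearest to x0, 0 elsewhere."""
    d = np.linalg.norm(X - x0[:, None], axis=0)   # distance of each column to x0
    a = np.zeros(X.shape[1])
    a[np.argsort(d)[:k]] = 1.0 / k
    return a

def gaussian_ks_weights(X, x0, lam):
    """Eq. (25) with a Gaussian kernel of width lam."""
    d2 = np.sum((X - x0[:, None]) ** 2, axis=0)
    kern = np.exp(-d2 / (2.0 * lam ** 2))
    return kern / kern.sum()                      # normalize so weights sum to 1

rng = np.random.default_rng(3)
X = rng.standard_normal((1, 30))                  # 30 scalar training predictors
x0 = np.array([0.2])

a_knn = knn_weights(X, x0, k=5)
a_gks = gaussian_ks_weights(X, x0, lam=0.35)
print(a_knn.sum(), a_gks.sum())                   # each sums to 1 (up to rounding)
```

Unlike the CA weights of (22), both sets of weights are concentrated on the training samples nearest to x_0, and either can be substituted directly into the forecast equation (4).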
We applied these methods to 30 samples of univariate data generated by

y = x + 0.8x^3 + ε,    (28)

where x and ε are Gaussian distributed with mean zero and unit variance. A row of ones is included in X to account for a possible intercept term. Figure 2a shows that the LR/CA fit fails to capture the nonlinear relation. The KNN fit with k = 5 is noisy and piecewise constant with discontinuities. A Gaussian kernel smoother (GKS) with a standard deviation of 0.35 and LLR (with the same Gaussian kernel) give similar results, with LLR showing an advantage near the boundaries of the data. It is important to note that the performance of KNN depends on the choice of k, while the performance of the GKS and LLR depends on the kernel parameter λ. Here we have selected fairly arbitrary values for these parameters that give good performance. However, like the regression coefficients, these parameters should be chosen objectively in a way that avoids overfitting.

FIG. 2. (a) Data (plus signs) generated by (28), fit by linear regression (LR)/constructed analog (CA), k-nearest neighbors (KNN), a Gaussian kernel smoother (GKS), and local linear regression (LLR). The truth curve is the expected value of y given x. (b) The CA, KNN, GKS, and LLR weights for a particular value of x_0. The LLR weights are divided by 4 for display purposes.

The CA, KNN, GKS, and LLR weights are quite different for x_0, as shown in Fig. 2b. The sum of the weights is one for all methods due to the intercept term. A clear feature of the CA weights is that they are a linear function of the data values and display no maximum near x_0. This behavior is general, as discussed earlier. The KNN weights are zero except for the five data points nearest to x_0, where they are 1/5. The GKS weights have largest values near x_0 and decrease to zero as the distance to x_0 increases. The LLR weights are locally linear near x_0, with values that go to zero far from x_0.

7. Summary and discussion

While the constructed analog (CA) statistical forecast method has previously been described as having properties that are distinct from those of linear regression (LR; Van den Dool 2006), we have shown here that, with comparable treatment of the data, CA and LR produce identical forecasts, and therefore the properties of CA are the same as those of LR. In particular, CA forecasts are linear functions of the predictors and subject to overfitting. When EOF truncation is used in the CA calculation, the resulting forecast is the same as that given by principal component regression (PCR) based on the same EOFs. Likewise, using ridging in the calculation of the CA weights results in the same forecast as does ridge regression. These results were illustrated in an example where sea surface temperature was used to predict the Niño-3.4 index. The CA and PCR forecasts based on the same number of PCs are identical. When many PCs were used, the forecasts showed high variance, even at long leads, but low cross-validated skill, a symptom of overfitting.

The equivalence between LR and CA depends on the precise definition of the weights. Allowing the weights to depend nonlinearly on the data leads naturally to generalizations of CA such as kernel smoothers and local linear regression, which we have illustrated with an example.

In practice, LR forecasts are observed to differ from CA forecasts. Moreover, forecasts from different implementations of LR also differ. For instance, LR-based statistical forecasts of ENSO, including CA, have quite different properties (Barnston et al. 2012). Use of distinct datasets may explain some of these differences. However, it must be recognized that many linear regression forecasts, with significant variations in skill, can be constructed from a given dataset of predictors and predictands. There are two primary sources of variety.
First, the predictors or predictands can be truncated, and the regression developed on the truncated data. Principal component analysis and canonical correlation
analysis are commonly used methods for truncating the data that enter a LR. The resulting forecasts depend on the truncation choices, as illustrated here in the Niño-3.4 example, where the forecasts depend strongly on the number of principal components retained as predictors. Linear inverse models and autoregressive methods usually project both the predictors and predictands onto EOFs (DelSole and Chang 2003); CA generally projects only the predictors, thus leading to different forecasts. Second, there are a variety of methods for estimating the LR coefficients. In addition to the classic least squares method, there are shrinkage methods like ridge and lasso (Hastie et al. 2009). CA often uses ridge; PCR does not, again leading to different forecasts. Appropriate choices of data truncation and coefficient estimation method are key to developing a skillful LR forecast.

Acknowledgments. The authors thank Huug van den Dool for his generous and helpful comments, and two anonymous reviewers for their useful suggestions. MKT is supported by grants from the National Oceanic and Atmospheric Administration (Grants NA05OAR and NA08OAR) and the Office of Naval Research (Grant N). TD gratefully acknowledges support from the NSF, the National Oceanic and Atmospheric Administration (Grant NA09OAR), and the National Aeronautics and Space Administration (Grant NNX09AN50G). The views expressed herein are those of the authors and do not necessarily reflect the views of NOAA or any of its subagencies.

REFERENCES

Barnston, A. G., M. K. Tippett, M. L. L'Heureux, S. Li, and D. G. DeWitt, 2012: Skill of real-time seasonal ENSO model predictions: Is our capability increasing? Bull. Amer. Meteor. Soc., 93.

DelSole, T., 2007: A Bayesian framework for multimodel regression. J. Climate, 20.

——, and P. Chang, 2003: Predictable component analysis, canonical correlation analysis, and autoregressive models. J. Atmos. Sci., 60.

Golub, G. H., and C. F. Van Loan, 1996: Matrix Computations. 3rd ed. The Johns Hopkins University Press, 694 pp.

Hansen, P., 1998: Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion. Society for Industrial and Applied Mathematics, 247 pp.

Hastie, T., R. Tibshirani, and J. Friedman, 2009: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 768 pp.

Hawkins, E., J. Robson, R. Sutton, D. Smith, and N. Keenlyside, 2011: Evaluating the potential for statistical decadal predictions of sea surface temperatures with a perfect model approach. Climate Dyn., 37.

Kaplan, A., M. A. Cane, Y. Kushnir, A. C. Clement, M. B. Blumenthal, and B. Rajagopalan, 1998: Analyses of global sea surface temperature. J. Geophys. Res., 103 (C9).

Krueger, O., and J.-S. Von Storch, 2011: A simple empirical model for decadal climate prediction. J. Climate, 24.

Maurer, E. P., and H. G. Hidalgo, 2008: Utility of daily vs. monthly large-scale climate data: An intercomparison of two statistical downscaling methods. Hydrol. Earth Syst. Sci., 12.

Penland, C., and T. Magorian, 1993: Prediction of Niño-3 sea surface temperatures using linear inverse modeling. J. Climate, 6.

Robertson, A. W., J.-H. Qian, M. K. Tippett, V. Moron, and A. Lucero, 2012: Downscaling of seasonal rainfall over the Philippines: Dynamical versus statistical approaches. Mon. Wea. Rev., 140.

Smith, T. M., and R. W. Reynolds, 2004: Improved extended reconstruction of SST. J. Climate, 17.

Van den Dool, H., 1994: Searching for analogues, how long must we wait? Tellus, 46A.

——, 2006: Empirical Methods in Short-Term Climate Prediction. Oxford University Press, 240 pp.

——, J. Huang, and Y. Fan, 2003: Performance and analysis of the constructed analogue method applied to U.S. soil moisture. J. Geophys. Res., 108, 8617, doi:10.1029/2002JD.
More informationExtended-Range Prediction with Low-Dimensional, Stochastic-Dynamic Models: A Data-driven Approach
DISTRIBUTION STATEMENT A: Distribution approved for public release; distribution is unlimited. Extended-Range Prediction with Low-Dimensional, Stochastic-Dynamic Models: A Data-driven Approach Michael
More informationMachine Learning. B. Unsupervised Learning B.2 Dimensionality Reduction. Lars Schmidt-Thieme, Nicolas Schilling
Machine Learning B. Unsupervised Learning B.2 Dimensionality Reduction Lars Schmidt-Thieme, Nicolas Schilling Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University
More informationObserved ENSO teleconnections with the South American monsoon system
ATMOSPHERIC SCIENCE LETTERS Atmos. Sci. Let. 11: 7 12 (2010) Published online 8 January 2010 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/asl.245 Observed ENSO teleconnections with the
More information8.6 Bayesian neural networks (BNN) [Book, Sect. 6.7]
8.6 Bayesian neural networks (BNN) [Book, Sect. 6.7] While cross-validation allows one to find the weight penalty parameters which would give the model good generalization capability, the separation of
More informationInverse Theory. COST WaVaCS Winterschool Venice, February Stefan Buehler Luleå University of Technology Kiruna
Inverse Theory COST WaVaCS Winterschool Venice, February 2011 Stefan Buehler Luleå University of Technology Kiruna Overview Inversion 1 The Inverse Problem 2 Simple Minded Approach (Matrix Inversion) 3
More informationNOTES AND CORRESPONDENCE. Improving Week-2 Forecasts with Multimodel Reforecast Ensembles
AUGUST 2006 N O T E S A N D C O R R E S P O N D E N C E 2279 NOTES AND CORRESPONDENCE Improving Week-2 Forecasts with Multimodel Reforecast Ensembles JEFFREY S. WHITAKER AND XUE WEI NOAA CIRES Climate
More informationStructure in Data. A major objective in data analysis is to identify interesting features or structure in the data.
Structure in Data A major objective in data analysis is to identify interesting features or structure in the data. The graphical methods are very useful in discovering structure. There are basically two
More informationNonlinear atmospheric teleconnections
GEOPHYSICAL RESEARCH LETTERS, VOL.???, XXXX, DOI:10.1029/, Nonlinear atmospheric teleconnections William W. Hsieh, 1 Aiming Wu, 1 and Amir Shabbar 2 Neural network models are used to reveal the nonlinear
More informationSeasonal forecasts presented by:
Seasonal forecasts presented by: Latest Update: 9 February 2019 The seasonal forecasts presented here by Seasonal Forecast Worx are based on forecast output of the coupled ocean-atmosphere models administered
More informationSome theoretical considerations on predictability of linear stochastic dynamics
Tellus (2003), 55A, 148 157 Copyright C Blackwell Munksgaard, 2003 Printed in UK. All rights reserved TELLUS ISSN 0280 6495 Some theoretical considerations on predictability of linear stochastic dynamics
More informationCHAPTER 2 DATA AND METHODS. Errors using inadequate data are much less than those using no data at all. Charles Babbage, circa 1850
CHAPTER 2 DATA AND METHODS Errors using inadequate data are much less than those using no data at all. Charles Babbage, circa 185 2.1 Datasets 2.1.1 OLR The primary data used in this study are the outgoing
More informationSeasonal forecasts presented by:
Seasonal forecasts presented by: Latest Update: 10 November 2018 The seasonal forecasts presented here by Seasonal Forecast Worx are based on forecast output of the coupled ocean-atmosphere models administered
More informationAverage Predictability Time. Part II: Seamless Diagnoses of Predictability on Multiple Time Scales
1188 J O U R N A L O F T H E A T M O S P H E R I C S C I E N C E S VOLUME 66 Average Predictability Time. Part II: Seamless Diagnoses of Predictability on Multiple Time Scales TIMOTHY DELSOLE George Mason
More informationChap.11 Nonlinear principal component analysis [Book, Chap. 10]
Chap.11 Nonlinear principal component analysis [Book, Chap. 1] We have seen machine learning methods nonlinearly generalizing the linear regression method. Now we will examine ways to nonlinearly generalize
More informationNOTES AND CORRESPONDENCE
5666 J O U R N A L O F C L I M A T E VOLUME 20 NOTES AND CORRESPONDENCE Comments on Testing the Fidelity of Methods Used in Proxy-Based Reconstructions of Past Climate : The Role of the Standardization
More informationLecture 6: Methods for high-dimensional problems
Lecture 6: Methods for high-dimensional problems Hector Corrada Bravo and Rafael A. Irizarry March, 2010 In this Section we will discuss methods where data lies on high-dimensional spaces. In particular,
More informationThe Coupled Model Predictability of the Western North Pacific Summer Monsoon with Different Leading Times
ATMOSPHERIC AND OCEANIC SCIENCE LETTERS, 2012, VOL. 5, NO. 3, 219 224 The Coupled Model Predictability of the Western North Pacific Summer Monsoon with Different Leading Times LU Ri-Yu 1, LI Chao-Fan 1,
More informationWhy Has the Land Memory Changed?
3236 JOURNAL OF CLIMATE VOLUME 17 Why Has the Land Memory Changed? QI HU ANDSONG FENG Climate and Bio-Atmospheric Sciences Group, School of Natural Resource Sciences, University of Nebraska at Lincoln,
More informationIV. Matrix Approximation using Least-Squares
IV. Matrix Approximation using Least-Squares The SVD and Matrix Approximation We begin with the following fundamental question. Let A be an M N matrix with rank R. What is the closest matrix to A that
More informationTHE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR
THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR 1. Definition Existence Theorem 1. Assume that A R m n. Then there exist orthogonal matrices U R m m V R n n, values σ 1 σ 2... σ p 0 with p = min{m, n},
More informationEnsemble Hindcasts of ENSO Events over the Past 120 Years Using a Large Number of Ensembles
ADVANCES IN ATMOSPHERIC SCIENCES, VOL. 26, NO. 2, 2009, 359 372 Ensemble Hindcasts of ENSO Events over the Past 120 Years Using a Large Number of Ensembles ZHENG Fei 1 (x ), ZHU Jiang 2 (Á ô), WANG Hui
More informationB553 Lecture 5: Matrix Algebra Review
B553 Lecture 5: Matrix Algebra Review Kris Hauser January 19, 2012 We have seen in prior lectures how vectors represent points in R n and gradients of functions. Matrices represent linear transformations
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationCh.3 Canonical correlation analysis (CCA) [Book, Sect. 2.4]
Ch.3 Canonical correlation analysis (CCA) [Book, Sect. 2.4] With 2 sets of variables {x i } and {y j }, canonical correlation analysis (CCA), first introduced by Hotelling (1936), finds the linear modes
More informationDS-GA 1002 Lecture notes 12 Fall Linear regression
DS-GA Lecture notes 1 Fall 16 1 Linear models Linear regression In statistics, regression consists of learning a function relating a certain quantity of interest y, the response or dependent variable,
More informationSingular Value Decomposition. 1 Singular Value Decomposition and the Four Fundamental Subspaces
Singular Value Decomposition This handout is a review of some basic concepts in linear algebra For a detailed introduction, consult a linear algebra text Linear lgebra and its pplications by Gilbert Strang
More informationEnsemble square-root filters
Ensemble square-root filters MICHAEL K. TIPPETT International Research Institute for climate prediction, Palisades, New Yor JEFFREY L. ANDERSON GFDL, Princeton, New Jersy CRAIG H. BISHOP Naval Research
More informationChapter 3 Transformations
Chapter 3 Transformations An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Linear Transformations A function is called a linear transformation if 1. for every and 2. for every If we fix the bases
More informationUncertainty in Ranking the Hottest Years of U.S. Surface Temperatures
1SEPTEMBER 2013 G U T T O R P A N D K I M 6323 Uncertainty in Ranking the Hottest Years of U.S. Surface Temperatures PETER GUTTORP University of Washington, Seattle, Washington, and Norwegian Computing
More informationJuly Forecast Update for Atlantic Hurricane Activity in 2017
July Forecast Update for Atlantic Hurricane Activity in 2017 Issued: 4 th July 2017 by Professor Mark Saunders and Dr Adam Lea Dept. of Space and Climate Physics, UCL (University College London), UK Forecast
More informationIntroduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin
1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)
More informationInter-comparison of Historical Sea Surface Temperature Datasets. Yasunaka, Sayaka (CCSR, Univ. of Tokyo, Japan) Kimio Hanawa (Tohoku Univ.
Inter-comparison of Historical Sea Surface Temperature Datasets Yasunaka, Sayaka (CCSR, Univ. of Tokyo, Japan) Kimio Hanawa (Tohoku Univ., Japan) < Background > Sea surface temperature (SST) is the observational
More informationRobust GEFA Assessment of Climate Feedback to SST EOF Modes
ADVANCES IN ATMOSPHERIC SCIENCES, VOL. 28, NO. 4, 2011, 907 912 Robust GEFA Assessment of Climate Feedback to SST EOF Modes FAN Lei 1 ( [), Zhengyu LIU 2,3 (4ffi ), and LIU Qinyu 1 (4 ) 1 Physical Oceanography
More informationLinear Models. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.
Linear Models DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Linear regression Least-squares estimation
More informationBred Vectors: A simple tool to understand complex dynamics
Bred Vectors: A simple tool to understand complex dynamics With deep gratitude to Akio Arakawa for all the understanding he has given us Eugenia Kalnay, Shu-Chih Yang, Malaquías Peña, Ming Cai and Matt
More informationThe two types of ENSO in CMIP5 models
GEOPHYSICAL RESEARCH LETTERS, VOL. 39,, doi:10.1029/2012gl052006, 2012 The two types of ENSO in CMIP5 models Seon Tae Kim 1 and Jin-Yi Yu 1 Received 12 April 2012; revised 14 May 2012; accepted 15 May
More informationDepartment of Civil, Construction and Environmental Engineering, North Carolina State University, Raleigh, North Carolina
JUNE 2010 D E V I N E N I A N D S A N K A R A S U B R A M A N I A N 2447 Improving the Prediction of Winter Precipitation and Temperature over the Continental United States: Role of the ENSO State in Developing
More informationLecture Notes 1: Vector spaces
Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector
More informationFig.3.1 Dispersion of an isolated source at 45N using propagating zonal harmonics. The wave speeds are derived from a multiyear 500 mb height daily
Fig.3.1 Dispersion of an isolated source at 45N using propagating zonal harmonics. The wave speeds are derived from a multiyear 500 mb height daily data set in January. The four panels show the result
More informationWhat is one-month forecast guidance?
What is one-month forecast guidance? Kohshiro DEHARA (dehara@met.kishou.go.jp) Forecast Unit Climate Prediction Division Japan Meteorological Agency Outline 1. Introduction 2. Purposes of using guidance
More informationAugust Forecast Update for Atlantic Hurricane Activity in 2015
August Forecast Update for Atlantic Hurricane Activity in 2015 Issued: 5 th August 2015 by Professor Mark Saunders and Dr Adam Lea Dept. of Space and Climate Physics, UCL (University College London), UK
More informationBindel, Fall 2016 Matrix Computations (CS 6210) Notes for
1 A cautionary tale Notes for 2016-10-05 You have been dropped on a desert island with a laptop with a magic battery of infinite life, a MATLAB license, and a complete lack of knowledge of basic geometry.
More informationSingular Value Decomposition
Chapter 6 Singular Value Decomposition In Chapter 5, we derived a number of algorithms for computing the eigenvalues and eigenvectors of matrices A R n n. Having developed this machinery, we complete our
More information4 Bias-Variance for Ridge Regression (24 points)
Implement Ridge Regression with λ = 0.00001. Plot the Squared Euclidean test error for the following values of k (the dimensions you reduce to): k = {0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500,
More informationSVD, PCA & Preprocessing
Chapter 1 SVD, PCA & Preprocessing Part 2: Pre-processing and selecting the rank Pre-processing Skillicorn chapter 3.1 2 Why pre-process? Consider matrix of weather data Monthly temperatures in degrees
More informationSeparation of a Signal of Interest from a Seasonal Effect in Geophysical Data: I. El Niño/La Niña Phenomenon
International Journal of Geosciences, 2011, 2, **-** Published Online November 2011 (http://www.scirp.org/journal/ijg) Separation of a Signal of Interest from a Seasonal Effect in Geophysical Data: I.
More informationPredictability of Week 3-4 Average Temperature and Precipitation over the. Contiguous United States
1 Predictability of Week 3-4 Average Temperature and Precipitation over the 2 Contiguous United States 3 Timothy DelSole, Laurie Trenary, 4 5 George Mason University, Fairfax, Virginia and Center for Ocean-Land-Atmosphere
More informationAnalysis of Incomplete Climate Data: Estimation of Mean Values and Covariance Matrices and Imputation of Missing Values
1MARCH 001 SCHNEIDER 853 Analysis of Incomplete Climate Data: Estimation of Mean Values and Covariance Matrices and Imputation of Missing Values APIO SCHNEIDER Atmospheric and Oceanic Sciences Program,
More informationAnuMS 2018 Atlantic Hurricane Season Forecast
AnuMS 2018 Atlantic Hurricane Season Forecast Issued: May 10, 2018 by Dale C. S. Destin (follow @anumetservice) Director (Ag), Antigua and Barbuda Meteorological Service (ABMS) The *AnuMS (Antigua Met
More informationAnuMS 2018 Atlantic Hurricane Season Forecast
AnuMS 2018 Atlantic Hurricane Season Forecast : June 11, 2018 by Dale C. S. Destin (follow @anumetservice) Director (Ag), Antigua and Barbuda Meteorological Service (ABMS) The *AnuMS (Antigua Met Service)
More informationPCA, Kernel PCA, ICA
PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per
More informationCoupled ocean-atmosphere ENSO bred vector
Coupled ocean-atmosphere ENSO bred vector Shu-Chih Yang 1,2, Eugenia Kalnay 1, Michele Rienecker 2 and Ming Cai 3 1 ESSIC/AOSC, University of Maryland 2 GMAO, NASA/ Goddard Space Flight Center 3 Dept.
More informationCombining Deterministic and Probabilistic Methods to Produce Gridded Climatologies
Combining Deterministic and Probabilistic Methods to Produce Gridded Climatologies Michael Squires Alan McNab National Climatic Data Center (NCDC - NOAA) Asheville, NC Abstract There are nearly 8,000 sites
More informationSeasonal Climate Watch June to October 2018
Seasonal Climate Watch June to October 2018 Date issued: May 28, 2018 1. Overview The El Niño-Southern Oscillation (ENSO) has now moved into the neutral phase and is expected to rise towards an El Niño
More informationLeast Squares Optimization
Least Squares Optimization The following is a brief review of least squares optimization and constrained optimization techniques. I assume the reader is familiar with basic linear algebra, including the
More information15 Singular Value Decomposition
15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationVector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.
Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar
More informationLinear Systems. Carlo Tomasi
Linear Systems Carlo Tomasi Section 1 characterizes the existence and multiplicity of the solutions of a linear system in terms of the four fundamental spaces associated with the system s matrix and of
More informationApril Forecast Update for Atlantic Hurricane Activity in 2016
April Forecast Update for Atlantic Hurricane Activity in 2016 Issued: 5 th April 2016 by Professor Mark Saunders and Dr Adam Lea Dept. of Space and Climate Physics, UCL (University College London), UK
More informationClimate Outlook for Pacific Islands for May - October 2015
The APEC CLIMATE CENTER Climate Outlook for Pacific Islands for May - October 2015 BUSAN, 23 April 2015 Synthesis of the latest model forecasts for May - October 2015 (MJJASO) at the APEC Climate Center
More informationDISCUSSION OF: A STATISTICAL ANALYSIS OF MULTIPLE TEMPERATURE PROXIES: ARE RECONSTRUCTIONS OF SURFACE TEMPERATURES OVER THE LAST 1000 YEARS RELIABLE?
Submitted to the Annals of Applied Statistics arxiv: math.pr/0000000 DISCUSSION OF: A STATISTICAL ANALYSIS OF MULTIPLE TEMPERATURE PROXIES: ARE RECONSTRUCTIONS OF SURFACE TEMPERATURES OVER THE LAST 1000
More information28th Conference on Hurricanes and Tropical Meteorology, 28 April 2 May 2008, Orlando, Florida.
P2B. TROPICAL INTENSITY FORECASTING USING A SATELLITE-BASED TOTAL PRECIPITABLE WATER PRODUCT Mark DeMaria* NOAA/NESDIS/StAR, Fort Collins, CO Jeffery D. Hawkins Naval Research Laboratory, Monterey, CA
More informationCS6964: Notes On Linear Systems
CS6964: Notes On Linear Systems 1 Linear Systems Systems of equations that are linear in the unknowns are said to be linear systems For instance ax 1 + bx 2 dx 1 + ex 2 = c = f gives 2 equations and 2
More informationLecture: Face Recognition and Feature Reduction
Lecture: Face Recognition and Feature Reduction Juan Carlos Niebles and Ranjay Krishna Stanford Vision and Learning Lab Lecture 11-1 Recap - Curse of dimensionality Assume 5000 points uniformly distributed
More informationMultiple Ocean Analysis Initialization for Ensemble ENSO Prediction using NCEP CFSv2
Multiple Ocean Analysis Initialization for Ensemble ENSO Prediction using NCEP CFSv2 B. Huang 1,2, J. Zhu 1, L. Marx 1, J. L. Kinter 1,2 1 Center for Ocean-Land-Atmosphere Studies 2 Department of Atmospheric,
More informationStatistical Downscaling of Pattern Projection Using Multi-Model Output Variables as Predictors
NO.3 KANG Hongwen, ZHU Congwen, ZUO Zhiyan, et al. 293 Statistical Downscaling of Pattern Projection Using Multi-Model Output Variables as Predictors KANG Hongwen 1 (xù ), ZHU Congwen 2 (6l ), ZUO Zhiyan
More informationData Mining and Analysis: Fundamental Concepts and Algorithms
Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA
More informationPotential impact of initialization on decadal predictions as assessed for CMIP5 models
GEOPHYSICAL RESEARCH LETTERS, VOL. 39,, doi:10.1029/2012gl051974, 2012 Potential impact of initialization on decadal predictions as assessed for CMIP5 models Grant Branstator 1 and Haiyan Teng 1 Received
More informationNOTES AND CORRESPONDENCE. On the Seasonality of the Hadley Cell
1522 JOURNAL OF THE ATMOSPHERIC SCIENCES VOLUME 60 NOTES AND CORRESPONDENCE On the Seasonality of the Hadley Cell IOANA M. DIMA AND JOHN M. WALLACE Department of Atmospheric Sciences, University of Washington,
More informationSeasonal Climate Watch September 2018 to January 2019
Seasonal Climate Watch September 2018 to January 2019 Date issued: Aug 31, 2018 1. Overview The El Niño-Southern Oscillation (ENSO) is still in a neutral phase and is still expected to rise towards an
More informationPreferred spatio-temporal patterns as non-equilibrium currents
Preferred spatio-temporal patterns as non-equilibrium currents Escher Jeffrey B. Weiss Atmospheric and Oceanic Sciences University of Colorado, Boulder Arin Nelson, CU Baylor Fox-Kemper, Brown U Royce
More informationApril Forecast Update for North Atlantic Hurricane Activity in 2019
April Forecast Update for North Atlantic Hurricane Activity in 2019 Issued: 5 th April 2019 by Professor Mark Saunders and Dr Adam Lea Dept. of Space and Climate Physics, UCL (University College London),
More informationSkill of multi-model ENSO probability forecasts. October 19, 2007
Skill of multi-model ENSO probability forecasts MICHAEL K. TIPPETT AND ANTHONY G. BARNSTON International Research Institute for Climate and Society, Palisades, NY, USA October 19, 2007 Corresponding author
More informationSeasonal Climate Watch July to November 2018
Seasonal Climate Watch July to November 2018 Date issued: Jun 25, 2018 1. Overview The El Niño-Southern Oscillation (ENSO) is now in a neutral phase and is expected to rise towards an El Niño phase through
More informationEl Niño Southern Oscillation: Magnitudes and asymmetry
JOURNAL OF GEOPHYSICAL RESEARCH, VOL. 115,, doi:10.1029/2009jd013508, 2010 El Niño Southern Oscillation: Magnitudes and asymmetry David H. Douglass 1 Received 5 November 2009; revised 10 February 2010;
More informationAnuMS 2018 Atlantic Hurricane Season Forecast
AnuMS 2018 Atlantic Hurricane Season : August 12, 2018 by Dale C. S. Destin (follow @anumetservice) Director (Ag), Antigua and Barbuda Meteorological Service (ABMS) The *AnuMS (Antigua Met Service) is
More informationLearning with Singular Vectors
Learning with Singular Vectors CIS 520 Lecture 30 October 2015 Barry Slaff Based on: CIS 520 Wiki Materials Slides by Jia Li (PSU) Works cited throughout Overview Linear regression: Given X, Y find w:
More informationLinear Regression Linear Regression with Shrinkage
Linear Regression Linear Regression ith Shrinkage Introduction Regression means predicting a continuous (usually scalar) output y from a vector of continuous inputs (features) x. Example: Predicting vehicle
More informationMachine Learning (BSMC-GA 4439) Wenke Liu
Machine Learning (BSMC-GA 4439) Wenke Liu 02-01-2018 Biomedical data are usually high-dimensional Number of samples (n) is relatively small whereas number of features (p) can be large Sometimes p>>n Problems
More informationJuly Forecast Update for North Atlantic Hurricane Activity in 2018
July Forecast Update for North Atlantic Hurricane Activity in 2018 Issued: 5 th July 2018 by Professor Mark Saunders and Dr Adam Lea Dept. of Space and Climate Physics, UCL (University College London),
More informationPrincipal Component Analysis (PCA) of AIRS Data
Principal Component Analysis (PCA) of AIRS Data Mitchell D. Goldberg 1, Lihang Zhou 2, Walter Wolf 2 and Chris Barnet 1 NOAA/NESDIS/Office of Research and Applications, Camp Springs, MD 1 QSS Group Inc.
More informationInitialization and Predictability of a Coupled ENSO Forecast Model*
MAY 1997 CHEN ET AL. 773 Initialization and Predictability of a Coupled ENSO Forecast Model* DAKE CHEN, STEPHEN E. ZEBIAK, AND MARK A. CANE Lamont-Doherty Earth Observatory, Columbia University, Palisades,
More informationFiltering of GCM simulations of Sahel precipitation
GEOPHYSICAL RESEARCH LETTERS, VOL.???, XXXX, DOI:10.1029/, Filtering of GCM simulations of Sahel precipitation Michael K. Tippett International Research Institute for Climate and Society, Columbia University,
More information
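The equivalence between constructed analog forecasts and linear regression described above can be sketched numerically. This is an illustrative example with synthetic data, not code from the paper; all dimensions and variable names are invented for the demonstration. The CA weights are obtained as the least-squares solution making a linear combination of past predictors match the current predictor, and the forecast applies the same weights to the past predictands:

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative setup: 5-dimensional predictor, 3-dimensional predictand, 20 past cases.
X = rng.standard_normal((5, 20))   # columns are past predictor values
Y = rng.standard_normal((3, 20))   # columns are the corresponding past predictand values
x0 = rng.standard_normal(5)        # current predictor value

# Constructed analog: choose weights w so that X @ w approximates x0
# (least-squares via the pseudoinverse), then apply the same weights
# to the past predictands to form the forecast.
w = np.linalg.pinv(X) @ x0
y_ca = Y @ w

# Linear regression: fit Y ~ B X by least squares, then predict from x0.
B = Y @ np.linalg.pinv(X)
y_lr = B @ x0

# The two forecasts agree (to rounding), since y_ca = Y (pinv(X) x0)
# and y_lr = (Y pinv(X)) x0 differ only in the order of multiplication.
assert np.allclose(y_ca, y_lr)
```

The agreement follows directly from associativity of matrix multiplication, which is the core of the equivalence argument: the CA weight construction and the regression fit both reduce to the same pseudoinverse of the predictor data.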