Spatial Correlation of Acceleration Spectral Ordinates from European Data

Spatial Correlation of Acceleration Spectral Ordinates from European Data S. Esposito & I. Iervolino Università degli Studi di Napoli Federico II, Italy SUMMARY Spatial correlation of spectral acceleration (Sa) is required in modelling hazard for risk assessment of spatially distributed systems. The estimation of an appropriate correlation model is still a research task in earthquake engineering, because of several issues related to: data, statistical approaches and estimation tools, and tests to evaluate the estimated models. This paper focuses on the evaluation of the spatial correlation of Sa using records belongs the European Strong motion Database (ESD). The estimation has been performed on multiple earthquakes, considering a ground motion prediction equations (GMPE) fitted to the same records. Results show that the decay rate of correlation, as a function of inter-site distance, tends to increase with structural period. Based on that, a simple linear relationship for the correlation length as a function of structural period is derived, discussed, and compared with previous research on the same topic. Keywords: spectral acceleration, ground motion residuals, semivariogram. 1. INTRODUCTION The modeling of spatial correlation of ground motion intensity measures (IMs) has become a relevant topic in seismic risk analysis due to the requirement to extend risk assessment, usually related to sitespecific structures, to spatially distributed systems such as building portfolios and lifelines. In particular, on the hazard side, probabilistic seismic hazard analysis (PSHA; McGuire, 4) refers to ground motion prediction equations (GMPEs), which provide probabilistic distribution of the chosen IM (e.g., elastic spectral acceleration, Sa) conditional on earthquake magnitude, source-to-site distance, and other parameters such as local geological conditions. Since it was demonstrated that GMPE residuals are spatially correlated (e.g., Boore et al., 3), it may be the case that neglecting such a correlation bias the loss assessment of distributed systems (e.g., Esposito, 11). Spatial correlation models available in literature (e.g., Goda and Hong, 8; Jayaram and Baker, 9; Sokolov et al, 1) depend on inter-site separation distance and provide the limit at which correlation may technically considered to be lost (i.e., distance beyond which IMs may be considered uncorrelated). Moreover, if Sa is of concern, the correlation may depend also on the oscillation period Sa refers to. It is important to note that those models provide different correlation length even if estimating correlation for the same IM. This is supposed to depend on several factors such as the dataset used, the GMPE chosen to compute intraevent residuals, and the working assumptions of the estimation. For example, Sokolov et al. (1), starting from the strong-motion database of TSMIP network in Taiwan, investigated the dependency of spatial correlation on site classes and geological structures, asserting that a single generalized spatial model may not be adequate for all of Taiwan territory. In some cases (e.g., Wang and Takada, 5, Jayaram and Baker, 9) existing GMPEs are used, while, in others (e.g., Goda and Hong, 8, Goda and Atkinson, 9, Sokolov et al., 1), ad-hoc fit on the chosen dataset is adopted. Generally, regressions analysis used to develop prediction equations does not incorporate the correlation structure of residuals as a hypothesis. Hong et al. (9) and Jayaram and Baker (1), evaluated the influence of considering the correlation in fitting a

GMPE, finding a minor influence on regression coefficients and a more significant effect on the variance components. Goda and Atkinson (1), investigated the influence of the estimation approach, emphasizing its importance when residuals are strongly correlated. Herein, the evaluation of the spatial correlation of Sa (5% damped) residuals is carried out using the European Strong motion Database (ESD; http://www.isesd.hi.is/). The GMPE with respect to which residuals are computed is those of Akkar and Bommer (1) and only the records used to fit the considered GMPE are employed to estimate spatial correlation. In the paper, the first part briefly describes the geostatistical framework adopted to estimate the correlation. Subsequently, the working assumptions and a description of the considered dataset, are given. Then, results of the estimation of correlation lengths for spectral acceleration at nine structural periods, ranging from.1 s to.85 s are discussed. Finally, a simple linear relationship for the correlation length as a function of structural period is derived and compared with previous research on the same topic.. MODELING SPATIAL CORRELATION.1. Geostatistical framework The spatial modeling of IMs for earthquake engineering purposes is mostly based on estimation of correlations between residuals of ground motion parameters at two different sites via geostatistical tools. In particular, focusing on spectral acceleration, GMPEs generally model the logarithms of Sa for a specific structural period, T, and related heterogeneity, at a site p due to earthquake as in Eqn..1: ( ) log Sa( T ) = log Sa( T ) M, R, θ + η + ε (.1) where log Sa( T ) (,, ) ( M ), source-to-site distance ( R ), and others ( ) M R θ is the mean of the logs conditional on parameters such as magnitude θ ; η denotes the inter-event residual, which is a constant term for all sites in a given earthquake and represents the systematic deviation of the specific seismic event from the overall mean; and ε is the intraevent variability of ground motion. ε and η are usually assumed to be independent random variables, normally distributed with zero mean and standard deviation σ intra and σ inter, respectively. Consequently, log Sa( T ) is modeled as a normal random variable with mean log Sa( T ) ( M, R, θ ) and standard deviation σ T, where T inter intra σ = σ + σ. If hazard assessment at two or more sites at the same time is of concern, the oint probability density function (PDF) of IMs at all locations can be conveniently modeled with a multivariate normal distribution (Jayaram and Baker, 8). In this case, it is assumed that the logs of IM form a Gaussian random field (GRF), defined as a set of random variables, one for each site u in the study area S R u p = 1,, n are modeled as. In particular, ground motion parameters at n spatial locations { } a collection of random variables characterized by a mean vector provided by the GMPE and a covariance matrix, Σ, as in Eqn.., where the first term produces perfectly correlated inter-event residuals (Malhotra, 8), while the second term (symmetrical) produces spatially correlated intraevent residuals. In Eqn.., spatial correlation for intraevent residuals is function of the separation distance h (under the hypothesis of second-order stationarity and isotropy of the GRF); i.e., the spatial heterogeneity does not depend on the direction considered. p

( h ) ρ( h ) 1 1 1 1 ρ 1 1n 1 1 1 ρ ( h1 ) 1 σ inter σ Σ = + intra 1 1 1 ρ( h ) ρ( h ) 1 n1 n (.) The covariance matrix typically depends on parameters obtained from a non-negative definite spatial parametric covariance model (Cressie, 1993) that may be fitted on empirical observations. To quantify spatial variability of geo-referenced data the semivariogram is often used, γ ( h), that under the hypothesis of second-order stationary and isotropy mentioned above, it is strictly related to the correlation coefficient and it is defined as in Eqn. (.3): where ( ) ( h) = Var( ) 1 ( h) γ ε ρ (.3) Var ε is the homoscedastic variance of intraevent residuals, and ρ ( h) denotes the spatial correlation coefficient between intraevent residuals at two generic sites separated by h -distance. The estimation of correlation based on semivariograms usually deploys in three steps: (1) computing the empirical semivariogram 1 ˆ γ ( h) ; () choosing a functional form to fit it; and (3) estimating the parameters of the chosen function fitting empirical data. Regarding point (1): empirical semivariograms are computed as a function of site-to-site separation distance, with different possible estimators. The classical estimator is the method-of-moments ε u+ h ε u (Matheron, 196), which is defined for an isotropic random field in Eqn. (.4), where ( ) ( ) is the difference between intraevent residuals at sites separated by h -distance value, N ( h) is the set of pairs of sites, separated by the same h -distance in the dataset, at which the residual of interest are measured, and N ( h) is the cardinal of N ( h). 1 ˆ γ ( h) = ( h) ( ) ( h) N ε + ε u u (.4) N ( h) Since this estimator can be badly affected by atypical observations (Cressie, 1993), Cressie and Hawkins (198) proposed a more robust, in the statistical sense (i.e., less sensitive to outliers) estimator. Consistent with Esposito and Iervolino (11), both estimators are used in the following for the evaluation of intraevent spatial correlation of Sa( T ). Because earthquake records are not available on a regular grid, to compute the empirical semivariogram it is necessary to define tolerance bins around each possible h -value and, hence, to divide data in classes. For the selection of distance a rule of thumb is to choose the maximum bin size as a half of the maximum distance between sites in the dataset, and to set the number of bins so that there are at least thirty pairs per bin (Journel and Huibregts, 1978). Point (), concerning the interpretation of experimental semivariograms, consists in the identification of a model among the family of functions able to capture and emulate its trend. The three basic stationary and isotropic models are: exponential, spherical, and Gaussian (Goovartes, 1997). In 1 As assumed in Esposito and Iervolino (11), in this work the estimation of spatial correlation was performed using all data available from different earthquakes and regions (deemed homogeneous), differently from studies focusing on earthquake-specific data (e.g., Jayaram and Baker, 9). This assumption allows to neglect the subscript in the following equations assuming a common semivariogram for different events (i.e. invariant through earthquakes).

particular, the exponential model which is the most common one in literature, is described in Eqn. (.5): γ 3 hb / ( h) c ce ( 1 e ) = + (.5) where c is defined nugget, that is, the limit value of the semivariogram when h tends to zero, c e is the sill, or the population variance of the random field (Barnes, 1991), and b is the practical range, or correlation length, defined as the inter-site distance at which γ ( h) equals 95% of the sill, or in other words, the distance at which the correlation may be considered technically lost and residuals of ground motion intensity may be considered independent (under the GRF assumption). Regarding point (3): several goodness-of-fit criteria to find the best parametric model have been proposed in geostatistical literature. Studies dealing with earthquake data sometimes use visual or trial and error approaches in order to appropriately model the semivariogram structure at short site-to-site distances, where it is significant (Jayaram and Baker, 9). In this work experimental semivariograms are fitted using recursively the least squares method (LSM) on data obtained with the robust estimator... Estimating correlation on multievent data As assumed in Esposito and Iervolino (11), in this work the estimation of spatial correlation was performed using all data available from different earthquakes and regions. Moreover in order to estimate only one parameter, that is the range, empirical semivariograms were computed starting from normalized intraevent residuals obtained, expressed for a single earthquake and a generic site p, as * p ε = ε σ, where σ p is the standard deviation of the intraevent residual at the site p (in the study, the intraevent standard deviation is common for all sites consistent with GMPEs used to compute residuals ). In fact the standardization enables to not estimate the sill, as it should be equal to one. Therefore, equation (.3) becomes equation (.6), where the superscript represents an empirical estimate. ˆ γ ( h) 1 ˆ ρ( h) = (.6) Normalized intraevent residuals from multiple events (and regions) are, then, pooled to fit a unique correlation model. This because geostatistical estimation needs a relatively large number of data to model the semivariogram (i.e., such to have more than thirty pairs in each h bin), which are not available for individual events in the chosen datasets (to follow). The use of all data implies the assumption of the same isotropic semivariogram with the same parameters for all earthquakes; therefore the experimental semivariogram becomes that of Eqn. (.7) where N ( h ) is the number of pairs in the specific h bin from all earthquakes and the differences, * * ε q ε, are taken considering only pairs from the same earthquake without mixing up residuals from different events. Therefore individual events are kept separated, and pairs in the specific bin are then pooled to obtain empirical semivariogram points. This is formally shown in Eqn. (.8), where n represents the number of records for the th event. There are different alternatives for standardization (e.g., Jayaram and Baker, 1; Goda and Atkinson, 1), but they lead to results which seem to be not significantly affected by a choice with respect to another (Esposito and Iervolino, 11).

1 * * ˆ γ ( h) = ε ε q N h (.7) ( ) N( h) {, * * * * ε, ε q : ε ε q ;, 1, ; 1, } ( ) ( ) N h = = h p q= n = k (.8) 3. DATA AND ANALYSIS 3.1. Dataset The considered recordings from ESD dataset exactly correspond to data used in Esposito and Iervolino (11); i.e., records used to fit the Akkar and Bommer (1) GMPE (except that, only earthquakes for which more than one record was available were considered). The experimental semivariograms were obtained using a bin width of 4 km. Characteristics of the dataset, and distribution of data pairs as a function of separation distance bins, are shown in Fig. 3.1. Magnitude 8 7 6 5 1 3 Number of data pairs 15 1 5 ESD 4 5 1 15 R b (km) 5 1 15 h (km) Figure 3.1. a) The ESD subset with respect to M, R b (i.e. the closest horizontal distance to the vertical proection of the rupture, or Joyner-Boore distance) and local site conditions: rock (3), stiff soil (), soft soil (1), and very soft soil (); b) histograms of the number of data pairs as a function of site-to-site separation distance bin (4 km). 3.. Results The estimation was performed for nine structural periods ranging between.1 s and.85 s. Because the GMPE used to obtain intraevent residuals refers to geometric mean of horizontal components, the correlation was estimated for this IM. Both estimators (classical and robust discussed above) were used; no significant differences, as expected, were found in the shape of the experimental semivariogram. The robust estimator was, then, considered to fit the correlation function. The exponential model was chosen to fit empirical points, since this model is widely adopted in literature. Moreover, the choice of using the same model for all periods allows to compare results and to investigate possible dependency of the model parameters on the fundamental period. Assuming that there is no nugget effect (as this study does not investigate variations at a smaller scale with respect to that of the tolerance); the only parameter to estimate is the range, b in Eqn. (.5). To this aim, LSM was used iteratively to fit the model on the empirical semivariogram. In particular, the range was defined starting from the value estimated by the LSM applied until half of the maximum separation distance of residuals in the dataset. The range obtained from this step was, then, used as a lower bound for the limit distance to apply LSM again on the semivariogram. This latter step allowed to adust the fitting and to obtain the final practical range values, that is, the distances at which correlation may be considered to be lost for each spectral ordinate. Following this procedure, the estimated ranges for the estimated exponential models were retrieved for ESD. Fig. 3. shows empirical semivariograms with both estimations and fitted models and, as a

summary, practical ranges are given in Table 3.1. (a) (b) (c) (d) (e) (f) (g) (h) (i) (a) Sa(.1s) (b) Sa(.s) (c) Sa(.3s) (d) Sa(.5s) (e) Sa(1s) (f) Sa(1.5s) (g) Sa(s) (h) Sa(.5s) (i) Sa(.85s) Figure 3.. Empirical semivariograms and fitted exponential models. 3.3. Predictive equation Empirical results demonstrate that correlation length tends to increase with period. Except for high frequencies at which there is no clear trend. This seems to be consistent with past studies of ground motion coherency (Zerva and Zervas, ). In fact, the coherency describes the degree of correlation between amplitudes and phases angles of two time histories at each of their component frequencies.

Considering that coherency decreases with increasing distance between measuring points and with increasing frequency, it may be reasonable to expect more coherent ground motion, as spectral acceleration evaluated at longer periods exhibits more correlated peak amplitudes. This same issue aspect was also discussed in Jayaram and Baker (9) where, generally, the estimated ranges increased with period. Table 3.1. Estimated ranges for the ESD dataset. Database Period [s] Range [km] ESD.1 13.7. 11.6.3 15.3.5 1.5 1 33.9 1.5 7. 39..5 4.5.85 48.8 A simple linear predictive model was fitted 3 using LSM in order to capture the trend of the range as a function of structural period, as expressed in Eqn. 3.1 and shown in Fig. 3.3a. Thus, based on the results provided, the correlation between normalized intraevent residuals separated by h may be ρ ( ht, ) = exp 3 hb T. obtained as ( ( )) ( ) 11.7 1.7 bt = + T (3.1) Finally, to compare results shown herein with existing studies, in Fig. 3.3b estimated ranges are compared with some correlation lengths available in literature; those models were chosen considering: (i) Californian dataset from Goda and Hong (8), (ii) all earthquakes model from Hong et al. (9), and (iii) the predictive model based on all earthquakes from Jayaram and Baker (9a). Results of this study seem to be comparable with ranges estimated in literature in terms of both trend as a function of T, and the estimated values of correlation length. It is worth to mention, for completeness, that an exception in this respect is represented by the Goda and Atkinson (9, 1) models, in which ranges are larger (never below about 6 km) and the dependence of the correlation on period is not significant. 5 5 4 4 Range (km) 3 1 Estimated Ranges ESD Linear Model ESD.5 1 1.5.5 3 T (sec) Range (km) 3 Estimated Ranges ESD 1 Goda and Hong 8 Jayaram and Baker 9 Hong et al. 9.5 1 1.5.5 3 T (sec) Figure 3.3 (a) Linear model and estimated; (b) comparison with available models from literature. 3 Note that for the fitting of the predictive model the range obtained in Esposito and Iervolino (11) for PGA has been included. Therefore the estimation was performed based on ten points.

4. CONCLUSIONS The study presented focus on the assessment of intraevent spatial correlation of elastic spectral acceleration at periods ranging between.1 s and.85 s. A subset of the European Strong Motion Database was used to compute residuals starting from a GMPE calibrated on the same data. Consistent with the available literature on the topic, hypotheses of stationarity and isotropy of the random fields were retained to compute experimental semivariograms of standardized intraevent residuals (with respect to the standard deviation estimated by the GMPE). Because a relatively small number of records for each earthquake was available, records from multiple events and regions were pooled to develop models based on large numbers of observations. Exponential correlation models were calibrated on empirical data findings. The choice of using the same functional form (exponential) for all periods allowed to compare results and to investigate possible dependency of the parameters on the period the spectral ordinate refers to. In fact, it was found that, as expected to some extent, practical ranges tend to increase as structural period increase, and simple linear functions were fitted to capture this trend. Results were also compared with results from main references about the same topic finding, generally, consistency. Starting from the correlation models provided, it is possible to get the oint probability density function for Sa( T ) at all locations in a region of interest, for earthquake engineering and seismic risk assessment applications. ACKNOWLEDGEMENT This study was supported by AMRA scarl (http://www.amracenter.com) under the frame of SYNER-G (7th framework programme of the European Community for research, technological development and demonstration activities; proect contract no. 4461). Authors want to thank Sinan Akkar (Middle East Technical University, Turkey) for kindly providing us with datasets used in this study. REFERENCES Akkar, S., and Bommer, J. J. (1). Empirical Equations for the Prediction of PGA, PGV and Spectral Accelerations in Europe, the Mediterranean Region and the Middle East. Seismological Research Letters 81:, 195 6. Barnes, R. J. (1991). The Variogram Sill and the Sample Variance. Mathematical Geolology 3:4, 673-678. Boore, D. M., Gibbs, J. F., Joyner, W. B., Tinsley, J. C., and Ponti, D.J. (3). Estimated ground motion from the 1994 Northridge, California, earthquake at the site of the Interstate 1 and La Cienega Boulevard bridge collapse, West Los Angeles, California. Bulletin of Seismological Societyof America 93:6, 737-751. Cressie, N. (1993). Statistics for Spatial Data. Revised edition. Wiley, New York. Cressie, N., and Hawkins, D. M. (198). Robust estimation of variogram. Mathematical Geolology 1:, 115-15. Esposito, S. and Iervolino, I. (11). PGA and PGV spatial correlation models based on European multievent datasets. Bulletin of Seismological Societyof America 11:5, 53-541. Esposito, S. (11). Systemic seismic risk analysis of gas distribution networks. PhD Thesis, University of Naples Federico II, Naples, Italy. Advisor: I. Iervolino. Available at: wpage.unina.it/iuniervo Goda, K., and Hong, H. P. (8). Spatial correlation of peak ground motions and response spectra. Bulletin of Seismological Societyof America 98:1, 354-365. Goda, K., and Atkinson, G. M. (9). Probabilistic characterization of spatially correlated response spectra for earthquakes in Japan. Bulletin of Seismological Societyof America 99:5, 33-3. Goda, K., and Atkinson, G. M. (1). Intraevent spatial correlation of ground-motion parameters using SK-net data. Bulletin of Seismological Societyof America 1:6, 355-367. Goovartes, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford University Press, New York. Hong, H.P., Zhang, Y., and Goda, K. (9). Effect of spatial correlation on estimated ground motion prediction equations. Bulletin of Seismological Societyof America 99:A, 98-934. Jayaram, N., and Baker, J.W. (8). Statistical tests of the oint distribution of spectral acceleration values. Bulletin of Seismological Societyof America 98:5, 31-43.

Jayaram, N., and Baker, J.W. (9). Correlation model for spatially distributed ground-motion intensities. Earthuake Engineering and Structural Dynamics 38:15, 1687 178. Jayaram, N., and Baker, J.W. (1). Considering spatial correlation in mixed-effects regression and impact on ground-motion models. Bulletin of Seismological Societyof America 1:6, 395-333. Journel, A.G., and Huibregts, C. J. (1978). Mining Geostatistics. Academic Press, London.. Malhotra, P. (8). Seismic design loads from site-specific and aggregate hazard analyses. Bulletin of Seismological Societyof America 98:4, 1849-186. Matheron, G. (196). Traité de géostatistique appliquée. Editions Technip, Paris, France (in French). McGuire, R. K. (4). Seismic Hazard and Risk Analysis. Earthquake Engineering Research Institute, MNO- 1, Oakland, California. Sokolov, V., Wenzel, F., Jean, WY., and Wen, K. L. (1). Uncertainty and spatial correlation of earthquake ground motion in Taiwan. Terrestrial Atmospheric and Oceanic Sciences 1:6, 95-91. Wang, M and Takada, T. (5). Macrospatial correlation model of seismic ground motions. Earthquake Spectra 1:4, 1137-1156. Zerva, A. and Zervas, V. (). Spatial variation of seismic ground motion, Applied Mechanics Reviewers 55:3, 71-97.