Stephen Hernandez, Emily E. Brodsky, and Nicholas J. van der Elst,

1 The Magnitude-Frequency Distribution of Triggered Earthquakes 2 Stephen Hernandez, Emily E. Brodsky, and Nicholas J. van der Elst, 3 Department of Earth and Planetary Sciences, University of California, Santa Cruz, CA 95064 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 Abstract Large dynamic strains carried by seismic waves are known to trigger seismicity far from their source region. It is unknown, however, whether surface waves trigger only small earthquakes, or whether they can also trigger societally significant earthquakes. To address this question, we use a mixing model approach in which total seismicity is decomposed into 2 broad subclasses: triggered events directly initiated or advanced by far-field dynamic stresses, and untriggered/ambient events consisting of everything else. The b value of a composite data set is decomposed into a weighted sum of b values of its constituent components, b T and b U. For populations of Californian earthquakes subjected to the largest dynamic strains, the fraction of earthquakes that are likely triggered, f T, is estimated via inter-event time ratios and used to invert for b T. The distribution of b T is found by nonparametric bootstrap resampling of the data, giving a 90% confidence interval for b T of [0.74, 2.79]. This wide interval indicates the data are consistent with a single-parameter Gutenberg-Richter distribution governing the magnitudes of both triggered and untriggered earthquakes. Triggered earthquakes therefore seem just as likely to be societally significant as any other population of earthquakes. Keywords: Gutenberg-Richter, magnitude-frequency distribution, earthquake triggering, R statistic, bootstrap 22 1

23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 Introduction Transient strains delivered by large amplitude seismic waves are frequently associated with seismicity rate increases in the far-field [Hill et. al., 1993; Velasco et. al., 2008]. This triggering phenomenon is frequently attributed to dynamic stresses since static stresses decay quickly at such large distances ( 2-3 fault lengths) [King et. al., 1994]. One of several outstanding problems associated with remote dynamic triggering is whether the magnitudes of triggered earthquakes are significantly different from the magnitudes of ambient seismicity. For instance, Parsons and Velasco [2011] investigated whether large (M W 7) events are capable of dynamically triggering other large (5 M W 7) earthquakes in the far-field, and found that they were unable to observe near-instantaneous triggering of large events in the far-field. However, stochastic models of seismicity employ a cascade of small events culminating in large, societally significant earthquakes [Felzer et. al., 2002; Felzer et. al., 2004]. With this prospect in mind, we seek to determine if populations of earthquakes including many remotely triggered events have a different magnitude distribution than those that contain very few triggered earthquakes. This article begins with an overview of earthquake magnitude-frequency distributions, followed by a discussion of a mixing model that relates the observable b-value of a mixed group of triggered and untriggered earthquakes to the parameter of interest (the b-value of the triggered events b T ). Utilizing this model requires constructing populations of earthquakes with a known fraction of triggered events. We therefore proceed to form groups of earthquakes that have been affected by dynamic strains of similar amplitude. We can then measure the fraction of triggered events in each of these groups using an inter-event time statistic and proceed to interpret the composite b-value with the aid of a statistical simulation. 45 2

46 47 48 Earthquake Magnitude Distributions The magnitude-frequency distribution of earthquakes over broad swaths of regions and time can be represented by the cumulative Gutenberg-Richter (GR) distribution 49 50 51 log 10 (N > M ) = a b M (1) 52 53 54 where a and b are constants and M M C, where M C is the magnitude of completeness of the catalog [Ishimoto and Iida, 1939; Gutenberg and Richter, 1942]. The Aki-Utsu maximum likelihood estimator for the parameter b is 55 56 b = log 10 (e) M M C (2) 57 58 59 60 61 62 63 64 65 66 where M is the mean magnitude [Aki, 1965; Utsu, 1965]. The Gutenberg-Richter distribution is a representation of the exceedance probabilities for a given range of magnitudes. Differences in b-value, therefore, represent differences in the relative hazard of large earthquakes in different regions. Characterizing and identifying differences in b-values has implications for both hazard analysis and the physical mechanisms of earthquake nucleation [Frohlich and Davis, 1993; Utsu, 1999]. Mixing Model and Fraction Triggered Since the maximum likelihood estimate of the b-value is tied to the mean magnitude, the b value of a dataset composed of both triggered and untriggered events can be expressed as 67 3

68 1 b mix = f T 1 b T + (1 f T ) 1 b U (3) 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 where b mix, b T, b U are the b values of the mixed (composite), triggered, and untriggered events and f T is the estimated fraction of the mixed dataset consisting of triggered earthquakes (Appendix A). To use equation (3), we need to construct populations of earthquakes with differing fractions triggered and then perform two distinct tasks: (1) measure b mix in the combined population and (2) determine the fraction of triggered events. Additionally, we need to find a group of earthquakes with a very low fraction of triggering in order to estimate the untriggered b-value b U. We will accomplish all of these goals by capitalizing on the previous observation that the fraction of triggered events in the far-field is a well-defined function of the peak amplitude of the seismic waves, i.e., larger amplitude waves trigger more events [van der Elst and Brodsky, 2010]. Therefore, we can construct groups of earthquakes which follow dynamic strains from distant earthquakes. The groups of earthquakes following large amplitude shaking will have a large (and measurable) triggered fraction and can be used in conjunction with equation (3) to measure the b-value of triggered earthquakes, b T. Those following small or extremely distant earthquakes will have a very low triggered fraction and can be used to approximate b U. Note that this definition of the untriggered population may include many earthquakes that are triggered by other, unidentified local mainshocks. It has been proposed elsewhere that the fraction of locally triggered events catalog-wide is >80% and so essentially any group of earthquakes will contain aftershocks [Marsan et. al., 2008]. However, for the purpose of this study we are asking if a group of identifiable, remotely triggered events has magnitude behavior that is distinct from 4

90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 other groups of earthquakes. This is a key question for both operational forecasting and physical understanding of the dynamic triggering process. Data and Analysis Method Our data consist of source parameters accessed from the Advanced National Seismic Network (ANSS) composite catalog for Californian and Nevadan seismic networks (32 o to 42 o N, 124 o to 114 o W). Event location, depth, origin time, and preferred magnitude are extracted for 1984-2011. Events with depth greater than 30 km and magnitude less than M2.0 are discarded to yield a complete, uniform catalog. The b value and 95% confidence level (as prescribed by Aki [1965]) for the composite catalog are 1.019 ± 0.005. Our local target catalog is partitioned into grids of size 0.1 x0.1. For each node, we loop through a global catalog of 'test triggers.' Events with magnitude M W >5 and origin time after 1984 qualify for inclusion as a test trigger. Depths are limited to those shallower than 50 km because of their greater relative efficiency at generating surface waves. Additionally, each global test trigger must be at least 400 km from the centroid of the local bin. In order to identify earthquake populations with strong triggering, we need to estimate the local peak ground velocity from a distant earthquake. We make this estimate by inverting the standard empirical magnitude regression for surface waves 107 108 109 log 10 (A) = M S 1.66 log 10 (Δ) 2 (4) 110 111 112 where A is the peak amplitude (in microns) for the 20-second Rayleigh surface wave and Δ is the epicentral distance in degrees [Lay and Wallace, 1994]. We estimate strain, ε, by assuming ε = V/C S, where V is the particle velocity (approximately equal to 2 π f A [Aki and Richards, 2002]) 5

113 114 115 116 117 118 119 and C S is the Raleigh wave phase velocity estimated at 3.5 x 10 9 µm/s. Second order effects due to faulting style and subsequent radiation pattern can result in errors as high ±20% for individual events [Gomberg and Agnew, 1996], but the global average curve will accurately predict the average behavior of a large group of potential triggering events. For a group of target areas that have all experienced a given amplitude of shaking we can measure the inter-event time ratio, or R, as a tool to measure the fractional rate change [van der Elst and Brodsky, 2010]. The time ratio is defined as 120 121 R t 2 t 1 + t 2 (5) 122 123 124 125 126 127 128 129 130 131 where t 1 and t 2 are, respectively, the times to the first earthquake before (with magnitude M 1 ) and the first earthquake after (with magnitude M 2 ) the arrival of seismic energy from some potential far-field trigger (Figure 1). In the presence of triggering, there will be a slight bias toward small values of t 2, leading to a larger proportion of small-valued R-ratios and a mean value of R less than 0.5. Prior work has shown that measurements of the mean value of R can be used to estimate rate changes induced by static stresses near a mainshock [Felzer and Brodsky, 2005] or changes in response to dynamic stresses like the passage of transient surface waves [van der Elst and Brodsky, 2010]. In particular, 132 133 R = 1 δλ 2 [(δλ +1) ln(δλ +1)-δλ] (6) 134 6

135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 where δλ = (λ 2 - λ 1 )/λ 1, and λ 1 and λ 2 are the rates of seismicity before and after the arrival of the purported trigger [van der Elst and Brodsky, 2010, equation (2)]. For the purpose of utilizing equation (3), we need to measure the fraction of a dataset comprised by triggered earthquakes, i.e., f T = where δλ is estimated from measured R ratios (Appendix B). δλ δλ +1 (7) The finite length of an earthquake catalog introduces a bias in the calculation of R values. Unequal conditioning of t 1 s and t 2 s (Figure 1) can be introduced as the test trigger time moves away from the middle of the local catalog. We remove this bias by truncating our local catalog to windows symmetric about the arrival time of each test trigger. The analysis is run for symmetric windows of fixed lengths ±1, 2, and 4 years, and also for symmetric windows of varying lengths. The choice of window length has a significant effect on the number of samples generated overall. For example, a ±1 year window applied to grid nodes with very low seismicity rates (e.g., the San Joaquin Valley in Central California) will exclude R-ratios that a longer window (e.g., ±2 or ±4 years) would not. Conversely, an exceedingly large window has the potential of excluding R ratio calculations using data near the beginning and end of the catalog. Since the choice of window length does not significantly alter the conclusions of the study, we report in the main text results generated by imposing a 2 year symmetric window as it maximizes the number of samples. Additional time window results are in the supplementary materials. Once f T has been measured for each group of earthquakes with similar values of dynamic strain, we then measure the observable, composite b-value (b mix ) and proceed to infer a range of values for b T that is consistent with the data. 157 7

158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 Results Our primary objective is to evaluate the change in b-value, or mean magnitude, of earthquakes relative to imposed strain. However, as discussed above, we first need to establish measurements of the fraction triggered for our selected dataset. As anticipated, the triggering intensity increases monotonically for strains larger than ~10-7 (Figure S2). This reaffirms the results of van der Elst and Brodsky [2010] with a different method to eliminate the finite catalog bias than was originally used. We now proceed to explore the magnitude distribution of each dataset. Figure 2 shows the Aki-Utsu b-value as a function of peak dynamic strain. The b-values for the M 1 and M 2 populations of earthquakes are plotted along with their 95% confidence intervals based on 10,000 bootstraps [Efron and Tibshirani, 1994]. In general, the b-values for either population do not show significant variations with dynamic strains. These results suggest that increased triggering does not significantly perturb the magnitude distribution. We now have measurements of b mix. The final requirement to infer b T is to measure b U (our reference background seismicity parameter). We initially attempted to define b U using events from the population of M 1 events. In that case, we supposed that the magnitude of the event immediately preceding the arrival of dynamic stress was an accurate representation of the steady-state distribution in a system unperturbed by far-field transient waves. As can be seen in Figure 2, it appears using only the M 1 s to measure b biases the measured values above the catalog average b-value of 1.019 ± 0.005. This sampling bias arises from the definition of R. As seismicity rates are high after local mainshocks, inter-event times are low. Therefore, distant earthquakes are unlikely to occur soon enough after a large 8

181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 magnitude, local mainshock to include these events in the estimation of the pre-trigger magnitudes (M 1 s). The local aftershocks effectively shield the larger magnitude mainshock from being sampled as an example of the pre-trigger seismicity by the R-statistic. As a result, the b- value inferred solely from the M 1 s is systematically higher (lower mean magnitude) than the b- value of the M 2 s. A similar shielding effect exists from foreshock activity for the M 2 s and therefore both the M 1 - and M 2 -inferred b-values are systematically higher (mean magnitudes are lower) than the catalog-averaged b-value with systematic offset between them, even for the very low perturbing strains (For details see Appendix C). We therefore need to incorporate this bias into our comparisons by only comparing similarly conditioned magnitudes. Since the M 2 group of magnitudes will contain the highest number of triggered events, the most meaningful reference magnitudes must also come from the M 2 data set. We take the values of M 2 from the lowest strain dataset (which contains the smallest triggered fraction) as the value of b U. Next we apply the mixing model to data from the ANSS catalog to extract b T from the 3 observable variables: b mix, b U, and f T. As an initial example, the triggered fraction and b mix are estimated from the R-ratios and M 2 data associated with the largest 314 strains in the dataset. The largest 314 strains are used because they correspond to strain measurements, ε, greater than 10-6 (the last bin in Figure 2). A reference b-value, b U, against which we will compare estimates of b T, must also be estimated. We choose data from the M 2 magnitudes associated with the 10 5 smallest strains ( ε < ~1.25 x10-9, equivalent to the data in the first bin of M 2 data in Figure 2) to represent the parameter b U. Table 1 summarizes the data for all components of the mixing model for the Northern California-Nevada (Lat. > 37 ), Southern California-Nevada (Lat. < 37 ), and full data sets (32 9

204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 < Lat. < 42 ). We focus here on results from Northern California-Nevada, i.e., latitudes > 37, because these data contain a particularly robust triggering signal. We obtain b mix = 1.26, b U = 1.11, and a value of f T indicating ~25% of the 314 events are triggered earthquakes, where b mix and b U are systematically higher than the catalog-averaged b = 1.019 ± 0.005 because of sampling biases (Appendix C). The b T associated with these data and inverted from the mixing model is 2.03. At face value, this observation suggests that b-values of data with high percentage of triggered earthquakes are on average larger than the b-values of untriggered/ambient seismicity and is consistent with prior results from previous studies addressing this same question [Parsons and Velasco, 2011; Parsons et. al., 2012]. To test the significance of these results, we employ the statistics of nonparametric bootstrap resampling. Suppose f T * and b mix * indicate a size N boot vector of bootstrap resamplings (with replacement) of the original data, R-ratios and M 2 s, respectively. Then, for some arbitrarily large value of N boot, the distribution of b T can be approximated by b T *, where each b Ti * is inverted from equation (3) as before. Furthermore, we partition the Northern California-Nevada data set into four discrete amplitude bins, each with 500,000 bootstrap iterations, to make four independent estimates of the distribution of b T (Figure 3). The individual curves are then combined to generate a marginal distribution of b T (Figure 4). The resulting values confirm that the magnitude distribution of the remotely triggered earthquakes is indistinguishable from the untriggered population. We also use a synthetic catalog of magnitudes to test whether our analysis is sensitive enough to resolve when b T is genuinely different from b U. We produce the synthetics mimicking the data in each amplitude range of Figure 3 by using a Monte Carlo simulation to produce 226 magnitudes following a Gutenberg-Richter distribution with a prescribed value of b T for the 10

227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 triggered fraction (f T ) of the group and a distinct value of b U for the untriggered (1-f T ) fraction. The total number of events in each group matches the observed number and f T is set to match the observed mean value. We then invert the synthetics for b T. In order to reproduce the observational uncertainty of f T, for each group we draw an inferred value of f T from the observed empirical distribution of f T and then invert using a fixed value of b U and Eq. 3. This procedure is repeated a total of 500,000 times for each amplitude range of interest. Those individual curves for b T are then stacked and weighted to produce a marginal distribution of b T as was done in Figure 4. We find that for prescribed values of b T > 1.5, the distribution of synthetic triggered magnitudes is markedly different from the marginal distribution generated from observable data (Figure S4). This lends credence to the methodology s ability to resolve differences between b T and b U should they exist. Implications Our results indicate there are no discernible differences between the magnitude distributions of data sets that likely contain a high proportion of triggered events, versus data sets that do not contain high proportions of triggered events. The observed composite b values show no significant variation with respect to calculated peak dynamic strains (Figure 2). The small variations that do exist do not show a monotonic increase with the fraction of likely triggered events in the population, meaning that this conclusion is not likely to change with increased statistical resolution. Inverting for b T directly using a mixing model does not alter these conclusions (Figure 4). These interpretations are significantly different than that of Parsons and Velasco [2011] where it is suggested that dynamically triggered earthquakes are preferentially small. These studies differ in several fundamental ways. For instance, the statistical treatment here includes 11

250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 2074 examples of potential triggers, including some very weak triggers, and deals explicitly with the fact that any observed group of earthquakes is a mixture of triggered and untriggered events via our mixing model. The previous work had only 205 examples of very strong triggering. While this paper s approach includes a larger number of cases, the potential exists for contamination from background seismicity. Secondly, the R-statistic uses an optimized, adaptive time window to measure rate changes and therefore is inherently more sensitive than the counting method of Parsons and Velasco. Finally, we focus on small earthquakes as diagnostic of the magnitude distribution; this contrasts with the approach of Parsons and Velasco where only 5 < M W < 7 events are examined. The advantage of this paper s approach is that larger numbers ensure statistical robustness since larger magnitude earthquakes are intrinsically rare and may not be expected to be observed during a short time interval. However, if larger magnitude earthquakes have distinct physical nucleation mechanisms our approach has no bearing on them. Thus far, there is little direct evidence of a break-down in earthquake self-similarity, therefore a separate process governing large earthquakes could be a relatively unlikely possibility, but one that must remain open nonetheless. Conclusions We find that the statistical evidence for fundamentally different underlying distributions between triggered and untriggered earthquakes is weak. This strongly supports the idea that the magnitudes of triggered and untriggered earthquakes are randomly drawn from a single parameter GR distribution, at least for moderate magnitudes. Therefore, remotely triggered earthquakes are likely to be as large as any other group of seismicity. However, since the total number of triggered events is small, the probability of observing a remotely triggered large earthquake is accordingly small. In summary, we have opted for large numbers in our statistical 12

273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 treatment and arrived at conclusions distinct from prior efforts. This suggests that the question of the magnitude distribution of dynamically triggered events is still an open problem. References Aki, K. (1965), Maximum likelihood estimate of b in the formula log N = a bm and its confidence limits, Bull. Earthquake Res. Inst. Tokyo Univ., 43, 237 239. Aki, K., and P. G. Richards (2002), Quantitative Seismology, 2nd ed., 700 pp., Univ. Sci. Books, Sausalito, Calif. Brodsky, E. E. (2006), Long-range triggered earthquakes that continue after the wave train passes, Geophys. Res. Lett., 33, L15313, doi:10.1029/2006gl026605. Efron, B and R.J. Tibshirani (1994), An Introduction to the Bootstrap, 436 pp., Monographs on Statistics and Applied Probability, CRC Press, Boca Raton, FL. Felzer, K. R., and E. E. Brodsky (2005), Testing the stress shadow hypothesis, J.Geophys. Res., 110, B05S09, doi:10.1029/2004jb003277. Felzer, K. R., T. W. Becker, R. E. Abercrombie, G. Ekstrom, and J. R. Rice (2002), Triggering of the 1999 MW 7.1 Hector Mine earthquake by aftershocks of the 1992 MW 7.3 Landers earthquake, J. Geophys. Res., 107 (B9), 2190, doi:10.1029/2001jb000911. Felzer, K. R., R. E. Abercrombie, and G. Ekstrom (2004), A common origin for aftershocks, foreshocks, and multiplets, Bull. Seismol. Soc. Am., 94, 88 98. Frohlich, C., and S. D. Davis (1993), Teleseismic b values; or much ado about 1.0, J. Geophys. Res., 98 (B1), 631 644, doi:10.1029/92jb01891. Gomberg, J., and D. Agnew (1996), The accuracy of seismic estimates of dynamic strains: An evaluation using strainmeter and seismometer data from Pinon Flat Observatory, California, Bull. Seismol. Soc. Am., 86, 212 220. 13

296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 Gutenberg, B., and C. F. Richter (1944), Frequency of earthquakes in California, Bull. Seismol. Soc. Am., 4, 185 188. Hill, D. P., et al. (1993), Seismicity remotely triggered by the magnitude 7.3 Landers, California, earthquake, Science, 260(5114), 1617 1623. Ishimoto, M., and K. Iida (1939), Observations of earthquakes registered with the microseismograph constructed recently, Bull. Earthquake Res. Inst. Univ. Tokyo, 17, 443 478. King, G. C. P., R. S. Stein, and J. Lin (1994), Static stress changes and the triggering of earthquakes, Bull. Seismol. Soc. Am., 84, 935 953. Lay, T., and T. C. Wallace (1995), Modern Global Seismology, Academic, San Diego, Calif. Marsan, D., and O. Lengliné (2008), Extending earthquakes' reach through cascading, Science, 319, 1076 1079. Parsons, T., and A. A. Velasco (2011), Absence of remotely triggered large earthquakes beyond the mainshock region, Nature Geoscience, 4, 321 316, doi:10.1038/ngeo1110. Parsons, T., J. O. Kaven, A. A. Velasco, and H. Gonzalez-Huizar (2012), Unraveling the apparent magnitude threshold of remote earthquake triggering using full wavefield surface wave simulation, Geochem. Geophys. Geosyst., 13, Q06016, doi:10.1029/2012gc004164. van der Elst, N. J., and E. E. Brodsky (2010), Connecting near-field and far-field earthquake triggering to dynamic strain, J. Geophys. Res., 115, B07311, doi:10.1029/2009jb006681. Velasco, A. A., S. Hernandez, T. Parsons, and K. Pankow (2008), Global ubiquity of dynamic earthquake triggering, Nature Geoscience, 1, 375 379. 14

318 319 320 Utsu, T. (1965). A method for determining the value of b in the formula log n = a - bm showing the magnitude-frequency relation for earthquakes, Geophys. Bull. Hokkaido Univ., 13, 99 103 (in Japanese with English summary). Table 1: Data summary for largest amplitude bin (ε > 1x10-6 ) Northern California- Nevada, N=314 Southern California- Nevada, N=141 Full ANSS, N=455 b mix 1.26 1.32 1.28 b U 1.11 1.06 1.08 δλ 0.34-0.11 0.18 f T 0.25 Undef. a 0.15 b T 2.03 Undef a. 273.99 b a: The mixing model requires that b T be a positive quantity which is not possible if an overall negative rate change is observed. b: The model outputs an unrealistic large value of b T due to the low overall fraction of triggered events and the large value of b mix relative to b U. For this reason, we focus on the Northern California data with a large triggering signal. 321 322 323 324 325 326 327 328 Figure 1: Schematic cartoon of hypothetical synthetic data from a single grid node and demonstrating the procedure for measuring R ratios. The vertical black dashed line is the approximate arrival time of seismic energy from one far-field event, i.e., test trigger. For each test trigger, t 1, t 2, and M 2 are the primary variables used to assess triggering intensity (δλ) and magnitude distribution of triggered earthquakes over a broad region. In this hypothetical example, a symmetric window of ±1 day (24 hours) is imposed. In practice, we impose a symmetric window of ±2 years. 15

329 330 331 332 333 334 335 336 337 338 339 340 341 342 Figure 2: b-value for the Northern California-Nevada target catalog as a function of peak dynamic strain and for a fixed symmetric window of ±2 years to correct for finite catalog effects. Each decade is evenly binned into thirds. The b-value of the M 2 s, a proxy for the mean magnitude of the dataset, is consistently lower (corresponding to higher mean magnitude) than the b value for the M 1 s, due to a sampling bias (Appendix C). The variances for the estimates associated with the largest strain are largest because of the small sample size. Figure 3: Distributions of b T as inverted from distinct amplitude (strain) bins. Data come from the Northern California-Nevada data set. Four disjoint datasets with amplitudes ranges i)10-7 - 2.15x10-7 (N=4780), ii.) 2.15x10-7 - 4.64x10-7 (N=1974), iii.) 4.64x10-7 - 10-6 (N=905), iv.) >10-6 (N=314) are constructed. We treat each disjoint subset as independent measurements of f T and b mix. A distribution for b T is then estimated after generating 500,000 bootstraps of the data. The data are smoothed via Gaussian kernel density estimation. Figure 4: The distributions of b U and the marginal distribution of b T, generated via bootstrap resampling The distribution of b T is a weighted sum of densities generated from individual 343 344 345 346 347 348 amplitude bins (Figure 3), i.e. Density stack = N amplitudes n f i Ti N Density i where N amplitudes is the number i=1 of amplitudes bins, n i is the number of measurements in bin i, f T is the mean estimate of the fraction triggered in bin i, Density i is the probability density estimate of b T using data from bin i and N is the total number of measurements. The 90% confidence interval (vertical red dashed lines) spans a wide margin and the distribution for b T (blue curve) cannot be distinguished from our preferred distribution for b U (green curve). b T is heavy-tailed toward large values of b T. 16

4 3.5 Untriggered Triggered FIGURE 1 Magnitude 3 2.5 2 Magnitude 4 3.5 3 2.5 20 15 10 5 0 5 10 15 20 Hours since shaking Untriggered Triggered M1 M2 2 t1 t2 2 1.5 1 0.5 0 0.5 1 1.5 2 Hours since shaking

b mix value 1.5 1.45 1.4 1.35 1.3 1.25 1.2 1.15 1.1 1.05 1 FIGURE 2 M 1 M 2 10 9 10 8 10 7 10 6 Peak Dynamic Strain

Density 2 FIGURE 3 1.8 10 07 to 2.15x10 07 1.6 1.4 1.2 4.64x10 07 to 10 06 1 2.15x10 07 to 4.64x10 07 0.8 0.6 0.4 > 10 06 0.2 0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 b T

Density 1.5 FIGURE 4 100 1 50 0.5 0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0 b value

1 2 3 4 5 Supplemental Materials Appendix A: Mixing Model Suppose a sequence of earthquake magnitudes, M mix i, exists such that it is composed entirely of either triggered, M T j, or untriggered, M U k, events. The sum of magnitudes of the mixed (composite) catalog can be expressed as 6 7 8 n tot n T n U mix T U M i = M j + M k i j k (A1) 9 10 where n T and n U are the total number of triggered and untriggered observations, and n tot = n T + n U. Equation (A1) can be recast as a weighted sum of the means of the individual components 11 12 13 M mix = f T M T + (1 f T ) M U (A2) 14 15 where f T = n T /n tot. Finally, since the Aki-Utsu equation (equation (2) of main text) relates the b- value of a given dataset to the mean magnitude of that dataset, it is trivial to show that 16 17 18 1 b mix = f T 1 b T + (1 f T ) 1 b U (A3) 19 20 In this formulation, we are assuming that the minimum magnitude of completeness is equivalent for both subcatalogs, M T and M U. Equation (A3) is the same as our mixing model equation in the 1

21 22 23 24 25 26 27 28 29 30 main text. Application of a maximum likelihood methodology yields similar results [Kijko and Smit, 2012; Appendix A]. Appendix B: Estimating f T Inverting for b T from the mixing model equation derived in Appendix A requires an estimate of the fraction of the data attributed to triggered earthquakes, or f T. For a given source volume with a distribution of faults, we model the seismicity of the volume as a Poisson process with an average intensity parameter λ. Dynamic strains traveling through a source volume can activate faults and induce a step change in the intensity parameter from λ 1 to λ 2. For a stepwise homogenous Poisson process, we define the fractional rate change induced by some far-field trigger as 31 32 33 δλ = λ 2 λ 1 λ 1 = λ 2 λ 1 1 (B1) 34 35 36 Over one pretrigger recurrence interval (time τ = λ 1-1 ), the expected number of earthquakes in the posttrigger aftermath is n after = δλ+1. The expected number of earthquakes in the time interval of length τ prior to the putative trigger is n before = λ 1 λ 1-1 =1. We can therefore define f T as 37 38 f T = n T n tot = n tot n U n tot = n after n before n after = (δλ +1) 1 δλ +1 = δλ δλ +1 (B2) 39 40 41 42 The mean of a sample of R-ratios is used to estimate δλ (via equation (6) of the main text) and f T. Appendix C: Sampling Biases Aftershock Shielding Effect 2

43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 Throughout the entire range of strains, the M 1 population shows a consistently higher b- value (corresponding to a lower mean magnitude) than its equivalent population of M 2 magnitudes (Figure 2). Multiple strategies for correcting the finite catalog effect do not alter this general observation (Figure S1). This shielding effect likely stems from the increase in rate during aftershock sequences. Figure S3 shows a schematic diagram demonstrating the origin of the systematic offset of magnitudes between the M 1 s and M 2 s. The magnitudes of strong local mainshocks are generally higher than the bulk of the seismicity immediately preceding these mainshocks. Consistent with Omori s Law, the time to the next earthquake after a mainshock will, on average, be shorter than the time to the first earthquake that preceded it. As dynamic waves from far-field ruptures periodically perturb local source volumes, they may arrive in the vicinity of the timing of some local mainshock. If so, these far-field dynamic waves are accordingly less likely to fall within the interval between the timing of the mainshock and the mainshock s first aftershock, versus arriving within the interval between the timing of the mainshock and the first earthquake that precedes it; a large magnitude, local mainshock is systematically biased to be labeled an `M 2 versus being designated as an `M 1. In other words, the large mainshock is shielded from being labeled an M 1 by ensuing aftershocks. Foreshock Shielding Effect The presence of foreshocks induces a similar shadowing effect. Figure 2 in the main text shows that the b-values of the M 2 s (~1.1) are systematically larger than the b-value of the overall catalog (~1.0). Since foreshock sequences follow an Inverse Omori Law, as far-field triggers approach the time of a local mainshock, they will be immersed in the rate increases (in an average sense) preceding the local mainshock. As the forehocks are less abundant than the 3

65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 aftershocks, the shielding effect is less severe and therefore the M 2 s do not show a significant perturbation from catalog averaged values. ETAS Simulation We can reproduce both of these shielding effects in an epidemic-type aftershock model (ETAS) [Ogata, 1998]. We produced a synthetic catalog following the procedure of Brodsky [2011] and measured the magnitude distribution of the events prior and after randomly selected times. We found for 100 simulations that the M 1 b-value = 1.22±0.04 and the M 2 b-value = 1.0±0.03, with 1 standard deviation reported. As discussed in Brodsky [2011], a standard ETAS simulation has too few foreshocks (relative to aftershocks) compared to the observations. This disparity is due to either completeness problems or a physical propensity for foreshocks. The overabundance of foreshocks is an interesting issue in itself, however, here we are only concerned with its effect on the sampling of magnitudes. Therefore, the standard ETAS simulation only reproduced the aftershock shielding and did not explain the deviation of the M 1 s in the observations. We therefore performed an additional set of modified ETAS simulations where 25% of the aftershocks were randomly assigned to occur as foreshocks. In this case, the observed aftershock to foreshock ratio was close to the catalog values (~2) and the observed M 1 b-value = 1.19±0.03 (1 std.) and the M 2 b-value = 1.10±0.04 (1 std.), as required by the catalog. Therefore, we conclude that all of the trends visible in Figure 2 of the main text are explained as selection issues and there is no observable distinction between triggered and non-triggered groups. 85 4

Table S1: Data summary for Northern California-Nevada, with different amplitude bins 1x10-7 < ε < 2.15x10-6 N=4780 2.15x10-7 < ε < 4.64x10-6 N=1974 4.64x10-7 < ε < 1x10-6 N=905 ε < 1x10-6 N=314 b mix 1.10 1.16 1.14 1.26 b U 1.11 1.11 1.11 1.11 f T 0.06 0.13 0.15 0.25 b T 0.98 1.65 1.27 2.03 86 87 88 89 90 Auxiliary References Brodsky, E. E. (2011), The spatial density of foreshocks, Geophys. Res. Lett., 38, L10305, doi:10.1029/2011gl047253. Kijko, A. and A. Smit (2012), Extension of the Aki-Utsu b-value estimator for incomplete catalogs, Bull. Seismol. Soc. Am., 102, 1283 1287. 91 92 93 94 95 96 97 98 99 100 101 Ogata, Y. (1998), Space-time point-process models for earthquake occurrences, Ann. Inst. Stat. Math., 50 (2), 379 402. Figure S1: b-value for the Northern California-Nevada target catalog as a function of peak dynamic strain. Each decade is evenly binned into thirds. The y-axes are the same for each subplot. (a), b-values from magnitudes (M 1 and M 2 ) corresponding to R-ratios calculated with a ±1 year symmetric window; (b), as in (a), corresponding to 2-year symmetric windows; (c), as in (a), corresponding to 4-year symmetric windows; (d), as in (a), corresponding to symmetric windows of variable length. In variable symmetry, window lengths will depend on the trigger time s distance from the beginning or end of the catalog, i.e., ± min{t1, T2}, where T 1 = t 0 - t BEG (t BEG = 01/01/1984), and T 2 = t END - t 0 (t END = 12/31/2010). For example, if a far-field stress arrives at a particular spatial grid at t 0 =January 30, 1984, then min{t1,t2} = 30 days. In order 5

102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 to maintain symmetry and to remove the finite catalog bias, all seismicity not within 30 days of t 0 is ignored. In all cases, the b-value of the M 2 s, a proxy for the mean magnitude of the dataset, is consistently lower (corresponding to higher mean magnitude) than the b-values for the M 1 s. These results are consistent even when applying symmetric windows of differing lengths. Figure S2: Triggering intensity (rate change) for the Northern California-Nevada (Lat. 37 ) target catalog as a function of peak dynamic strain. Each decade is evenly binned into thirds. Triggering intensity calculated from R ratios ± 1, 2, and 4 years from the arrival of the putative trigger as well as variable time windows as explained in the caption of Figure S1. Gaps correspond to negative values that have been excluded after a logarithmic transformation was applied. Figure S3: Schematic diagram for the origin of the systematic offset of magnitudes between the M 1 s and M 2 s. Teleseismic waves arriving in our target catalog are represented by black arrows. For each far field event, a unique R-ratio is calculated and an M 1 and M 2 within the local catalog are identified. When a local mainshock occurs, the time to the next event is, on average, substantially reduced as aftershocks tend to cluster in time and space. It is therefore unlikely that teleseismic surface waves will arrive in the time interval between the local mainshock and its first local aftershock. Therefore, the large local mainshocks in a target catalog are rarely designated as M 1 s, and almost always designated M 2 s. Hence, the systematic offset in magnitudes between the two datasets. We call this the aftershock shielding effect. Figure S4: Marginal distributions of b T using synthetically derived magnitude sequences. Four tests were conducted, each assuming a different value of b T : b T =1.25, b T =1.5, b T =1.75, b T =2.0 (peach, pink, teal, and black). Each curve is the sum of 4 amplitude bins as in figure 3 in the main text. Within each amplitude bin, the requisite number of triggered earthquake are 6

125 126 127 128 129 generated, then contaminated with magnitudes drawn from a distribution with a different b-value (b U = 1.11). This process is repeated 500,000 times and a density estimate is generated using a Gaussian kernel. The resulting 4 density estimates for b T are then stacked to give a final curve. The only free parameter is the input b T. Thick blue curve are results from observable data within the Northern California-Nevada region. 130 7

1 year symmetric window, b M Vs. Strain FIGURE S21 1.6 1.8 2 year symmetric window, b M Vs. Strain 1.4 1.6 1.2 1 0.8 0.6 0.4 M1 M2 1.4 1.2 1 0.8 0.6 0.4 0.2 10 9 8 7 6 5 log (Peak Dynamic Strain) 10 0.2 10 9 8 7 6 5 log (Peak Dynamic Strain) 10 b value + 95% C.I. b value + 95% C.I. M1 M2 4 year symmetric window, b Vs. Strain Variable symmetric window, b Vs. Strain M M 1.8 1.4 1.6 1.3 1.4 1.2 1 0.8 0.6 M1 M2 1.2 1.1 1 0.9 0.8 0.7 0.4 10 9 8 7 6 5 log 10 (Peak Dynamic Strain) 10 9 8 7 6 5 log 10 (Peak Dynamic Strain) b value + 95% C.I. b value + 95% C.I. M1 M2

Peak Dynamic Strain 10 3 10 9 10 8 10 7 10 6 Fractional Rate Change 10 2 10 1 1 year Symmetric Window 2 year Symmetric Window 4 year Symmetric Window Variable Symmetric Window 10 0 FIGURE S2

Magntiude M1 M2 Far-field strain arrives FIGURE S3 M1 Time M2 Large local event falls into M2 population multiple times... Never into the M1 population M1 M2 Robust aftershock sequence, elevated seismicity rates }

Density FIGURE S4 1.4 1.2 b T = 1.25 b T = 1.50 1 0.8 b = 1.75 T b T = 2.00 Observation 0.6 0.4 0.2 0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 b value