Supplementary Figure 1 The correlation of n-score cutoff and FDR in both CID-only and CID-ETD fragmentation strategies. A bar diagram of different n-score thresholds applied in the search, plotted against the corresponding FDRs. The actual dataset (shown as grey bars) is fitted to a power function (shown as blue lines). The red triangles indicate the natural log scaled n-score cutoffs at FDR 5% and 1%. These n-score cutoff values are calculated based on the fitted curve.
Supplementary Figure 2 Illustrative CID-only and CID-ETD-MS/MS spectra of an interpeptide cross-link. The MS2 product ion spectra are reconstructed based on the deconvoluted neutral monoisotopic masses. The four signature cross-linker-cleaved fragment ions are labeled as α S, α L, β S, and β L. Peptide fragment ions are labeled as b-/y-/c-/z- ion series.
Supplementary Figure 3 The long-distance interpeptide cross-link of eEF2. The cross-link is shown in magenta and the amino acid sequence between the two linked lysines is colored in blue.
Supplementary Figure 4 The schematic positioning of SERBP1 onto the rabbit 80S ribosome in complex with a 34-nt mRNA fragment (PDB 4UJE). Our cross-linking data on SERBP1 are in good agreement with the cryo-EM structure of the 80S ribosome bound to the mRNA fragment. mRNA is shown in blue. Previously reported mRNA-contacting proteins are shown in cyan (from left to right, RPS14, RPS26 and RPS30). SERBP1, RPS28, RPS3, RPS12 and RPS27A are shown in red, yellow, yellow-orange, orange, and magenta, respectively.
Supplementary Figure 5 A zoomed-in view of the five intraprotein cross-links of SERBP1. The five intraprotein cross-links of SERBP1 are K281-K299, K286-K299, K299-K303, K310-K321, and K320-K327. All five observed intraprotein cross-links are located at the C-terminus of the protein, which indicates that SERBP1 may possess a globular structure in this domain, as opposed to the extended conformation of the remaining region. The color scheme is the same as in Supplementary Figure 4.
Supplementary Figure 6 Illustrative CID and HCD MS2 spectra of an interpeptide cross-link. The MS2 product ion spectra are reconstructed based on the deconvoluted neutral monoisotopic masses. The four signature ions generated by the cross-linker cleavage are indicated in purple under both HCD and CID conditions.
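The cutoff determination described in the legend of Supplementary Figure 1 (fit the threshold-versus-FDR data to a power function, then read off the cutoffs at FDR 5% and 1%) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data points are synthetic, the function names are hypothetical, the power law is fitted by linear regression in log-log space (which may differ from the fitting procedure actually used), and the direction in which FDR changes with the threshold is made up for the example.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (ln x, ln y)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space = ln(a)
    return a, b

def cutoff_for_fdr(a, b, target_fdr):
    """Invert y = a * x**b to find the threshold x that gives target_fdr."""
    return (target_fdr / a) ** (1.0 / b)

# Synthetic (threshold, FDR) pairs generated from a known power law;
# these are NOT values from the paper.
xs = [2.0, 4.0, 6.0, 8.0, 10.0]
ys = [2.0 * x ** -1.5 for x in xs]

a, b = fit_power_law(xs, ys)
cutoff_5 = cutoff_for_fdr(a, b, 0.05)   # threshold at FDR 5%
cutoff_1 = cutoff_for_fdr(a, b, 0.01)   # threshold at FDR 1%
```

Reading the cutoffs from the fitted curve rather than from the nearest bar is what allows a threshold to be reported for an FDR value that falls between two tested thresholds.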
Supplementary Note 1. Comparison of XlinkX and MeroX software solutions

Although both XlinkX and MeroX search for cross-links derived from MS-cleavable cross-linkers, their algorithms rest on completely different concepts, which results in major differences in applicability and performance. XlinkX obtains the precursor masses of both peptides connected by the cross-linker and, as a consequence, reduces the database size from n-square to 2n (where n, in proteomics, is exceedingly large). In contrast, MeroX uses the four signature peaks in a scoring algorithm to enhance cross-link identification. However, it still considers peptide combinations for cross-link precursor determination and therefore still uses an n-square database. To illustrate this point, we performed a direct comparison between XlinkX and MeroX, searching against databases of different sizes. As shown in Supplementary Table 1, the searching time of MeroX increases steeply with database size owing to the n-square problem, whereas the searching time of XlinkX increases only marginally, in a linear fashion. Because XlinkX reduces the database size from n-square to 2n, it enables direct cross-link identification from human cell lysate. When a small database is used, XlinkX performs similarly to other search engines, including MeroX; the major difference is its capability to search directly against large databases.

Supplementary Note 2. Comparison of CID and HCD fragmentation strategies in generating the cross-linker-cleaved signature ions

DSSO was initially described as a CID-cleavable cross-linker, in which the C-S bonds within the cross-linker spacer arm preferentially cleave under lower-energy CID fragmentation. Since HCD is essentially beam-type CID, we hypothesized that HCD might cleave the C-S bonds of the DSSO cross-linker in a similar way to CID.
We investigated the difference between CID and HCD in generating the four cross-linker-cleaved signature ions by performing LC/MS2 experiments on DSSO cross-linked BSA (bovine serum albumin) under a variety of fragmentation conditions: CID (with normalized collision energies of 20%, 25%, 30%, and 35%) and HCD (with normalized collision energies of 20%, 25%, 30%, and 35%). As shown in Supplementary Figure 6, both CID and HCD are able to generate the four signature peaks, as well as a number of b-/y- ions. Notably, the relative intensities of these four peaks are higher in CID than in HCD. Next, we compared the number of cross-link identifications under the above-mentioned conditions. As shown in Supplementary Table 2, CID 35 and HCD 25 offered the two highest numbers of cross-linked peptide identifications. However, the numbers vary more across HCD energies (12 to 30 identifications) than across CID energies (19 to 28). We therefore decided to use CID 35 throughout our experiments.

Supplementary Table 1. Comparison of the software performance between XlinkX and MeroX

| Database size (# of proteins) | XlinkX searching time | MeroX searching time | XlinkX search space (# of peptides) | MeroX search space (# of peptides) |
|---|---|---|---|---|
| 1 | 1 min | 10 sec | 311 | 73,920 |
| 10 | 1 min 10 sec | 35 min | 5,445 | 41,209,581 |
| 50 | 1 min 20 sec | 2 hr 47 min | 11,528 | 275,784,355 |
| 100 | 1 min 40 sec | 10 hr 19 min | 18,353 | --- |
| 500 | 8 min | 10% finished after 2 days | 67,198 | --- |
| Human (40,000) | 2 hr | Not working | 5,397,443 | --- |

Supplementary Table 2. Comparison of CID and HCD fragmentation at varying collision energies

| Fragmentation type | CID 20 | CID 25 | CID 30 | CID 35 |
|---|---|---|---|---|
| # of cross-links | 19 | 22 | 27 | 28 |

| Fragmentation type | HCD 20 | HCD 25 | HCD 30 | HCD 35 |
|---|---|---|---|---|
| # of cross-links | 24 | 30 | 22 | 12 |
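The n-square growth discussed in Supplementary Note 1 can be made concrete with a little arithmetic on the MeroX search-space column of Supplementary Table 1. The sketch below assumes that an n-square search enumerates every unordered peptide pair, including a peptide paired with itself, which gives n(n+1)/2 candidates. Under that assumption the three reported MeroX values are exactly reproduced by whole-number peptide counts; those counts are inferred here and are not stated in the source, and the function names are hypothetical.

```python
import math

def pairwise_search_space(n_peptides: int) -> int:
    """Unordered peptide pairs, self-pairs included: n(n+1)/2."""
    return n_peptides * (n_peptides + 1) // 2

def linear_search_space(n_peptides: int) -> int:
    """The '2n' search size quoted in the note for a per-peptide search."""
    return 2 * n_peptides

def peptides_from_pairs(n_pairs: int) -> int:
    """Invert n(n+1)/2 = n_pairs; raise if n_pairs is not of that form."""
    root = math.isqrt(8 * n_pairs + 1)
    n = (root - 1) // 2
    if pairwise_search_space(n) != n_pairs:
        raise ValueError("not a value of n(n+1)/2")
    return n

# MeroX search-space values from Supplementary Table 1,
# keyed by database size (# of proteins).
reported_merox = {1: 73_920, 10: 41_209_581, 50: 275_784_355}

# Peptide counts inferred by inversion (an inference, not source data).
inferred_peptides = {size: peptides_from_pairs(space)
                     for size, space in reported_merox.items()}

# Ratio between the pairwise and the 2n search size for each inferred n.
growth_ratio = {size: pairwise_search_space(n) / linear_search_space(n)
                for size, n in inferred_peptides.items()}
```

Since n(n+1)/2 divided by 2n equals (n+1)/4, the gap between the two strategies itself grows linearly with the number of peptides, which is consistent with the pairwise search becoming impractical at the larger database sizes in the table.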