Detailed AFM Force Spectroscopy of the Interaction between CD44-IgG-Fusion Protein and Hyaluronan

Electronic Supplementary Material to: Detailed AFM Force Spectroscopy of the Interaction between CD44-IgG-Fusion Protein and Hyaluronan

Aernout A. Martens a, Marcel Bus a, Peter C. Thüne b, Tjerk H. Oosterkamp c, Louis C.P.M. de Smet *,a

a) Department of Chemical Engineering, Delft University of Technology, Julianalaan 136, 2628BL Delft, The Netherlands
b) Department of Chemical Engineering and Chemistry, Catalysis & Energy, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
c) Kamerlingh Onnes Laboratory, Leiden University, Niels Bohrweg 2, 2333 CA, Leiden, The Netherlands

Current address: Fontys Hogescholen, Rachelsmolen 1, 5612 MA Eindhoven, The Netherlands.

* Corresponding author. Tel.: +31 152782636; fax: +31 152788668. E-mail address: l.c.p.m.desmet@tudelft.nl (L.C.P.M. de Smet).

A) In-situ Modification of AFM Probes

Figure S1. Schematic representation of the arrangement of dish a (a cap of a 1 ml Eppendorf tube) and Petri dishes b and c for the modification of the AFM probe (left), which makes it possible to move a freshly prepared AFM probe to the HA-coated surface without drying it (right). Labels in the drawing: AFM head, AFM pedestal base, pipette, HA-coated Si, and dishes a, b and c.

B) XPS Data of Coating Steps

As can be seen in the XPS data (Fig. S2a), the clean silicon wafer shows mainly silicon and silicon oxide bonds, with some adsorbed carbon and carboxyl groups. After APTES coating (Fig. S2b), the amount of C-C bonds increases, as does the amount of nitrogen bonds, in a ratio corresponding to APTES, which contains an aliphatic carbon chain ending in a primary amine. The silicon signal decreases, consistent with screening of the silica surface by the APTES layer. After hyaluronan immobilization (Fig. S2c and d), more C-O bonds are detected, as expected for glycosaminoglycans. C-C bonds and nitrogen bonds, which are also present in hyaluronan, are detected as well. The XPS measurements for rooster comb hyaluronan are comparable to those for Streptococcus hyaluronan.

Figure S2. Summarized XPS data of the silicon wafer sample after the successive coating steps: (a) clean wafer, (b) APTES coating, (c) rooster comb HA coating, and (d) Streptococcus HA coating. The different atomic bonds are shown on the abscissa and the relative amount of each on the ordinate, in arbitrary units.

C) Retract Curve Data Treatment and Selection of Single-rupture Events

Deflection as a function of Z-piezo position was obtained for a total of 800 approach and retract cycles. All data treatment was performed in MS Excel. The data treatment described below was applied to all retract curve data unless stated otherwise. It is explained using a 4 s approach-retract cycle as the example; differences between this general treatment (4 s) and that of other cycle times are discussed afterwards. The raw AFM output data was contained in three columns (Fig. S3): A, Z-piezo position (nm); B, deflection during approach (nA); and C, deflection during retraction (nA).

Figure S3. AFM output. Column A: Z-piezo position; column B: approach (forward) curve; column C: retract curve.

Figure S4. a) Screenshot of the MS Excel sheet used to remove the sinusoidal background from the AFM data. Column J contains the Z-piezo position (nm) and column K the deflection (nA). The fit parameters were defined in column M. The fitted curve was a superposition of a sinusoidal part and a linear part. For the sinusoid, the amplitude was set in M7, the wavelength in M8 and the phase in M9. For the linear part, the slope was set in M11 and the offset in M12. Portions of the retract curve without events were picked by eye, and the background curve was fitted to this event-free data using the Excel Solver function. The Solver varied all fit parameters until the sum (M14) of the squared differences (column O) over the selected portions was minimal. b) Retract curve (brown) and the fitted background curve (blue). c) Retract curve after subtraction of the fitted background curve (data in columns R: Z-position, and S: deflection). d) Enlargement of Fig. S4c: after rescaling the deflection axis, single rupture events can be observed at 157 nm, 251 nm, 517 nm and 770 nm. This figure concerns an approach and retract cycle of 4 s.
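The background model and the quantity minimized by the Solver can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the caption names the five parameters but not the exact functional form, so the argument of the sine (2πz/wavelength + phase) is assumed, and all function names and the synthetic numbers are hypothetical.

```python
import math

def background(z, amplitude, wavelength, phase, slope, offset):
    """Assumed model: a sinusoid plus a straight line at Z-piezo position z (nm)."""
    return amplitude * math.sin(2 * math.pi * z / wavelength + phase) + slope * z + offset

def sum_squared_residuals(params, z_values, deflections, event_free):
    """Quantity to minimize (cell M14): squared misfit over event-free points only."""
    return sum(
        (d - background(z, *params)) ** 2
        for z, d, keep in zip(z_values, deflections, event_free)
        if keep
    )

def subtract_background(params, z_values, deflections):
    """Background-corrected deflection (column S)."""
    return [d - background(z, *params) for z, d in zip(z_values, deflections)]

# Synthetic example (made-up numbers): a pure background with one "event" added.
true_params = (0.05, 300.0, 0.4, -1e-4, 0.02)
z_values = [float(i) for i in range(0, 800, 5)]
deflections = [background(z, *true_params) for z in z_values]
event_free = [True] * len(z_values)

deflections[40] -= 0.03   # a dip that should not influence the fit
event_free[40] = False    # so it is excluded from the objective

misfit = sum_squared_residuals(true_params, z_values, deflections, event_free)
corrected = subtract_background(true_params, z_values, deflections)
```

Any general-purpose minimizer can take the place of the Excel Solver here; the point of the mask is that only hand-selected, event-free portions contribute to the fit, while the subtraction is applied to the whole curve.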

The raw data appeared to show a sinusoid (Fig. S4b), probably due to interference between the laser light reflected off the cantilever and that reflected off the mirror-like silica surface. Before any analysis of the minute rupture events could be done, this interference had to be fitted and subtracted. Removal of the sinusoid resulted in curves like Fig. S4c. Upon rescaling of the Y axis of Fig. S4c, single rupture events become apparent (Fig. S4d).

Figure S5. The data in column S was filtered for single rupture events. For the sake of illustration, data corresponding to Z-piezo positions 3.003 to 149.136 nm was omitted, as indicated by the downward pointing arrows, and only a portion of the data is shown, in which the single rupture event ending at 157 nm is selected. In each of the columns U to AC, data was treated, accumulated or filtered. If data did not pass a filter, the cell value was set to 0. Cells containing filtered data in columns V to AC were conditionally formatted as: if value > 0, then fill color = orange.

Every column from U to AC contains a filter designed to accumulate and select clean, unconvoluted single rupture events. We denote the value of the cell at position (column, row) by the column letter with the row as subscript. For example, the value of cell U7 is $U_7$ = 0.00220.

Description of column functions: all columns in Fig. S5 process data for n > 5, sometimes using threshold values stated in row 5 (yellow).

Column U: During retraction, the cantilever moves upward and downward. Only the upward differences in deflection (ΔDFL) are counted; the downward differences are filtered out and set to 0.
If $S_n > S_{n-1}$, then $U_n = S_n - S_{n-1}$, otherwise $U_n = 0$.
Written in Excel for cell U20 as:
=IF(S20>S19,S20-S19,0)

Column V: Because of the digital nature of the AFM data output, the smallest ΔDFL in the raw output data is 0.00153 nA. Larger ΔDFL were multiples of 0.00153 nA. A ΔDFL smaller than ½ × 0.00153 nA does not show up at all in the raw AFM output (two subsequent DFL data points in column K then have the same value). After fitting and subtracting the background, small ΔDFL < 0.00153 nA appeared at rows n where the raw data (column K) showed ΔDFL = 0. To correct for this, we applied a filter in column V that removes insignificant upward ΔDFL < $V_5$, with threshold value $V_5$ = 0.0005 nA. For n > 5:
If $U_n < V_5$, then $V_n = 0$, otherwise $V_n = U_n$.
Written in Excel for cell V20 as:
=IF(U20<$V$5,0,U20)

Column W: All subsequent ΔDFL are accumulated. If there is a gap in which ΔDFL = 0, the accumulation stops. In effect, whole, uninterrupted upward motions, starting from the lowest point and ending at the highest point, are accumulated.
If $V_n = 0$, then $W_n = 0$, otherwise $W_n = V_n + W_{n-1}$.
Written in Excel for cell W20 as:
=IF(V20=0,0,W19+V20)

Column X: As long as the total ΔDFL is still accumulating in column W, it is not passed on to column X. Column X selects only the last value of each accumulation in column W: the total ΔDFL (TΔDFL).
If $W_{n+1} = 0$, then $X_n = W_n$, otherwise $X_n = 0$.
Written in Excel for cell X20 as:
=IF(W21=0,W20,0)
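The chain of columns U to X can be mirrored in Python as a rough sketch. The function name and the example numbers are hypothetical; list indices are 0-based rather than Excel rows, and treating the value after the last data point as 0 (so that a step running to the end of the data is still reported) is an assumption, not something the description specifies.

```python
def accumulate_upward_steps(deflection, min_step=0.0005):
    """Return lists mirroring columns U, V, W and X for a background-corrected
    deflection trace (column S). min_step plays the role of threshold V5 (nA)."""
    n = len(deflection)
    u = [0.0] * n
    v = [0.0] * n
    w = [0.0] * n
    x = [0.0] * n
    for i in range(1, n):
        rise = deflection[i] - deflection[i - 1]
        u[i] = rise if rise > 0 else 0.0               # column U: upward changes only
        v[i] = 0.0 if u[i] < min_step else u[i]        # column V: drop insignificant rises
        w[i] = 0.0 if v[i] == 0 else w[i - 1] + v[i]   # column W: running total of a rise
    for i in range(n):
        next_w = w[i + 1] if i + 1 < n else 0.0        # assumption: nothing follows the data
        x[i] = w[i] if next_w == 0 else 0.0            # column X: keep the final total only
    return u, v, w, x

# Made-up trace: a two-point rise of 0.018 nA in total, followed by a tiny
# 0.0002 nA wiggle that falls below the 0.0005 nA threshold.
trace = [0.0, -0.010, -0.020, -0.014, -0.002, -0.003, -0.0028, -0.010]
u, v, w, x = accumulate_upward_steps(trace)
```

In this example only index 4 of `x` is non-zero: it carries the full height of the uninterrupted rise, while the intermediate value at index 3 and the sub-threshold wiggle at index 6 are suppressed.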

Column Y: Data from column X was rounded to 3 (= $Y_5$) decimal places. Instead of continuous data we now have binned, discrete data, useful for counting ruptures and making histograms. The bin size here is 0.001 nA of measured deflection.
Written in Excel for cell Y20 as:
=ROUND(X20,$Y$5)

Up to this point, the columns served to accumulate all upward TΔDFL. From here on, the columns serve to filter: to reject TΔDFL events that may be convoluted with each other or with other effects. Fig. S6 gives a graphical impression of the textual description of the filters.

Figure S6. Graphical example of the data filtration performed in columns Z, AA, AB and AC; TΔDFL events are accepted or rejected according to the text.
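As an illustration of the binning step, the rounding of column Y and the subsequent counting can be sketched in Python. The names and the example values are hypothetical. Note one difference in behavior: Excel's ROUND rounds exact halves away from zero, whereas Python's built-in `round` rounds them to the nearest even digit, so values lying exactly on a bin edge may land in different bins.

```python
from collections import Counter

def bin_steps(total_steps, digits=3):
    """Column Y: round each total step to `digits` decimals (bin size 0.001 nA for 3)."""
    return [round(step, digits) for step in total_steps]

def step_histogram(binned_steps):
    """Count how often each non-zero binned step height occurs."""
    return Counter(step for step in binned_steps if step > 0)

# Made-up values from a column X: zeros mark rows without a completed step.
column_x = [0.0, 0.0184, 0.0, 0.0179, 0.0062]
column_y = bin_steps(column_x)
histogram = step_histogram(column_y)
most_common_step, count = histogram.most_common(1)[0]
```

The most frequent bin of such a histogram is what a "most probable" step height refers to once the later filters have removed convoluted events.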

Column Z: Sustained TΔDFL filter. The average deflection levels before and after a counted TΔDFL were compared. If the TΔDFL was not sustained, it was not counted as a rupture event. For n > 5:
If average($S_{n-4}, S_{n-3}, S_{n-2}$) < average($S_{n+2}, S_{n+3}, S_{n+4}$) - $Z_5$, then $Z_n = Y_n$, otherwise $Z_n = 0$.
Written in Excel for cell Z20 as:
=IF((S17+S16+S18)<((S23+S22+S24)-3*$Z$5),Y20,0)
Here the threshold value $Z_5$ = 0.006 nA is the minimal step size that has to be sustained between the two averages for the TΔDFL to be counted as a rupture event.

Column AA: Very small TΔDFL can still satisfy the criterion in column Z if they lie right next to a large TΔDFL that was accepted. In column AA we filter out these small TΔDFL by comparing each TΔDFL to its direct neighbors and selecting only the largest one (the true reason why the criterion in column Z was satisfied).
If $Z_{n-6} + Z_{n-5} + Z_{n-4} + Z_{n-3} + Z_{n-2} + Z_{n-1} + Z_{n+1} + Z_{n+2} + Z_{n+3} + Z_{n+4} + Z_{n+5} + Z_{n+6} > Z_n$, then $AA_n = 0$, otherwise $AA_n = Z_n$.
Written in Excel for cell AA20 as:
=IF((SUM(Z14:Z19)+SUM(Z21:Z26))>Z20,0,Z20)
That is, if any larger TΔDFL is nearby, the smaller one is not counted.
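The two filters of columns Z and AA can be sketched in Python as follows. Function names, the `window` and `reach` parameters and the example data are hypothetical; indices are 0-based, and skipping points too close to either end of the trace is an assumption about edge handling. With `window=3` the comparison matches the three-point averages above; the source also describes an 11-point variant for the 2 s cycles, which would correspond to `window=11`.

```python
def mean(values):
    return sum(values) / len(values)

def sustained_step_filter(s, y, min_sustained=0.006, window=3):
    """Column Z: keep y[n] only if the level after the step exceeds the level
    before it by more than min_sustained (threshold Z5, in nA)."""
    z = [0.0] * len(s)
    for n in range(len(s)):
        lo, hi = n - 1 - window, n + 2 + window
        if lo < 0 or hi > len(s):
            continue                      # assumption: leave edge points at 0
        before = mean(s[lo:n - 1])        # S[n-1-window] .. S[n-2]
        after = mean(s[n + 2:hi])         # S[n+2] .. S[n+1+window]
        if before < after - min_sustained:
            z[n] = y[n]
    return z

def largest_neighbour_filter(z, reach=6):
    """Column AA: reject a step if the steps within `reach` points on either
    side add up to more than the step itself."""
    aa = [0.0] * len(z)
    for n in range(len(z)):
        neighbours = sum(z[max(0, n - reach):n]) + sum(z[n + 1:n + 1 + reach])
        aa[n] = 0.0 if neighbours > z[n] else z[n]
    return aa

# Made-up example: one sustained 0.030 nA step at index 10, plus a spurious
# 0.001 nA step two points later that rides on the same level change.
s = [-0.030] * 10 + [0.0] * 10
y = [0.0] * 20
y[10] = 0.030
y[12] = 0.001
z = sustained_step_filter(s, y)
aa = largest_neighbour_filter(z)
```

Here both steps pass the sustained-level test, because the small one sits on the shoulder of the large one, and the neighbor comparison then removes the small step while keeping the large one.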

Column AB: If a rupture event does not return the cantilever to its equilibrium deflection, not all force is released upon rupture. This means that other bonds to the surface may take over part of the force that was carried by the ruptured bond, so the measured rupture force may be smaller than the force actually carried by that bond. This filter therefore selects only the rupture events that bring the cantilever deflection (average DFL after rupture) to within a threshold value ($AB_5$) of the equilibrium position.
If average($S_{n+2}, S_{n+3}, S_{n+4}$) > $AB_5$, then $AB_n = AA_n$, otherwise $AB_n = 0$.
Written in Excel for cell AB20 as:
=IF(((S22+S23+S24)/3)>$AB$5,AA20,0)

Column AC: The slope before and after the rupture event must be negative or at most horizontal. This means that the average of the data points before the rupture is higher than the lowest (first) point of the rupture, and that the average after the rupture is lower than the highest (last) point of the rupture. Thus, the difference between the averages before and after the rupture is smaller than TΔDFL. The filter can also be described differently: if $AC_5$ times the sustained difference is larger than TΔDFL, the event is rejected.
If $AB_n < AC_5 \times [\text{average}(S_{n+2}, S_{n+3}, S_{n+4}) - \text{average}(S_{n-2}, S_{n-3}, S_{n-4})]$, then $AC_n = 0$, otherwise $AC_n = AB_n$.
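A minimal Python sketch of the column AB and AC criteria is given below. The function names are hypothetical, and so are the threshold values used in the example (-0.010 nA for $AB_5$ and 1.0 for $AC_5$): they are chosen only to make the example run and are not the values used in the study. Indices are 0-based, and points too close to the ends of the trace are left at 0 by assumption.

```python
def mean(values):
    return sum(values) / len(values)

def return_to_baseline_filter(s, aa, baseline_threshold):
    """Column AB: keep a step only if the average level after it lies above
    baseline_threshold (AB5), i.e. close enough to the equilibrium deflection."""
    ab = [0.0] * len(s)
    for n in range(len(s)):
        if n + 5 > len(s):
            continue                          # assumption: leave edge points at 0
        after = mean(s[n + 2:n + 5])          # S[n+2], S[n+3], S[n+4]
        ab[n] = aa[n] if after > baseline_threshold else 0.0
    return ab

def slope_filter(s, ab, factor):
    """Column AC: reject a step that is smaller than factor (AC5) times the
    difference between the average levels after and before it."""
    ac = [0.0] * len(s)
    for n in range(len(s)):
        if n - 4 < 0 or n + 5 > len(s):
            continue                          # assumption: leave edge points at 0
        before = mean(s[n - 4:n - 1])         # S[n-4], S[n-3], S[n-2]
        after = mean(s[n + 2:n + 5])          # S[n+2], S[n+3], S[n+4]
        ac[n] = 0.0 if ab[n] < factor * (after - before) else ab[n]
    return ac

# Made-up trace with a downward slope before and after a 0.030 nA step at index 6.
s = [-0.020, -0.022, -0.024, -0.026, -0.028, -0.030,
     0.000, -0.002, -0.004, -0.006, -0.008, -0.010]
aa = [0.0] * len(s)
aa[6] = 0.030

ab = return_to_baseline_filter(s, aa, baseline_threshold=-0.010)  # hypothetical AB5
ac = slope_filter(s, ab, factor=1.0)                              # hypothetical AC5
```

With negative slopes on both sides, the averaged level change (0.020 nA here) is smaller than the step itself (0.030 nA), so the event is kept; a larger factor or a rising background around the step would make the averaged change exceed the step and cause rejection.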

Here $AC_5$ is the factor between TΔDFL and the difference in averages.
Written in Excel for cell AC20 as:
=IF(3*AB20<$AC$5*((S22+S23+S24)-(S16+S17+S18)),0,AB20)

At this point, only non-convoluted TΔDFL rupture events have been selected, and they are listed in column AC. The same approach was used to analyze the data for all of the different approach and retract rates, leading to a collection of rupture events at different loading rates.

The above description of the data treatment concerns a 4 s approach and retract cycle. However, approach-retract cycles were run in 2, 4, 8, 16, 32 and 60 s (the retract part alone was recorded in half those times). The data from measurements at the other rates were filtered similarly, with some minor differences. As can be seen in Table S1, the threshold values $Z_5$ and $AC_5$ were slightly different. Also, filter AC was only applied to cycle times of 2 and 4 s. For 8 to 60 s, the filtering up to column AB was stringent enough.

Table S1. Threshold values for filters used for the different measurement rates.

One more difference, which is not obvious from Table S1, is that for the cycles of 2 s the Z filter averaged 11 data points before and after the event instead of three: average (n-12 to n-2) and average (n+2 to n+12) were compared to see whether the step size persisted. This was necessary because the signal-to-noise ratio of the AFM data output was worse at higher retract speeds. The reason is that the AFM measures the position of the cantilever at a fixed frequency. Therefore, at high retract rates, fewer cantilever position readings are available to average for each data point.
Written in Excel for cell Z20 as:
=IF(AVERAGE(S8:S18)<(AVERAGE(S22:S32)-1*$Z$5),Y20,0)
For cycles of 1 s (not shown), rupture events could not be properly distinguished from noise automatically.

We considered three methods of analyzing this collection: 1) pairing each individual rupture event with its own loading rate (derived from the slope just before rupture) and making a scatter plot; 2) grouping the data for each separate retract rate, making a histogram of rupture force counts and plotting the most probable rupture force against the loading rate determined from the retract rate and the cantilever spring constant; and 3) similarly, plotting the most probable rupture force against the average loading rate within the group, determined by averaging the mini slopes just before rupture.

In the scatter plot according to method 1, we found a cloud and not a clear trend. We think the reason is that, after fitting and subtracting the sinusoidal background, residual fluctuations were still present in the data. These do not influence the height of the ruptures that are counted, but they do influence the mini slopes just before rupture, making them individually unreliable. Still, the sinusoid-fitting procedure should lead to a symmetric distribution of errors, so that the average mini slope before rupture is still reliable. Therefore we considered plotting the most probable rupture force per group (data grouped according to retract rate) against a loading rate based on the retract rate and the cantilever spring constant (method 2), as well as against the loading rate according to the average mini slope just before rupture (method 3). Both were plotted in the article (Fig. 4).

D) Error Analysis

We estimate that the error in the spring constant makes the largest contribution to the overall error. Based on the number of significant figures in the spring constant, we calculated this error to be (0.0005/0.013 =) 3.85%. It should be noted that this error affects both the loading rate and the rupture force. Hence it affects both the slope and the offset on the linear abscissa. From Fig. S6 it becomes clear that the errors in the slope ($f_\beta$) and the offset on the linear abscissa ($F_{r0}$) are also 3.85%. The values of $f_\beta$ and $F_{r0}$ are used to calculate the barrier height ($E_b$) via $E_b / kT = -\ln(F_{r0}) + \ln(f_\beta) + 22$. It must be noted that Evans reports an error of ±1 in the determination of the value of 22 (reference 17 of the paper). The calculated errors in $-\ln(F_{r0})$ and $\ln(f_\beta)$ were found to be ~0.04 for both the solid-circle and the open-circle data. Consequently, the error in $E_b$ is ±1.08 ≈ ±1. As only the slope is used to calculate the distance from the energy minimum to the barrier top ($x_\beta$), the error in $x_\beta$ amounts to ~3.85%.
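The arithmetic of this error estimate can be retraced with a short Python sketch. Two points are assumptions rather than statements from the text: the error of a logarithm is approximated to first order by the relative error of its argument, and the three contributions to the error in $E_b / kT$ are simply added, which is the combination that reproduces the quoted ±1.08. Variable names are hypothetical.

```python
spring_constant = 0.013        # value quoted in the text
spring_constant_err = 0.0005   # half a unit of its last significant digit

# Relative error of the spring constant, carried over to f_beta and F_r0.
relative_error = spring_constant_err / spring_constant

# First-order propagation (assumption): the error of ln(x) is about dx / x.
log_term_error = relative_error

# Error of the constant 22 as reported by Evans, per the text.
constant_error = 1.0

# Assumption: plain sum of the three contributions to E_b / kT.
barrier_error = constant_error + 2 * log_term_error

percent_error = round(100 * relative_error, 2)
barrier_error_rounded = round(barrier_error, 2)
```

The ±1 uncertainty of the constant dominates, which is why the final error in the barrier height rounds to ±1 regardless of how the two small logarithmic terms are combined.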

Figure S6. The solid and open circles correspond to the data of Fig. 4 in the manuscript. The solid lines are the corresponding trend lines, with their equations presented in bold. Four data series have been added to this plot, corresponding to the upper errors (diamonds) and lower errors (triangles) for a percentage error of 3.85%. The dashed lines represent the related trend lines, with their equations in regular text. From this analysis we calculated the error in the slopes of the plots also to be ~3.85% ≈ 4%. For the sake of clarity, the errors on the x-axis are not shown in this figure.