Electronic Supplementary Material (ESI) for Environmental Science: Water Research & Technology. This journal is The Royal Society of Chemistry 206 Table S. Summary of datasets used in this study No DOI number for paper Publication year Disinfectant residual type Country of sampling Sampling points Sequencing instrument Hypervariable region amplified Reference number Number of treatment plants Average % of matches in SILVA database 0.06/j.watres.205.2.00 205 Chlorinated UK POU Illumina Miseq V4 5 89.02% ± 3.79% 2 doi: 0.06/j.watres.204.0.049 204 Chlorinated UK DWDS Roche 454 FLX V-V3 28 3 0.06/j.watres.203..027 204 Chlorinated/Chloraminated USA POU Roche 454 FLX Titanium V-V2 32 5 7.5% ± 6.22% 79.69% ± 2.42% 4 0.06/j.ecoenv.204.07.029 204 Chlorinated China DWTP outlet, POU Roche 454 FLX Titanium V3-V4 37 * 6.6% ± 2.48% 5 0.28/AEM.0892-2 202 Chlorinated/Chloraminated USA DWDS Roche 454 FLX Titanium V4-V5 33 89.56% ± 3.88% 6 0.37/journal.pone.04087 205 Chlorinated/Chloraminated USA DWTP outlet, POU Illimina Miseq V4 34 7 0.02/acs.est.5b0352 205 Chlorinated China DWTP outlet, POU Roche 454 FLX Titanium V-V3 38 5 * 67.2% ± 8.08% 85.28% ± 8.29% 8 0.06/j.watres.203.0.07 203 Chlorinated China DWTP outlet Roche 454 FLX V-V3 0 96.92% 9 0.02/es5009467 204 No disinfectant residual Netherlands POU Roche 454 FLX Titanium V4-V6 26 ** 63.6% ± 4.24% 0 0.28/mBio.035-4 204 Chloraminated USA DWTP outlet, POU Roche 454 GS-FLX V4-V5 7 0./462-2920.2739 205 No disinfectant residual Netherlands DWDS Illumina MiSeq/Roche 454 GS-FLX 2 0.02/es502646d 204 Chlorinated/Chloraminated USA DWDS V4 and V5- V6 Illumina MiSeq V4 35 6 32** 69.98% ± 2.3% 45.22% ± 6.65% 68.93% ± 2.65% 3 0.007/s274-03-32-5 203 Chloraminated China DWTP outlet, POU Roche 454 GS-FLX V3-V5 39 29.9% ± 0.64% 4 0.28/AEM.0297-5 205 Chloraminated Australia DWTP outlet, DWDS Ion Torrent V3 43 2 90.24% ± 6.82% *Both studies sampled at the same treatment plant **Plant from study No.9 may have been sampled in study No.
Figure S: The number of samples retained (primary Y-axes, blue squares) with increasing subsampling depths and their corresponding Good s coverage (secondary Y-axes, red squares) is shown. A subsampling depth of 500 sequences per sample was chosen as this allowed for an average Good s coverage 0.8 while retaining maximum number of samples (42 out of 45) in the analyses. Number'of'samples' Average'Good's'coverage' 45.00 0.90 40 0.80 Number of samples 35 30 0.70 0.60 0.50 0.40 0.30 Average Good's coverage 25 0.20 0.0 20 0.00 0 250 500 750 000 250 500 750 2000 Subsampling depth!
Figure S2: Workflow illustrating data collection, data processing and data analysis steps. Data$collec)on$ Data$processing$ Data$analysis$ 6S,rRNA,gene:based, studies.,, Samples,from,full,scale, drinking,water,systems,, Sampling,points:,DWTP outlet,, DWDS,,Point,of,use., Microbial,regrowth, control,strategies:, Chlorinated, Chloraminated, Disinfectant, residual,free, Single,end, Fastq,files, sickle,v..33,,,(min,quality, score=28;,min, length=50bp),, Paired,end, Fastq,files, pear,v.0.8.46, (min,quality, score=28;,min, length=50bp),,, Fastq,to,FASTA,format,using, the,fastq_to_fasta,command, from,fastx:toolkit,v.0.0.3.2,, In,mothur,,unique.seqs., blastn,against,the,silva,9,ssuref_nr, database,(iden\ty,,97%,,e: value<0.000005.),, Extrac\on,of,full,length,sequences,from, SILVA,database., Data,colla\on, In,mothur:, Best:match,sequences,were,aligned,against,the, SILVA,seed,alignment.,, screen.seqs,command,(ver\cal,=t,and,trump,=.),, OTU,clustering,(cutoff,97%),, Taxonomic,classifica\on,(command,classify.seq), using,silva,taxonomy., Subsampled, OTU,table, (a), Full,dataset, (b), Taxonomic, classifica\on, Fasta,file, (OTUs), (3), Alpha,diversity,es\mates,(observed, OTUs,,Inverse,Simpson,index,, Shannon,index,,Pielou s,evenness), Beta,diversity,(Bray,Cur\s,distances), PERMANOVA,(func\on,Adonis),, Mean,rela\ve,abundance,of,taxonomic, groups,,otus,,poten\al,opportunis\c, pathogens,,target,groups,(nitrifiers,and, predatory,bacteria), Phylogene\c,tree,construc\on,in, RaxML, Es\ma\on,of,poten\al, contamina\on, Tax4Fun,,metabolic,profile, predic\on, Explanatory, variables, (4), (a), (4), (b), (3), (b), (a),
Figure S3: Heatmap of OTU abundance (OTUs with relative abundance >0.0% in the subsampled OTU table). Abundances were scaled with a Z-score transformation to improve visualization. The sampling location dendrogram was generated with Bray Curtis distances and UPGMA clustering method. Grouping information is indicated by the color legend (Chl: chlorinated, Chm: chloraminated, Drf: disinfectant residual-free). Sampling)loca,ons)
Figure S4: Dendrogram of sampling locations generated with Bray Curtis distances and UPGMA clustering method. Color legends indicate type of source water (SW: surface water; GW: groundwater; Mix: mixed source water including surface water, groundwater and desalinated seawater), and disinfection group (Chl: chlorinated, Chm: chloraminated, Drf: disinfectant residual-free). Reference numbers of each dataset are at the bottom of the plot.