NORMAN Databases Workshop: UBA, Berlin, 8-9 June 2017
Non-target Screening (CWG-NTS)
NORMAN Suspect List Exchange: Present status and future plans
Participants: Eawag, Uni Athens, EI, US EPA & many others
Emma Schymanski, Reza Aalizadeh, Nikiforos Alygizakis, Natalia Glowacka, Lubos Cirka, Ildiko Ipolyi, Jaroslav Slobodnik, Nikolaos Thomaidis, Juliane Hollender and others
Image: www.seanoakley.com/
What are Suspect and Non-Target Screening?
[Figure: screening workflow. Sampling, extraction (SPE), HPLC separation and HR-MS/MS are followed by three routes: TARGET ANALYSIS (knowns), SUSPECT SCREENING (suspects) and NON-TARGET SCREENING (no prior knowledge). Boxes: targets found; suspects found; masses of interest (molecular formula); database search; structure generation; candidate selection (retention time, MS/MS, calculated properties); confirmation and quantification of compounds present. Arrow: time, effort and number of compounds.]
Suspect Screening Examples
NORMAN Collaborative Non-Target Screening Trial
[Figure: number of substances reported under TARGET ANALYSIS, SUSPECT SCREENING and NON-TARGET SCREENING. Categories: Quantified (n=11), Suspects (n=14), Semi/Non-Quantified (n=12), Isomer mixes (n=5), Identified Non-targets (n=6), Unidentified Non-targets (n=13), Formula (n=7). Confidence levels shown: Level 1, Levels 2-3, Level 4, Level 5.]
Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7
NORMAN Databases Suspect List Exchange initiated in 2015
NORMAN Suspect List Exchange (2016) http://www.norman-network.com/?q=node/236 Full Lists InChIKeys References
NORMAN Suspect List Exchange (2016)
Contributions so far:
o PFAS Suspect List of fluorinated substances
o Antibiotic Suspect List (ITN MSCA ANSWER)
NORMAN Suspect List Exchange (NEW in 2017)
o 3,333 Cosmetic products
o ~2,600 PFAS
o 24,883 Substances (Expo, Hazard Scores)
The Chemical Identity Crisis Schymanski & Williams, 2017, ES&T DOI: 10.1021/acs.est.7b01908
Curation and Merging Workflow
15(+) lists => one
Aalizadeh et al. in prep.; Fourches et al. 2010, 2016
1. Fill missing information for all entries
2. Standardize, generate 2D structure
3. Remove salts, solvent etc.
4. Fix valences, add Hs
5. Optimize 3D structure
6. Store original data
7. Create StdInChIs
8. Remove duplicates
9. Create MSready SMILES
10. Validate MSready list
Entry counts: 19,492 - 549 - 4,310 = 14,633
o No structure: add to sheet "Missing info" (549 entries)
o Removed and duplicates: add to sheets "Removed" and "Duplicate_pairs" (4,310 entries each)
o Manually fix problem entries
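The bookkeeping around steps 7 and 8 (set aside entries without a structure, then collapse duplicates on a structure key) can be sketched as follows. This is a minimal illustration, not the workflow from Aalizadeh et al.: the dictionary keys `name`, `source` and `std_inchikey` and the function name `merge_suspect_lists` are hypothetical, and the key is assumed to have been generated already by a cheminformatics toolkit.

```python
from collections import defaultdict


def merge_suspect_lists(entries):
    """Split entries into merged, missing-structure and duplicate records.

    Each entry is a dict with the hypothetical keys "name", "source" and
    "std_inchikey" (None or "" when no structure could be assigned).
    """
    missing_info = []            # entries without a structure
    by_key = defaultdict(list)   # structure key -> all entries sharing it

    for entry in entries:
        key = entry.get("std_inchikey")
        if not key:
            missing_info.append(entry)
        else:
            by_key[key].append(entry)

    merged = []
    duplicate_pairs = []
    for key, group in by_key.items():
        kept, *dupes = group
        # Keep the first entry, but record every contributing list so the
        # provenance survives the merge.
        merged.append({
            "std_inchikey": key,
            "name": kept["name"],
            "sources": sorted({e["source"] for e in group}),
        })
        duplicate_pairs.extend((kept, d) for d in dupes)

    return merged, missing_info, duplicate_pairs
```

Keeping the duplicate pairs and the entries without structures in separate collections mirrors the separate sheets on the slide, so nothing is silently discarded and problem entries can still be fixed by hand.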
Overlap of SusDat with Public Databases Aalizadeh et al. in prep.
Overlap of MS-ready identifiers via CAS and/or Name Aalizadeh et al. in prep.
Mixtures (work in progress)
Mixture identification and curation
o Step 1: Identify all mixture components
o Step 2: Create SMILES, split into components, retain non-salts
o Step 3: Create MS-ready identifiers, compare retained components
o Single structure => all the same, treat as individual
o Otherwise save as Level 7 for validation
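Steps 2 and 3 can be sketched in a few lines. This is a rough illustration under stated assumptions: `SALT_FRAGMENTS` is a hypothetical, deliberately tiny discard list, components are split on the SMILES "." separator, and the default `ms_ready` function is only a stand-in that returns the component string unchanged. A real workflow would use a cheminformatics toolkit to produce MS-ready identifiers before comparing.

```python
# Hypothetical, deliberately tiny list of counterions/solvents to discard.
SALT_FRAGMENTS = {"[Na+]", "[K+]", "[Cl-]", "Cl", "O"}


def classify_mixture(smiles, ms_ready=lambda s: s):
    """Return ("individual", structure) or ("level_7", sorted components).

    ms_ready is a placeholder for a real MS-ready standardisation step;
    by default the component SMILES string itself is used as identifier.
    """
    components = smiles.split(".")   # "." separates disconnected parts
    retained = [c for c in components if c not in SALT_FRAGMENTS]
    identifiers = {ms_ready(c) for c in retained}
    if len(identifiers) == 1:
        # All retained components are the same structure.
        return "individual", identifiers.pop()
    # Several distinct structures (or none left): flag for validation.
    return "level_7", sorted(identifiers)


print(classify_mixture("CCN.CCN.Cl"))        # one structure plus a salt
print(classify_mixture("c1ccccc1.CCCCCC"))   # two distinct structures
```

Entries where every component is filtered out also end up flagged rather than dropped, which keeps them visible for manual checking.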
Validation Level Aalizadeh et al. in prep.; modified validation level concept from the CompTox Chemistry Dashboard
NORMAN Suspect List Exchange Data table (SusDat)
SCREEN SMART OR BIG OR BOTH?
All suspect lists available in one table:
o http://www.norman-network.com/datatable/
o Quick search options on every field, e.g. name, mass, ...
MERGING SEVERAL LISTS IS NOT TRIVIAL! WORK IN PROGRESS!
Note: the screenshot needs updating; this is being updated at the moment.
NORMAN-SusDat: the new data table
Part 1: Name, CAS, Validation level, various structural identifiers, source
NORMAN-SusDat: the new data table
Part 2: Calculated values
o Basic mass spectral values (monoisotopic mass, [M+H]+, [M-H]-)
o Predicted RTI in positive ESI and negative ESI
o Uncertainty comment regarding RTIs (covered, outside, proof needed)
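The basic mass spectral values can be derived from the elemental composition alone. The sketch below is illustrative and not the code behind SusDat: the function names are hypothetical, the isotope table covers only the elements needed for the example, and atrazine (C8H14ClN5) is used because it appears elsewhere in this talk.

```python
# Monoisotopic masses (Da) of the most abundant isotopes; subset only.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
    "Cl": 34.96885268,
}
PROTON = 1.00727647  # mass of a proton (Da)


def monoisotopic_mass(composition):
    """Sum isotope masses for a composition such as {"C": 8, "H": 14}."""
    return sum(MONOISOTOPIC[el] * n for el, n in composition.items())


def basic_ions(mass):
    """Return m/z of the singly charged [M+H]+ and [M-H]- ions."""
    return {"[M+H]+": mass + PROTON, "[M-H]-": mass - PROTON}


atrazine = {"C": 8, "H": 14, "Cl": 1, "N": 5}
m = monoisotopic_mass(atrazine)
for ion, mz in basic_ions(m).items():
    print(f"{ion}: {mz:.4f}")
```

Using the proton mass rather than the hydrogen atom mass accounts for the missing (or extra) electron on the ion.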
NORMAN-SusDat: the new data table
Part 3: Calculated toxicity and log Kow values
NORMAN-SusDat: the new data table
Potential additional parameters
o Further MS-based parameters
o Common adduct masses: Na+, NH4+, K+, +Cl-, +FA-, ...
o Major fragments
o Direct link to CompTox Chemistry Dashboard (see next talk)
o Direct link to MoNA via InChIKey, e.g. atrazine: http://mona.fiehnlab.ucdavis.edu/spectra/browse?inchikey=mxwjvtooroxgiu
o Additional parameters for NORMAN prioritization efforts
o PBT, vPvB, CMR, ED; maybe EC50 (or P-PNEC)
o Fugacity modelling results (relevant matrix water/soil/air)
o Can calculate with ChemProp and/or others (rcdk, MATLAB?)
o Substance use and hazard scores (e.g. list from KEMI)
o Trade-off between data amount and usability
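Two of the proposed additions are easy to illustrate: the common adduct masses and the MoNA link. The sketch below makes several assumptions. "+FA-" is read as the formate adduct [M+HCOO]-, the function names are hypothetical, and the link builder simply reproduces the URL pattern from the atrazine example (first InChIKey block, lower case) without claiming that this is a documented MoNA interface.

```python
ELECTRON = 0.00054858  # electron mass (Da)

# Mass shift (Da) added to the neutral monoisotopic mass for each adduct.
ADDUCT_SHIFTS = {
    "[M+Na]+": 22.98976928 - ELECTRON,
    "[M+NH4]+": 14.00307400 + 4 * 1.00782503 - ELECTRON,
    "[M+K]+": 38.96370649 - ELECTRON,
    "[M+Cl]-": 34.96885268 + ELECTRON,
    "[M+HCOO]-": 12.0 + 1.00782503 + 2 * 15.99491462 + ELECTRON,
}


def adduct_mz(neutral_mass):
    """Return m/z values of singly charged adducts for a neutral mass."""
    return {name: neutral_mass + shift for name, shift in ADDUCT_SHIFTS.items()}


def mona_link(inchikey):
    """Build a MoNA browse URL following the pattern shown for atrazine."""
    first_block = inchikey.split("-")[0].lower()
    return ("http://mona.fiehnlab.ucdavis.edu/spectra/browse?inchikey="
            + first_block)


for name, mz in adduct_mz(215.0938).items():   # atrazine neutral mass
    print(f"{name}: {mz:.4f}")
print(mona_link("MXWJVTOOROXGIU-UHFFFAOYSA-N"))
```

Storing shifts rather than absolute m/z values keeps the table small: the same five numbers apply to every suspect in the list.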
Market List (Stellan Fischer, KEMI)
Exposure Score & Hazard Scores for suspected chemicals
TODO!
UVCB: "Unknown or Variable composition, Complex reaction products or Biological materials"
More substances with hazard scores
NORMAN Suspect List Exchange: Future plans
o Collect more suspect and target lists
o NORMAN-SusDat online
o Add additional parameters discussed here progressively
o Ideally: what can and cannot be ionized in various measurements
o Separate KEMI_Market List: Database of Industrial Chemicals
o CompTox Dashboard is the source of structures for this list (Batch CAS)
o Database of predicted transformation products from suspects?
o NORMAN-TransSusDat?
o What is the best way to implement this?
o Collaboration with US EPA
o Registration of individual lists, esp. PFAS, high priority NORMAN lists
o Standardize / compare curation and merging workflow
o Improve handling of UVCBs
o Create additional website with information about all values and curation
o Publish the exchange and curation workflow (manuscript in prep. already)
Acknowledgements
Suspect Exchange task partners: Reza Aalizadeh, Nikos Thomaidis; Stellan Fischer, KEMI
Additional contributors: Jaroslav Slobodnik, Natalia Glowacka, Lubos Cirka, Ivan Spanik, Nikiforos Alygizakis and others at EI
Tony Williams, Andrew McEachran, Jon Sobus, US EPA
M. Stravs, E. Müller, T. Schulze, S. Neumann
C. Ruttkies, S. Wolf, S. Neumann
EU Grant 603437
Contact: emma.schymanski@eawag.ch
Contribute: suspects@normandata.eu
Questions?