Welcome! Mass Spectrometry meets ChemInformatics WCMC Metabolomics Course 2013 Tobias Kind. Course 3: Mass spectral and molecular database search


Welcome! Mass Spectrometry meets ChemInformatics (Biology, Informatics, Chemistry)
WCMC Metabolomics Course 2013, Tobias Kind
Course 3: Mass spectral and molecular database search
http://fiehnlab.ucdavis.edu/staff/kind
CC-BY License

Molecules and mass spectra
- Molecular structure and mass spectra are closely related.
- Molecular structure is reflected in mass spectral features (peaks, peak heights and peak combinations).
- Mass spectra reflect gas-phase ion physics and chemistry (rearrangements, fragmentations, bond cleavages).
[Figure: electron ionization (70 eV) mass spectra; source: NIST05]

Molecules and mass spectra
Similar structures may or may not have similar mass spectra.
[Figure: electron ionization (70 eV) mass spectra of "Silanamine, N,1,1,1-tetramethyl-N-[1-methyl-2-phenyl-2-[(trimethylsilyl)oxy]" and "N-Methylphenylethanolamine, bis(trimethylsilyl)-"; axes: m/z vs. relative abundance; source: NIST05; created using structure similarity search in the NIST MS Search program]

Molecules and mass spectra
Similar mass spectra may or may not belong to similar structures.
[Figure: electron ionization (70 eV) mass spectra of 1-Tetradecene and Cyclotetradecane; axes: m/z vs. relative abundance; source: NIST05; created using spectral similarity search in the NIST MS Search program]

Large mass spectral databases

Name                    Spectra count   Type
NIST11 ($$)             244,000         electron ionization (EI, 70 eV)
Wiley 10 ($$)           719,000         electron ionization (EI, 70 eV)
MassBank (free)         12,000          electron ionization (EI, 70 eV)
NIST12 MS/MS ($$)       95,400          MS/MS (ESI, +/-, 30-100 V CID)
METLIN (open, $$)       57,972          MS2 (30-100 V CID, QTOF)
MassBank (free)         18,000          MS2 (ESI, APCI)
MassFrontier ($$)       7,000           MSn, ESI (Spectral Tree Library)
RIKEN ReSpect (free)    9,000           MS2 (ESI)
LipidBlast (CC-BY)      212,516         MS2 (in-silico, computer generated), 119,200 compounds

- Data quality is important.
- Annotation with InChI, InChIKey, structure and formula, plus metadata.
See: http://www.sisweb.com/software/ms/wiley.htm

Mass spectral databases II
Smaller specialized libraries:
- Pfleger Maurer Weber (drugs), MS+RI, 70 eV
- MassFinder (volatiles), MS+RI, 70 eV
- RIZA DB (toxicants), MS+RI, 70 eV
- Golm DB (primary metabolites), MS+RI, 70 eV
- Fiehnlib (primary metabolites), MS+RI, 70 eV
- AAFS (drugs, forensic, toxicology), MS+RI, 70 eV
- ChemicalSoft (drugs), MS/MS, MS^E
[Figure: example library entry from riza_web: EI mass spectrum and structure of Mirex; RI 2583, KEY 1596, CAS 2385-85-5, FRML empty, CMPD Mirex]
- For electron ionization (EI), the same GC column (DB-5, RTX-5, DB-1, OV-1) and temperature program must be used for matching retention indices.
- For ESI and APPI spectra (LC-MS), the same mass spectrometer design and setup (triple quadrupole, ion trap, TOF, Q-TOF) and collision energy should be used.
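The MS+RI libraries above pair each spectrum with a retention index. As a minimal sketch of how a retention time is turned into a linear (temperature-programmed) retention index against an n-alkane ladder, assuming the van den Dool and Kratz interpolation and a hypothetical ladder of alkane retention times:

```python
from bisect import bisect_right

def linear_retention_index(rt, alkane_rts):
    """Linear retention index by interpolation between bracketing n-alkanes.

    rt         -- retention time of the analyte
    alkane_rts -- dict {carbon number: retention time}, measured on the
                  same column with the same temperature program
    """
    ladder = sorted((t, n) for n, t in alkane_rts.items())
    times = [t for t, _ in ladder]
    if len(ladder) < 2 or not times[0] <= rt <= times[-1]:
        raise ValueError("retention time is outside the alkane ladder")
    # Index of the alkane eluting at or just before the analyte.
    i = min(bisect_right(times, rt) - 1, len(ladder) - 2)
    t_lo, n_lo = ladder[i]
    t_hi, n_hi = ladder[i + 1]
    return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))

# Hypothetical ladder: C10, C11 and C12 eluting at 5.0, 6.0 and 7.5 min.
example_ladder = {10: 5.0, 11: 6.0, 12: 7.5}
example_ri = linear_retention_index(6.75, example_ladder)
```

The index is only comparable to a library value when column type and temperature program match, which is the point made on the slide.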

Mass spectral search
- Library search is always the first step in the identification process.
- Library search alone is usually not enough to assign unique isomer structures.
- Mass spectra must be clean and background-free before the search. For LC-MS and GC-MS this requires peak picking and deconvolution.
- Additional orthogonal information has to be used:
  - restriction of the compound space to certain species or materials
  - isotope pattern information
  - retention index, if derived from GC-MS data
  - retention time correlations with logP or logD in the case of LC-MS
  - additional fragmentation at different voltages (MS^E)
- Only certain mass spectra can be predicted in silico (peptides, lipids, carbohydrates); this is not the rule for other molecules.

Mass spectral search algorithms
- PBM, Probability Based Matching (McLafferty & Stauffer), since 1976
- Dot Product (Finnigan/INCOS), since 1978
- Weighted Dot Product (Stein), since 1993
- Mass Spectral Tree Search (Mistrik), since the 21st century
Weighted Dot Product (source: Stein S.E., see notes):
- A_u and A_r: abundances of peaks in the user and reference mass spectra
- m: m/z values
- w: weighting term
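The weighted dot product can be illustrated with a short sketch. It assumes the commonly used form in which each peak gets the weight w = A^a * m^b and the score is the squared cosine between the weighted user and reference spectra. The exponents `a` and `b` below are placeholder defaults, not values taken from the slide; published variants use different exponents.

```python
def weighted_dot_product(user, ref, a=0.5, b=1.0):
    """Squared cosine similarity of two mass-weighted spectra.

    user, ref -- dicts {nominal m/z: abundance}
    a, b      -- assumed exponents for abundance and m/z weighting
    Returns a score between 0.0 (no shared peaks) and 1.0 (identical).
    """
    def weights(spectrum):
        return {mz: (ab ** a) * (mz ** b)
                for mz, ab in spectrum.items() if ab > 0}

    w_u, w_r = weights(user), weights(ref)
    shared = w_u.keys() & w_r.keys()
    numerator = sum(w_u[mz] * w_r[mz] for mz in shared) ** 2
    denominator = (sum(w * w for w in w_u.values())
                   * sum(w * w for w in w_r.values()))
    return numerator / denominator if denominator else 0.0

# Hypothetical unit-resolution spectra.
user_spectrum = {41: 50.0, 43: 100.0, 57: 80.0}
reference_spectrum = {41: 45.0, 43: 100.0, 57: 70.0, 71: 10.0}
score = weighted_dot_product(user_spectrum, reference_spectrum)
```

Raising m/z to a positive power gives high-mass peaks, which tend to be more structure-specific, more influence than the abundant low-mass fragments.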

NIST MS Search GUI
Search everything:
A) Library search: reverse, normal, similarity, neutral loss
B) Structure similarity search: find molecules similar to a given structure
C) Formula search: find C11H13N3O3S
D) Constrained peak search: find peaks with m/z 122 and 188 and 266
E) Name search: find Stuntman (maleic hydrazide)
Search connections:
- Import/export molecular structures (msp, hpj, sdf)
- Interpret structures (MSInterpreter.exe)
- Find substructures (expert algorithm)
- Import spectra from other programs (AMDIS, Chemstation, ChromaTOF)
The program is freely available for download (the NIST12 MS Library is licensed, ~$1200).
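To make the idea of a constrained peak search (item D) concrete, here is a minimal sketch over a toy in-memory library. The function names, the tolerance and the library entries are hypothetical; this is not how NIST MS Search is implemented.

```python
def has_peaks(spectrum, required_mz, min_abundance=0.0, tol=0.5):
    """True if every required m/z has a peak above min_abundance within tol."""
    return all(
        any(abs(mz - target) <= tol and ab > min_abundance
            for mz, ab in spectrum)
        for target in required_mz
    )

def constrained_peak_search(library, required_mz, **constraints):
    """Return names of all library entries that contain all required peaks."""
    return [name for name, spectrum in library.items()
            if has_peaks(spectrum, required_mz, **constraints)]

# Toy library: name -> list of (m/z, relative abundance) pairs.
toy_library = {
    "compound A": [(122, 40), (188, 100), (266, 15)],
    "compound B": [(122, 100), (150, 30)],
}
hits = constrained_peak_search(toy_library, [122, 188, 266])
```

Adding `min_abundance` to the call narrows the hit list to entries in which the required peaks are also reasonably intense.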

NIST MS mass spectral search
- The NIST MS Search program is the gold standard for EI spectral search.
- It is used for all types of unit-resolution spectra: MS/MS, APCI and ESI-MS spectra.

NIST MS Search GUI and NIST12 DB
- 120,000 MS/MS spectra; 15,000 precursor ions (adducts); 7000 compounds
- MS2, MS3, MS4 data; up to 15 different ionization energies
- 12k ion trap, 9k QTOF/QQQ; 90% positive ionization, 10% negative ionization

MS/MS search for small molecules: general concept
- Low-resolution instruments: ion trap, triple quadrupole
- High-resolution instruments: Q-TOF, TOF-TOF, Orbitrap, FT-ICR-MS
Workflow: MS/MS spectrum -> 1) precursor match -> 2) product ion search against an MS/MS DB (LipidBlast, 200k tandem mass spectra) -> results with annotation

Example hit list:
#    Name        Score
1.   CL 72:8     813
2.   CL 68:2     863
3.   PC 32:0     88
4.   PI 32:0     725
5.   PS 36:2     882
6.   PC 24:0     514
7.   MGDG 38:4   298
8.   PG 44:12    785
9.   SQDG 30:0   426
10.  GM1         465

1) Precursor match: searches windows from ±0.4 Da (ion trap) down to ±0.005 Da (QTOF). This is a powerful pre-filter that removes up to 99% of the wrong candidates.
2) Product ion match: matches ions using classical spectral similarity scoring.
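The two-stage search can be sketched as follows. The precursor windows (0.4 Da and 0.005 Da) come from the slide; everything else is an assumption made for illustration: the library layout, the entry names and m/z values, and the scoring function, which is a simple matched-fraction stand-in rather than a real similarity score.

```python
def precursor_filter(query_mz, library, tol_da):
    """Stage 1: keep only entries whose precursor m/z lies inside the window."""
    return [entry for entry in library
            if abs(entry["precursor_mz"] - query_mz) <= tol_da]

def product_ion_score(query_peaks, ref_peaks, tol_da):
    """Stage 2 stand-in: fraction of reference fragment m/z found in the query."""
    if not ref_peaks:
        return 0.0
    matched = sum(
        1 for ref in ref_peaks
        if any(abs(ref - q) <= tol_da for q in query_peaks)
    )
    return matched / len(ref_peaks)

def msms_search(query_mz, query_peaks, library, precursor_tol, fragment_tol):
    """Run the precursor pre-filter, then score and rank the survivors."""
    hits = [
        (product_ion_score(query_peaks, entry["peaks"], fragment_tol),
         entry["name"])
        for entry in precursor_filter(query_mz, library, precursor_tol)
    ]
    return sorted(hits, reverse=True)

# Hypothetical library entries; names and m/z values are illustrative only.
toy_library = [
    {"name": "entry A", "precursor_mz": 760.585, "peaks": [184.073, 478.330]},
    {"name": "entry B", "precursor_mz": 760.800, "peaks": [184.073, 500.000]},
    {"name": "entry C", "precursor_mz": 500.000, "peaks": [184.073]},
]
query_peaks = [184.074, 478.329]
ion_trap_hits = msms_search(760.585, query_peaks, toy_library, 0.4, 0.5)
qtof_hits = msms_search(760.585, query_peaks, toy_library, 0.005, 0.01)
```

With the wide ion trap window two entries survive the pre-filter, whereas the narrow QTOF window leaves a single candidate before any fragment is compared.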

Searching 10,000 MS/MS spectra as a batch (NIST MS PepSearch GUI)
- Input: MGF file from QTOF or ion trap
- Output folder
- MS/MS library: DO NOT load into memory (bug, 2012)
- Search speed: 1500 spectra/second
- Output: Excel (tab-separated file)
- Setup time: 30 seconds
- Run time per MGF: <5 seconds
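For readers who have not seen an MGF batch file, the following minimal reader sketches its layout: each spectrum sits between `BEGIN IONS` and `END IONS`, with `KEY=value` header lines followed by m/z and intensity pairs. The reader handles only this basic layout, and the function names and example values are assumptions for illustration.

```python
def read_mgf(lines):
    """Yield one dict per BEGIN IONS / END IONS block of an MGF file."""
    spectrum = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            spectrum = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is not None:
            if "=" in line and not line[0].isdigit():
                key, value = line.split("=", 1)
                spectrum["params"][key.upper()] = value
            else:
                fields = line.split()
                if len(fields) >= 2:
                    spectrum["peaks"].append(
                        (float(fields[0]), float(fields[1])))

def precursor_mz(spectrum):
    """First number of the PEPMASS line is the precursor m/z."""
    return float(spectrum["params"]["PEPMASS"].split()[0])

# Illustrative single-spectrum MGF text (values are made up).
example_mgf = """BEGIN IONS
TITLE=example scan
PEPMASS=609.28 1000
CHARGE=1+
195.0 40
448.2 100
END IONS
"""
spectra = list(read_mgf(example_mgf.splitlines()))
```

For a real file the same generator can be fed an open file handle, so a 10,000-spectrum batch is processed one spectrum at a time without loading everything into memory.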

Excel output for 10k MS/MS spectra with metabolite names (NIST MS PepSearch GUI)
MS/MS annotations of similar molecules require retention time confirmation.

Mass Spectral Trees in Mass Frontier
MassFrontier searches MSn and CID mass spectra.
Source: MassFrontier Helpfile

Mass Frontier MS search: MS tree hit list

Linear ion traps, Orbitraps and FT-ICR instruments can easily create MSn spectra.
[Figure: reserpine ion tree and ion map. MS3 spectrum: ITMS + c ESI d Full ms3 609.95@cid35.00 448.20@cid35.00 [110.00-460.00]. MS4 spectrum: ITMS + c ESI d Full ms4 609.95@cid35.00 448.20@cid35.00 236.06@cid35.00 [50.00-250.00]. Spectrum axes: m/z vs. relative abundance. Ion map axes: time (min) vs. m/z, mass range 100.00-650.00.]
- Ion map: one MS/MS spectrum for every m/z value in the mass range 100-650 Da
- Ion tree: performs data-dependent MS2, MS3 and MS4 scans over the whole mass range
Comprehensive ion mapping and ion tree experiments using diverse compound sets will solve many fragmentation mysteries.
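An ion tree is naturally represented as a tree of spectra keyed by the precursor that was isolated at each stage. The sketch below is a hypothetical data structure, not Mass Frontier's format. It uses the reserpine precursor chain from the scan filters above (609.95 -> 448.20 -> 236.06) and leaves the peak lists empty.

```python
from dataclasses import dataclass, field

@dataclass
class IonTreeNode:
    """One MSn spectrum, identified by the precursor that was fragmented."""
    precursor_mz: float
    ms_level: int
    peaks: list = field(default_factory=list)      # (m/z, abundance) pairs
    children: list = field(default_factory=list)   # spectra one level deeper

    def add_child(self, precursor_mz, peaks=()):
        """Attach the spectrum obtained by fragmenting one product ion."""
        child = IonTreeNode(precursor_mz, self.ms_level + 1, list(peaks))
        self.children.append(child)
        return child

    def paths(self, prefix=()):
        """Yield every precursor chain from the root down to a leaf."""
        here = prefix + (self.precursor_mz,)
        if not self.children:
            yield here
        for child in self.children:
            yield from child.paths(here)

# Precursor chain taken from the reserpine scan filters on the slide.
root = IonTreeNode(609.95, ms_level=2)
ms3 = root.add_child(448.20)
ms4 = ms3.add_child(236.06)
chains = list(root.paths())
```

Storing the full chain, rather than only the last precursor, is what lets a tree search distinguish identical fragment m/z values that arise through different fragmentation routes.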

Conversion of mass spectral libraries
- Usually a hassle.
- Always keep a copy of your libraries in a non-proprietary format.
- Request export functions or converters from your mass spectrometer vendor.
XCalibur LibraryManager.exe (Thermo Electron Fisher):
- Finnigan MAT ICIS/GCQ/ITS 40 (*.lib, *.lbr), AutoMass (*.spr, *.prs, *.nam, *.hdr, *.fsf, *.cfs), MassLab (*.idb) to NIST and vice versa
NIST LIB2NIST.exe:
- Spectral files *.msd, *.hpj, *.sdf
- HP LIB (*.LIB), NIST LIB, JCAMP-DX (*.jdx, *.hpj)
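One practical way to keep a non-proprietary copy is a plain-text export in the MSP style used for import into NIST MS Search. The writer below is a minimal sketch that assumes the basic layout of a `Name:` line, optional `Key: value` lines, a `Num Peaks:` line and one "m/z abundance" pair per line. Check the converter documentation for the exact fields your tools expect. The peak values in the example are placeholders, not library data.

```python
def to_msp(name, peaks, **fields):
    """Format one spectrum as an MSP-style text record.

    name   -- compound name (first line of the record)
    peaks  -- list of (m/z, abundance) pairs
    fields -- optional extra header lines, e.g. Formula="C10Cl12"
    """
    lines = [f"Name: {name}"]
    lines += [f"{key}: {value}" for key, value in fields.items()]
    lines.append(f"Num Peaks: {len(peaks)}")
    lines += [f"{mz:g} {abundance:g}" for mz, abundance in peaks]
    return "\n".join(lines) + "\n"

# Placeholder peaks for illustration only.
record = to_msp("Mirex", [(237, 50), (272, 100)], Formula="C10Cl12")
```

Because the record is plain text, it can be versioned, diffed and re-imported even if the original vendor software is no longer available.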

How to search molecules
- Exact search
- Substructure search
- Similarity search
- Ligand search
- R-group/Markush search
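The difference between exact, substructure and similarity search can be sketched with fingerprints, here modeled as plain sets of structural feature labels. The feature labels are hypothetical, and the subset test is only the kind of pre-screen that fingerprint-based systems use. A real substructure search needs graph matching from a cheminformatics toolkit.

```python
def exact_match(query, target):
    """Exact search on fingerprints: both feature sets are identical."""
    return set(query) == set(target)

def substructure_screen(query, target):
    """Pre-screen only: every query feature must occur in the target."""
    return set(query) <= set(target)

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity: shared features / all features."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Hypothetical feature sets for a query and a database molecule.
query_features = {"benzene ring", "hydroxyl", "amine"}
target_features = {"benzene ring", "hydroxyl", "amine", "methyl ester"}
similarity = tanimoto(query_features, target_features)
```

A similarity search ranks all database molecules by this score and reports those above a chosen threshold, so it returns related compounds that an exact or substructure query would miss.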

ChemAxon Instant-JChem desktop database
Search structures, formulae, properties, exact masses, adducts

ChemAxon JChem for Excel
- Structures can be edited
- Property values are computed from the structure
- Instant calculation and visualization using charts

NIST MS DB has structure similarity search
Good for comparing mass spectra of similar compounds (which may have similar mass spectra)

PubChem open molecule database
- 47,750,434 compounds
- 119,809,272 substances (salt forms, acids, modifications)
- 717,429 bioassays
- All structures and properties can be searched and downloaded, and are linked
Picture source: PubChem

Searching molecules on PubChem
- 18 million compound DB (++)
- Go to PubChem Structure Search

Searching adduct ions on HMDB

Searching everything on ChemSpider
Highly curated; literature, patents, properties, links to other databases

CAS SciFinder
- 73 million molecules
- 70 million commercially available products
- Largest reaction DB (53 million reactions) and literature DB
- Substructure and similarity search of structures
- A must for chemists and biochemists/biologists
- No bulk download, no good import/export

Structure search in SciFinder
Retrieved 4000 papers (search refined to MS and MALDI only)

Today: how scientists publish mass spectra (*)
- Scientist A runs MS -> publication on paper (PDF) as bitmap graphic -> OCR -> DB curation -> DB creation -> DB is sold -> Scientist B, who needs the DB
Better:
- Scientist A -> DB -> Scientist B
- Central and open repository such as MassBank
- Electronic publishing in XML
- Computerized free or paid curation
OCR: optical character recognition; DB: database
(*) and structures and other spectral data

Open data repository for mass spectra and metabolomics data
- No loss of information (high-resolution spectra)
- No truncated data (reporting five peaks only)
- No "hamburger to cow" algorithm needed (OCR)
- Fast and instant use with no restrictions
- New synergism for data interpretation
- Commercial use may be possible
NIH funded the METABOLOMICS DATA CENTER to collect and share metabolomics data ($6M).
Central and open repositories: check out MassBank and MetaboLights.

The Last Page - What is important to remember
- There are different search types for mass spectral data: similarity search, reverse search, neutral loss search, MS/MS search.
- There are large libraries for electron ionization (EI) spectra from GC-MS.
- There are no large open/commercial libraries for spectra from LC-MS.
- For the creation of mass spectral libraries a holistic approach is important.
- Mass spectral trees can give further information (MS^E or MSn).
- There are different ways of searching structures: exact search, similarity search, substructure search.
- Before you start a research project, create target lists of possible candidates.
- Collect mass spectra or structures in libraries with references.