Workflow concept. Data goes through the workflow. A Node contains an operation An edge represents data flow The results are brought together in tables

Similar documents
Protein Identification Using Tandem Mass Spectrometry. Nathan Edwards Informatics Research Applied Biosystems

Proteomics. November 13, 2007

HOWTO, example workflow and data files. (Version )

Overview - MS Proteomics in One Slide. MS masses of peptides. MS/MS fragments of a peptide. Results! Match to sequence database

Mass Spectrometry and Proteomics - Lecture 5 - Matthias Trost Newcastle University

Computational Methods for Mass Spectrometry Proteomics

Last updated: Copyright

TOMAHAQ Method Construction

Improved 6- Plex TMT Quantification Throughput Using a Linear Ion Trap HCD MS 3 Scan Jane M. Liu, 1,2 * Michael J. Sweredoski, 2 Sonja Hess 2 *

Key questions of proteomics. Bioinformatics 2. Proteomics. Foundation of proteomics. What proteins are there? Protein digestion

Thermo Scientific LTQ Orbitrap Velos Hybrid FT Mass Spectrometer

A TMT-labeled Spectral Library for Peptide Sequencing

MassHunter Software Overview

Relative quantification using TMT11plex on a modified Q Exactive HF mass spectrometer

PeptideProphet: Validation of Peptide Assignments to MS/MS Spectra. Andrew Keller

Proteome-wide label-free quantification with MaxQuant. Jürgen Cox Max Planck Institute of Biochemistry July 2011

NPTEL VIDEO COURSE PROTEOMICS PROF. SANJEEVA SRIVASTAVA

MaSS-Simulator: A highly configurable MS/MS simulator for generating test datasets for big data algorithms.

Supplementary Figure 1

Comprehensive support for quantitation

X!TandemPipeline (Myosine Anabolisée) validating, filtering and grouping MSMS identifications

Protein Quantitation II: Multiple Reaction Monitoring. Kelly Ruggles New York University

SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE SEQUENCING FOR HCD AND ETD SPECTRA PAIRS

Increasing the Multiplexing of Protein Quantitation from 6- to 10-Plex with Reporter Ion Isotopologues

Effective Strategies for Improving Peptide Identification with Tandem Mass Spectrometry

Protein Quantitation II: Multiple Reaction Monitoring. Kelly Ruggles New York University

MS-based proteomics to investigate proteins and their modifications

Learning Score Function Parameters for Improved Spectrum Identification in Tandem Mass Spectrometry Experiments

Spectronaut Pulsar. User Manual

Self-assembling covalent organic frameworks functionalized. magnetic graphene hydrophilic biocomposite as an ultrasensitive

High-Field Orbitrap Creating new possibilities

Tandem MS = MS / MS. ESI-MS give information on the mass of a molecule but none on the structure

A Description of the CPTAC Common Data Analysis Pipeline (CDAP)

iprophet: Multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates

Nature Methods: doi: /nmeth Supplementary Figure 1. Fragment indexing allows efficient spectra similarity comparisons.

PeptideProphet: Validation of Peptide Assignments to MS/MS Spectra

MS-MS Analysis Programs

Isotopic-Labeling and Mass Spectrometry-Based Quantitative Proteomics

Spectrum-to-Spectrum Searching Using a. Proteome-wide Spectral Library

Properties of Average Score Distributions of SEQUEST

SRM assay generation and data analysis in Skyline

Tutorial 2: Analysis of DIA data in Skyline

Qualitative Proteomics (how to obtain high-confidence high-throughput protein identification!)

Mass Spectrometry Based De Novo Peptide Sequencing Error Correction

Methods for proteome analysis of obesity (Adipose tissue)

Purdue-UAB Botanicals Center for Age- Related Disease

Tutorial 1: Setting up your Skyline document

De novo peptide sequencing methods for tandem mass. spectra

Quantitation of a target protein in crude samples using targeted peptide quantification by Mass Spectrometry

Peptide Targeted Quantification By High Resolution Mass Spectrometry A Paradigm Shift? Zhiqi Hao Thermo Fisher Scientific San Jose, CA

Figure S1. Interaction of PcTS with αsyn. (a) 1 H- 15 N HSQC NMR spectra of 100 µm αsyn in the absence (0:1, black) and increasing equivalent

LECTURE-13. Peptide Mass Fingerprinting HANDOUT. Mass spectrometry is an indispensable tool for qualitative and quantitative analysis of

On Optimizing the Non-metric Similarity Search in Tandem Mass Spectra by Clustering

DIA-Umpire: comprehensive computational framework for data independent acquisition proteomics

Proteomics: the first decade and beyond. (2003) Patterson and Aebersold Nat Genet 33 Suppl: from

Mass spectrometry has been used a lot in biology since the late 1950 s. However it really came into play in the late 1980 s once methods were

TUTORIAL EXERCISES WITH ANSWERS

Quantitative Proteomics

Tandem mass spectra were extracted from the Xcalibur data system format. (.RAW) and charge state assignment was performed using in house software

MassHunter TOF/QTOF Users Meeting

ASCQ_ME: a new engine for peptide mass fingerprint directly from mass spectrum without mass list extraction

Protein analysis using mass spectrometry

1. Prepare the MALDI sample plate by spotting an angiotensin standard and the test sample(s).

MSnID Package for Handling MS/MS Identifications

Atomic masses. Atomic masses of elements. Atomic masses of isotopes. Nominal and exact atomic masses. Example: CO, N 2 ja C 2 H 4

Quantitation of TMT-Labeled Peptides Using Higher-Energy Collisional Dissociation on the Velos Pro Ion Trap Mass Spectrometer

Relative Quantitation of TMT-Labeled Proteomes Focus on Sensitivity and Precision

Modeling Mass Spectrometry-Based Protein Analysis

SILAC and TMT. IDeA National Resource for Proteomics Workshop for Graduate Students and Post-docs Renny Lan 5/18/2017

De novo Protein Sequencing by Combining Top-Down and Bottom-Up Tandem Mass Spectra. Xiaowen Liu

All Ions MS/MS: Targeted Screening and Quantitation Using Agilent TOF and Q-TOF LC/MS Systems

PRIDE Cluster: building the consensus of proteomics data

QuasiNovo: Algorithms for De Novo Peptide Sequencing

Q Exactive TM : A True Qual-Quan HR/AM Mass Spectrometer for Routine Proteomics Applications. Yi Zhang, Ph.D. ThermoFisher Scientific

for the Novice Mass Spectrometry (^>, John Greaves and John Roboz yc**' CRC Press J Taylor & Francis Group Boca Raton London New York

The Pitfalls of Peaklist Generation Software Performance on Database Searches

Tandem Mass Spectrometry: Generating function, alignment and assembly

BST 226 Statistical Methods for Bioinformatics David M. Rocke. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 1

Database Search Strategies for Proteomic Data Sets Generated by Electron Capture Dissociation Mass Spectrometry

itraq and RNA-Seq analyses provide new insights of Dendrobium officinale seeds (Orchidaceae)

Statistical analysis of isobaric-labeled mass spectrometry data

MS Based Proteomics: Recent Case Studies Using Advanced Instrumentation

Tutorial 1: Library Generation from DDA data

Agilent All Ions MS/MS

A new algorithm for the evaluation of shotgun peptide sequencing in proteomics: support vector machine classification of peptide MS/MS spectra

Introduction to pepxmltab

A graph-based filtering method for top-down mass spectral identification

Performing Peptide Bioanalysis Using High Resolution Mass Spectrometry with Target Enhancement MRM Acquisition

An Unsupervised, Model-Free, Machine-Learning Combiner for Peptide Identifications from Tandem Mass Spectra

Integrated enzyme reactor and high resolving chromatography in sub-chip dimensions for. sensitive protein mass spectrometry

WADA Technical Document TD2015IDCR

Andromeda: A Peptide Search Engine Integrated into the MaxQuant Environment

MSc Chemistry Analytical Sciences. Advances in Data Dependent and Data Independent Acquisition for data analysis in proteomic research

Protein identification problem from a Bayesian pointofview

STATISTICAL METHODS FOR THE ANALYSIS OF MASS SPECTROMETRY- BASED PROTEOMICS DATA. A Dissertation XUAN WANG

Isobaric Labeling-Based Relative Quantification in Shotgun Proteomics

Improved Validation of Peptide MS/MS Assignments. Using Spectral Intensity Prediction

Designed for Accuracy. Innovation with Integrity. High resolution quantitative proteomics LC-MS

MASS ANALYSER. Mass analysers - separate the ions according to their mass-to-charge ratio. sample. Vacuum pumps

High-Throughput Protein Quantitation Using Multiple Reaction Monitoring

Transcription:

PROTEOME DISCOVERER

Workflow concept Data goes through the workflow Spectra Peptides Quantitation A Node contains an operation An edge represents data flow The results are brought together in tables Protein (group) table Peptide (group) table

Two aspects of the analysis Identification MS1 + MS2 Quantification MS1

Identification

Decision tree based MS

How to analyze? Decision tree is implemented in acquisition software Data analysis is slightly different for each fragmentation / detector method Rebuild the decision tree in the workflow!

Overview of the steps 1. Select the file(s) to use 2. Select only the MS2 spectra 3. Choose the different fragmentation methods 4. Create a specific workflow for each fragmentation type 1. Spectrum filtering 2. Database search 5. PSM Validation, PTM scoring

Example dataset 1 Human cancer cell-line with 3 treatments Fractionated with SCX Decision tree: HCD / ETD-FT / ETD

1. Select files to use

2. Extract the ms2 spectra

3. Select spectrum origins

Side note:

For ETD: precursor removal

For all: size reduction

Identifications: Mascot

MASCOT search engine

Proteomics Mass Spectrometry Trypsin Digest 17

Single Stage MS MS 18

Tandem Mass Spectrometry (MS/MS) Precursor selection 19

Tandem Mass Spectrometry (MS/MS) Precursor selection + collision induced dissociation (CID) MS/MS 20

How do search engines work? Input Spectrums Sequence database (proteins) Search parameters Match spectrum to possible peptides Score

Precursor m/z Charge spectrum Precursor mass Mass range Range of possible peptide masses

A peptide database We now know a range where the peptide mass has to be in Which peptides confirm to this criterium? Taken from protein database Modifications taken into account Miscleavages taken into account

Protein database Fixed modifications Variable modifications Digest, modify Allowed missed cleavages Peptide database with modifications indexed by mass

Matching the spectra Now we re able to match the spectrum For that, theoretical spectra are composed This is done with the information from the peptide database

Peptide database With modifications Indexed by mass Range of possible peptide masses Theoretical spectra Compare and score MS/MSspectrum Ranked list of peptides

How to build a theoretical spectrum Information is needed about Amino acid masses Modification masses Methods of fragmentation

Peptide Fragmentation Peptide: S-G-F-L-E-E-D-E-L-K MW ion ion MW 88 b 1 S GFLEEDELK y 9 1080 145 b 2 SG FLEEDELK y 8 1022 292 b 3 SGF LEEDELK y 7 875 405 b 4 SGFL EEDELK y 6 762 534 b 5 SGFLE EDELK y 5 633 663 b 6 SGFLEE DELK y 4 504 778 b 7 SGFLEED ELK y 3 389 907 b 8 SGFLEEDE LK y 2 260 1020 b 9 SGFLEEDEL K y 1 147 28

Identifications: Mascot

PSM Thresholds

PSM thresholds Several options: Target/decoy databases Percolator

False Discovery Rate This is a number indicating how many of your hits are false If the FDR is 0.01 (1%), we expect 1 out of 100 peptide hits to be a false identification The number is estimated by the number of hits in the decoy database search result: FDR = decoy hits / target hits

Decoy hits Too much noise Mixed spectrum Overall low signal MS2 peaks do match with a peptide

Two ways of FDR calculation Concatenated database One search A spectrum matches either one or the other ( Competitive ) Separate decoy database (in Mascot) Two searches A spectrum may match both decoy or target seq ( non-competitive ) FDR = decoy hits / total hits forward reverse d forward reverse d

Limitation: one dimension 0 10 20 30 40 Mascot score Mass delta (ppm) 40 30 20 10 0 0 10 20 30 40 Mascot score

Introducing Percolator Percolator (Lukas Käll, McCoss Lab 2007) Machine learning Multiple features

What is a support vector machine Classification algorithm Best hyperplane between two groups

What is a support vector machine Classification algorithm Best hyperplane between two groups

Soft edges for SVMs

Using several features Score Mass delta (ppm) Delta score Number of matched ions

Higher dimensions

PTM scoring PhosphoRS node Parallel to percolator node PhosphoRS version 2 names change PhosphoRS score Mao will explain in the afternoon

Quantification

Labeled Quantification Peptide modifications Extracted Ion Chromatrogram (XIC) Node to extract the intensities

How to interpret the XICs The XICs need to be grouped Isotopes Known modification (e.g. dimethylation, SILAC) The node needs details Quantification method

Manage quant methods

PD Output

Understanding PD output PD run log Node result columns Protein output Peptide output NB: the modified peptide column comes from the Mascot node, whilst the phosphors site localizations are from that node Modified peptide does not represent best localizations

Peptides tab List Peptides Grouping (right click) Grouped Ungrouped (PTMs) Grouping parameters are in the Result Filters tab Based on sequence Based on sequence and mass (i.e. modifications) PD will filter the PTMs based on the settings in the Result Filters tab

Peptide output < 0.5 >2

Protein tab List proteins Grouping (right click) Grouped Ungrouped Grouping based on shared peptides

What s Grouped? Grouping allows for the calculation of standard deviations Show the highest scores

Filters

Example dataset 2 Yeast time course experiment Labeled with TMT 6 -plex Fractionated with SCX

build this!

Alternative TMT analysis Isobar package Nice Statistical analysis Nice output Freely available Disadvantages Written in R Somewhat hard to use Ask Bas