Developing Algorithms for the Determination of Relative Abundances of Peptides from LC/MS Data


RIPS Team: Jake Marcus (Project Manager), Anne Eaton, Melanie Kanter, Aru Ray
Faculty Mentors: Shawn Cokus, Matteo Pellegrini
Industry Sponsors: Parag Mallick, Roland Luethy

Key terms
Protein: a large biomolecule that carries out various functions of a cell.
Peptide: a fragment of a protein.
[Diagram: digestion breaks a protein into peptides]

What is proteomics?
Proteome: all the proteins expressed in an individual at a given time.
Proteomics: the study of proteins, their structure and function.
[Diagram: DNA (replication) -> transcription -> RNA -> translation -> protein -> metabolic and bodily functions]
20 to 25 thousand genes, millions of proteins.

Why study proteomics?
Diagnosis of disease
Personalized medicine
Analysis

Our sponsor
Cedars-Sinai Health System, Spielberg Family Center for Applied Proteomics
Dedicated to developing proteomic technologies that guide doctors in patient management decisions.
Focuses on identifying and quantifying proteins using liquid chromatography/mass spectrometry (LC/MS).

Liquid chromatography
A method to separate substances based on their affinity to water.
Retention time (RT): the amount of time a substance takes to pass through the chromatography column.
[Figure: substances leaving the column at RT=1, RT=2, RT=3]

Mass spectrometry
A method to separate the components of a mixture according to molecular mass.
Molecules are ionized, separated according to mass/charge, and detected.
[Diagram: sample -> ionization and acceleration -> electromagnet -> detection; plot of intensity vs. mass/charge]

LC/MS: combining liquid chromatography and mass spectrometry
The sample is first separated by retention time; each retention time slice then yields its own mass spectrum.
[Figure: intensity vs. mass/charge spectra at retention time 1 and retention time 2]

The data
[Figure: LC/MS data plotted by retention time and mass/charge]
List of identifications:
Retention Time   Mass/Charge   Peptide        Protein   Confidence        ...
246              725.4         K.ACSQRPR.W    ADH       86% (0.86)
793              432.87        R.IGYADIK.W    EPO       12% (0.12)
1075             5367.91       K.LGANAILK.W   HB        99.45% (0.9945)

The problem
Determine the relative abundance of peptides in the original samples based on LC/MS data.
[Figure: intensity over retention time and mass/charge]

Challenges
Locate isotopes
Identifications not centered
Unknown spread along retention time
Noise
[Figure: intensity surface with isotopes and the point of identification marked]

Peptide quantification modules
Extract 2D Neighborhood -> Squish Isotopes -> Limit Retention Time Axis -> Fit Curve -> Quantify

Extract 2D neighborhood
Pick a 2D neighborhood around the identified location.
It must be large enough to include the entire feature.
[Figure: intensity (x10^5) vs. mass/charge, with the point of identification marked]
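The neighborhood step can be sketched as a simple window filter. This is a minimal illustration, not the team's implementation: the representation of a data point as a (retention time, mass/charge, intensity) tuple, the function name, and the half-width parameters are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumed layout of one LC/MS data point: (retention_time, mass_over_charge, intensity)
Point = Tuple[float, float, float]


def extract_neighborhood(
    points: List[Point],
    rt_id: float,
    mz_id: float,
    rt_half_width: float,
    mz_half_width: float,
) -> List[Point]:
    """Keep the points inside a rectangle centered on the identification.

    The half-widths are hypothetical tuning parameters; they have to be
    generous enough that the whole feature falls inside the rectangle.
    """
    return [
        (rt, mz, intensity)
        for rt, mz, intensity in points
        if abs(rt - rt_id) <= rt_half_width and abs(mz - mz_id) <= mz_half_width
    ]
```

Because identifications are not centered on the feature, a generous window is the safer choice; later modules trim it along both axes.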

Squish isotopes
Isotopes have similar retention times.
Select the relevant mass/charge values and extract the corresponding data.
[Figure: intensity (x10^5) vs. mass/charge]

Squish isotopes: quantize
[Figure: retention time vs. mass/charge, separating signal at the actual mass/charge values from noise]

Squish isotopes: combine
[Figure: intensity vs. mass/charge after the isotope signals are combined]
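One way to read the quantize and combine steps is sketched below. The slides do not give the details, so several things here are assumptions: the identified mass/charge is taken to be the monoisotopic peak, isotope peaks are taken to sit about 1.00335/charge apart (the mass difference between carbon-13 and carbon-12), points within a tolerance of those positions count as signal, and combining means summing intensity per retention time.

```python
from collections import defaultdict
from typing import List, Tuple

ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference, in daltons


def isotope_mzs(mono_mz: float, charge: int, n_isotopes: int) -> List[float]:
    """Expected mass/charge positions of the first n isotope peaks (assumed model)."""
    return [mono_mz + i * ISOTOPE_SPACING / charge for i in range(n_isotopes)]


def squish_isotopes(
    points: List[Tuple[float, float, float]],  # (retention_time, mz, intensity)
    mono_mz: float,
    charge: int,
    n_isotopes: int = 3,
    tol: float = 0.05,
) -> List[Tuple[float, float]]:
    """Collapse the 2D neighborhood into one intensity trace over retention time.

    Quantize: keep only points within `tol` of an expected isotope position.
    Combine: sum the kept intensities at each retention time.
    """
    targets = isotope_mzs(mono_mz, charge, n_isotopes)
    trace = defaultdict(float)
    for rt, mz, intensity in points:
        if any(abs(mz - target) <= tol for target in targets):
            trace[rt] += intensity
    return sorted(trace.items())
```

The result is a one-dimensional trace, which is what the retention-time and curve-fitting modules operate on.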

Limit retention time
Find the highest peak.
Search along retention time until 4 out of 5 consecutive data points are below the threshold.
[Figure: intensity (x10^5) vs. retention time (seconds), with the threshold marked]
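The stopping rule can be sketched as follows. The "4 out of 5 consecutive points below threshold" test comes from the slide; where exactly the cut is placed once that test fires is not stated, so this sketch assumes the feature ends at the last at-or-above-threshold point reached before the quiet window. The function and parameter names are illustrative.

```python
from typing import Sequence, Tuple


def limit_retention_time(
    intensities: Sequence[float],
    threshold: float,
    window: int = 5,
    min_below: int = 4,
) -> Tuple[int, int]:
    """Return (start, end) indices of the feature around the highest peak."""
    n = len(intensities)
    apex = max(range(n), key=lambda i: intensities[i])

    def walk(step: int) -> int:
        last_above = apex
        i = apex + step
        while 0 <= i < n:
            if intensities[i] >= threshold:
                last_above = i
            # Window of `window` consecutive points starting at i, moving away from the apex.
            idx = range(i, i + step * window, step)
            if all(0 <= j < n for j in idx):
                below = sum(intensities[j] < threshold for j in idx)
                if below >= min_below:
                    return last_above  # assumed placement of the cut
            i += step
        return i - step  # no quiet window found: keep everything up to the edge

    return walk(-1), walk(+1)
```

Allowing one of the five points to exceed the threshold keeps a single noisy spike from extending the feature indefinitely.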

Fit curve
Gamma curve fit to the data by nonlinear regression.
Right-skewed.
Limited by liquid chromatography flow-rate.
[Figure: intensity (scaled) vs. retention time (scaled) with the fitted gamma curve, R^2 = 0.9814]
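The slide does not give the parameterization of the gamma curve. A common choice, assumed here, is f(t) = A * t^(k-1) * exp(-t/theta), which is right-skewed for k > 1. The sketch below does not perform the nonlinear regression the slide describes; it swaps in a simpler technique, ordinary least squares on the logarithm of the curve, because ln f(t) = ln A + (k-1) ln t - t/theta is linear in the unknowns. Such a fit could serve as a starting point for a nonlinear fit. All function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def gamma_curve(t: float, amplitude: float, shape: float, scale: float) -> float:
    """Assumed peak model: A * t**(k - 1) * exp(-t / theta)."""
    return amplitude * t ** (shape - 1) * math.exp(-t / scale)


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_gamma_loglinear(
    times: Sequence[float], intensities: Sequence[float]
) -> Tuple[float, float, float]:
    """Estimate (amplitude, shape, scale) by least squares on log intensity.

    Points with non-positive time or intensity are skipped because the
    logarithm is undefined there.
    """
    pairs = [(t, y) for t, y in zip(times, intensities) if t > 0 and y > 0]
    rows = [(1.0, math.log(t), t) for t, _ in pairs]
    logs = [math.log(y) for _, y in pairs]
    # Normal equations (X^T X) c = X^T y for the design columns [1, ln t, t].
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, logs)) for i in range(3)]
    c0, c1, c2 = solve_linear(xtx, xty)
    return math.exp(c0), c1 + 1.0, -1.0 / c2
```

Fitting in log space weights low-intensity points more heavily than a direct fit would, which is one reason to treat the result as an initial estimate rather than a final answer.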

Quantify
The area under the curve corresponds to peptide abundance.
[Figure: intensity (scaled) vs. retention time (scaled), area under the fitted curve shaded]
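If the fitted curve has the form A * t^(k-1) * exp(-t/theta) (an assumed parameterization; the slides do not state one), its area over t >= 0 has the closed form A * Gamma(k) * theta^k. The sketch below computes that area and, for comparison, a trapezoid-rule area taken directly from sampled points. The function names are illustrative.

```python
import math
from typing import Sequence


def gamma_area(amplitude: float, shape: float, scale: float) -> float:
    """Closed-form area under A * t**(k - 1) * exp(-t / theta) for t >= 0."""
    return amplitude * math.gamma(shape) * scale ** shape


def trapezoid_area(times: Sequence[float], intensities: Sequence[float]) -> float:
    """Numerical area under sampled points using the trapezoid rule."""
    return sum(
        0.5 * (intensities[i] + intensities[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )


def relative_abundance(area_a: float, area_b: float) -> float:
    """Ratio of two areas, e.g. the same peptide measured in two samples."""
    return area_a / area_b
```

Using the fitted curve rather than the raw points means the area is less sensitive to noise in individual scans, at the cost of depending on how well the model matches the peak shape.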

Evaluation
6 protein mix (~150 peptides): same amount in every sample (figure: 1x, 1x, 1x).
5 protein mix (~250 peptides): different amounts in each sample (figure: 1x, 2x, 3x).
SILAC: controlled proportion of two isotopes of each protein in a sample (figure: 1:2).

Data filtering
Remove peptides that are:
Not derived from the sample protein mix
Identified with confidence < 0.99
Matched with a difference in retention time > 100 seconds
[Figure: two intensity vs. retention time panels, spanning roughly 1750 to 1840 and 2440 to 2540]
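The three filters can be expressed as one predicate. The thresholds (0.99 and 100 seconds) come from the slide; the record layout, the field names, and the reading of "difference in retention time" as the gap between the two runs being compared are assumptions made for this sketch.

```python
from typing import Dict, Iterable, List, Set


def keep_peptide(
    record: Dict[str, object],
    mix_proteins: Set[str],
    min_confidence: float = 0.99,
    max_rt_difference: float = 100.0,
) -> bool:
    """Return True if a matched peptide passes all three filters.

    Hypothetical fields: 'protein', 'confidence', and the retention times
    'rt_a' and 'rt_b' of the peptide in the two runs being compared.
    """
    return (
        record["protein"] in mix_proteins
        and record["confidence"] >= min_confidence
        and abs(record["rt_a"] - record["rt_b"]) <= max_rt_difference
    )


def filter_peptides(
    records: Iterable[Dict[str, object]], mix_proteins: Set[str]
) -> List[Dict[str, object]]:
    """Apply keep_peptide to every record."""
    return [r for r in records if keep_peptide(r, mix_proteins)]
```

With the example identifications from the data slide, only the HB peptide (confidence 0.9945) would survive the confidence filter.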

Optimizing the algorithm
Developed different versions of each module.
Evaluated combinations of different versions of the modules.
[Figure: two panels, Obs/Exp and IQR, plotted for module combinations 1 to 7 and the final version]
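The two plotted quantities suggest that each module combination was scored by how close its observed/expected ratios were to 1 and by how tightly they were spread. The slide does not say how the values were summarized, so the sketch below assumes the median of observed/expected and its interquartile range (IQR); the function name is illustrative.

```python
import statistics
from typing import Sequence, Tuple


def score_combination(
    observed: Sequence[float], expected: Sequence[float]
) -> Tuple[float, float]:
    """Return (median Obs/Exp, IQR of Obs/Exp) for one module combination.

    A median near 1 indicates little bias; a small IQR indicates
    consistent ratios across peptides.
    """
    ratios = [o / e for o, e in zip(observed, expected)]
    q1, _, q3 = statistics.quantiles(ratios, n=4)
    return statistics.median(ratios), q3 - q1
```

Comparing these two numbers across combinations gives a simple way to pick a final version: prefer the combination whose median is closest to 1 among those with the smallest spread.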

6 protein mix evaluations
[Figure: evaluation results; ~7 outliers per run not shown]

5 protein mix evaluations
[Figure: evaluation results; ~3 outliers per run not shown]

Concentration dependence
[Figure: median observed ratio vs. expected ratio, both axes from 0 to 30]

SILAC introduction
SILAC: Stable Isotope Labeling by Amino Acids in Cell Culture.
Peptides are labeled with isotopes.
[Diagram: cells grown in light isotope medium and heavy isotope medium are mixed x:y -> protein isolation -> LC/MS; plot of intensity vs. mass/charge for a single retention time slice]

Preliminary SILAC evaluations
[Figure: two panels; ~1 outlier per run excluded in one, ~6 outliers per run excluded in the other]

Future directions: better data filtering
The user inputs a pair of matched features.
If the features are mismatched, the ratio is meaningless.
There is potential to predict when features are mismatched.
[Figure: two intensity vs. retention time panels]

Future directions: better data filtering
Report a match confidence for every ratio.
Possible diagnostic variables:
Confidence of identifications
Difference in retention time
Difference in maximum intensity

Future directions: peptides to proteins
Combine data from peptides to estimate the quantity of their mutual parent protein.
[Diagram: Sample 1 contains the protein at x 50 and Sample 2 at x 10, for an expected ratio of 5:1. Digestion yields peptides, each at 5:1. The algorithm outputs 5.2:1, 4.9:1, and 5.1:1. Estimation gives a mean ratio of 5.07:1.]
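The estimation step in the diagram is an arithmetic mean of the peptide-level ratios. A minimal sketch, using the three ratios from the slide; the function name is illustrative, and other summaries (such as a median) would be equally plausible choices that the slides do not discuss.

```python
import statistics
from typing import Sequence


def protein_ratio(peptide_ratios: Sequence[float]) -> float:
    """Estimate the parent protein's ratio as the mean of its peptide ratios."""
    return statistics.fmean(peptide_ratios)


# Peptide ratios from the slide: 5.2:1, 4.9:1, 5.1:1
estimate = protein_ratio([5.2, 4.9, 5.1])
print(f"{estimate:.2f}:1")  # prints 5.07:1
```

Averaging over several peptides reduces the effect of error in any single peptide measurement, which is why the estimate (5.07:1) lands closer to the expected 5:1 than the most extreme individual ratio.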

Future directions: peptides to proteins
Preliminary results
[Figure: observed/expected ratio on a logarithmic axis marked 1/10, 1, 10]

Acknowledgements
Faculty Mentors: Shawn Cokus, Matteo Pellegrini
Industry Sponsors: Parag Mallick, Roland Luethy
Everyone at the Spielberg Center
IPAM
Thank you! Jake, Aru, Anne, Melanie