De Novo Peptide Sequencing: Informatics and Pattern Recognition applied to Proteomics
- Gilbert King
Transcription
1 De Novo Peptide Sequencing: Informatics and Pattern Recognition applied to Proteomics
John R. Rose
Computer Science and Engineering, University of South Carolina
2 Overview
- Background
- Information Theoretic Scoring Function
- Test Data Set
- Comparison with Existing Methods
- Conclusions
- Future Work
3 Background
- Analogy: the genome is the machine code; the proteome is the execution of that code.
- Protein identification is important:
  - for drug discovery research
  - for the identification of microbes in environmental samples
- Approaches using tandem mass spectrometry data:
  - Database searching
  - De novo sequencing
  - Tagging
4 Tandem MS Data
- A peptide is ionized and its peptide bonds are fragmented.
- Fragment ions form peaks in the spectrum corresponding to their mass-to-charge ratio (m/z).
[Figure: example tandem mass spectrum; x-axis m/z, y-axis intensity (a.u.)]
5 Tandem MS Data
- Fragment ions include a, b, c, x, y, and z ions.
- De novo sequencing focuses on y and b ions.
- y ions contain the carboxyl terminus.
- b ions contain the amino terminus.
6 Tandem MS Data
- A good quality spectrum consists of a ladder of peaks of the y-ions and a ladder of peaks of the b-ions.
- Example (peptide FGLSLVR):
  b-ions    y-ions
  F         GLSLVR
  FG        LSLVR
  FGL       SLVR
  FGLS      LVR
  FGLSL     VR
  FGLSLV    R
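The two ladders can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the residue masses are standard monoisotopic values, and the function name `ladders`, the singly charged assumption, and the constants `WATER` and `PROTON` are choices made here for the example.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O
PROTON = 1.00728   # mass of the charge-carrying proton


def ladders(peptide):
    """Return (b_ions, y_ions) as lists of (fragment, singly charged m/z)."""
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        prefix, suffix = peptide[:i], peptide[i:]
        # b ion: N-terminal fragment plus one proton.
        b_mz = sum(RESIDUE_MASS[a] for a in prefix) + PROTON
        # y ion: C-terminal fragment plus water plus one proton.
        y_mz = sum(RESIDUE_MASS[a] for a in suffix) + WATER + PROTON
        b_ions.append((prefix, b_mz))
        y_ions.append((suffix, y_mz))
    return b_ions, y_ions


b_ions, y_ions = ladders("FGLSLVR")
for (b_frag, b_mz), (y_frag, y_mz) in zip(b_ions, y_ions):
    print(f"{b_frag:<7}{b_mz:10.4f}   {y_frag:<7}{y_mz:10.4f}")
```

Each row pairs a b ion with its complementary y ion, so the two m/z values in a row always add up to the same total.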
7 Approaches to peptide identification Frank et al. JPR
8 De Novo Sequencing
- Data: tandem MS spectrum
- Goal: find the corresponding peptide
- General approach:
  - Identify y and/or b ions
  - Propose candidate peptides
  - Score each candidate
  - Return highest ranking peptides
- Two key issues:
  - Model for candidate peptide generation
  - Scoring function to evaluate candidates
9 Candidate Peptide Generation
- The peptide sequence can be derived from the mass differences of adjacent peaks in each of the two ladders.
- Example (peptide IYEVEGMR):
  b-ions     y-ions
  I          YEVEGMR
  IY         EVEGMR
  IYE        VEGMR
  IYEV       EGMR
  IYEVE      GMR
  IYEVEG     MR
  IYEVEGM    R
- Complicating factors:
  - Missing peaks
  - Posttranslational modifications
  - Many-to-one equivalences, e.g., AG, GA, K, Q, and E are similar in mass
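Reading a sequence off a ladder can be illustrated as follows. This is a minimal sketch under idealized assumptions (a complete, noise-free, singly charged b-ion ladder); `b_ion_peaks`, `read_ladder`, and the 0.02 Da tolerance are hypothetical names and values chosen for the example, and the masses are standard monoisotopic residue masses.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728


def b_ion_peaks(peptide):
    """Simulate singly charged b-ion m/z values b_1 .. b_(n-1) for a known peptide."""
    peaks, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        peaks.append(total + PROTON)
    return peaks


def read_ladder(peaks, tol=0.02):
    """Translate gaps between adjacent b-ion peaks into residues.

    A gap that matches several residues is reported as 'I/L'; a gap that
    matches no single residue (for example a missing peak) is reported as '?'.
    """
    residues = []
    previous = PROTON  # an empty prefix would sit at the proton mass
    for peak in sorted(peaks):
        gap = peak - previous
        matches = sorted(aa for aa, m in RESIDUE_MASS.items() if abs(m - gap) <= tol)
        residues.append("/".join(matches) if matches else "?")
        previous = peak
    return residues


print(read_ladder(b_ion_peaks("IYEVEGMR")))
# ['I/L', 'Y', 'E', 'V', 'E', 'G', 'M']
```

The first residue comes out as `I/L` because isoleucine and leucine have identical mass. The final R is not recovered from the b ladder alone; it would need the parent mass or the y ladder.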
10 Actual example of labeled y and b ion peaks
[Figure: spectrum with labeled y- and b-ion peaks]
11 The spectrum graph Frank et al. JPR
12 Construction of the NC-spectrum Graph (Chen et al., JCB 2001)
- Create a pair of nodes, N_j and C_j, for each ion I_j.
- Create two auxiliary nodes, N_0 and C_0, to represent the zero mass and the parent mass, respectively.
- Let V = {N_0, N_1, ..., N_k, C_0, C_1, ..., C_k}.
- Each node x is assigned a coordinate cord(x) according to the total mass of its amino acids, that is,
\[
\mathrm{cord}(x) =
\begin{cases}
0 & x = N_0 \\
W - 18 & x = C_0 \\
w_j - 1 & x = N_j \\
W - w_j + 1 & x = C_j
\end{cases}
\]
  where W is the parent mass and w_j is the mass of ion I_j.
[Figure: nodes N_0, C_2, C_1, N_1, N_2, C_0 placed along the mass axis]
13 Construction of the NC-spectrum Graph
[Figure: spectrum (x-axis Mass / Charge, y-axis Abundance) with the cord(x) definition repeated; nodes N_0 and C_0 placed]
14 Construction of the NC-spectrum Graph
[Figure: same spectrum; nodes N_0, C_1, N_1, C_0 placed]
15 Construction of the NC-spectrum Graph
[Figure: same spectrum; nodes N_0, C_2, C_1, N_1, N_2, C_0 placed]
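16 Construction of the NC-spectrum Graph
[Figure: graph on nodes N_0, C_2, C_1, N_1, N_2, C_0 with edges labeled by the residues whose mass equals the coordinate difference: Mass(S), Mass(W), Mass(R), and Mass(S+W)]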
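The node placement and edge labeling can be sketched as below, using the coordinate rule from the construction slides (N_0 at 0, C_0 at W - 18, N_j at w_j - 1, C_j at W - w_j + 1). The sketch simplifies the construction: it only adds edges for a single residue, whereas the figure also shows a two-residue edge (S+W). The function names `nc_nodes` and `nc_edges`, the 0.1 Da tolerance, and the SWR example peptide are assumptions made for illustration.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def nc_nodes(ion_masses, parent_mass):
    """Place N_j and C_j for every ion, plus the auxiliary nodes N_0 and C_0."""
    nodes = {"N0": 0.0, "C0": parent_mass - 18.0}
    for j, w in enumerate(ion_masses, start=1):
        nodes[f"N{j}"] = w - 1.0                 # ion j read as a b ion
        nodes[f"C{j}"] = parent_mass - w + 1.0   # ion j read as a y ion
    return nodes


def nc_edges(nodes, tol=0.1):
    """Connect u -> v when cord(v) - cord(u) matches one residue mass.

    Simplification for this sketch: only single-residue gaps are considered.
    """
    edges = []
    ordered = sorted(nodes, key=nodes.get)
    for a, u in enumerate(ordered):
        for v in ordered[a + 1:]:
            if u[1:] == v[1:] and u[1:] != "0":
                continue  # N_j and C_j come from the same ion
            gap = nodes[v] - nodes[u]
            for aa, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol:
                    edges.append((u, v, aa))
    return edges


# Hypothetical example: peptide SWR observed through its b_1 and y_1 ions.
m = RESIDUE_MASS
parent = m["S"] + m["W"] + m["R"] + 18.01056
ions = [m["S"] + 1.00728, m["R"] + 18.01056 + 1.00728]
for edge in nc_edges(nc_nodes(ions, parent)):
    print(edge)
```

Because each peak is entered under both interpretations, the graph also contains spurious edges; the true sequence corresponds to the path N0 -S-> N1 -W-> C2 -R-> C0.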
17 Construction of the NC-spectrum Graph
[Figure: graph on nodes N_0, N_2, C_1, N_1, C_2, C_0]
- Each path from N_0 to C_0 represents a possible sequence for the peptide.
- A feasible path is a path from N_0 to C_0 that goes through exactly one node for each pair (either N_j or C_j).
18 Construction of the NC-spectrum Graph
[Figure: graph on nodes N_0, N_2, C_1, N_1, C_2, C_0 with one path highlighted]
- This is not a feasible path: it misses ion I_2.
19 Construction of the NC-spectrum Graph
[Figure: graph on nodes N_0, N_2, C_1, N_1, C_2, C_0 with one path highlighted]
- This is a feasible path.
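The feasibility condition is easy to state as a check on a given path. The sketch below assumes nodes are named by strings such as `"N0"` or `"C2"`; the function name `is_feasible` and that naming scheme are illustrative choices, not part of the original slides.

```python
def is_feasible(path, k):
    """Check that a path runs from N0 to C0 and uses exactly one of N_j / C_j
    for every ion j = 1..k.

    path: list of node names such as ["N0", "N1", "C2", "C0"]
    k:    number of ions in the spectrum
    """
    if len(path) < 2 or path[0] != "N0" or path[-1] != "C0":
        return False
    # Ion indices of the interior nodes; each of 1..k must occur exactly once.
    used = sorted(int(name[1:]) for name in path[1:-1])
    return used == list(range(1, k + 1))


print(is_feasible(["N0", "N1", "C2", "C0"], k=2))        # True
print(is_feasible(["N0", "N1", "C0"], k=2))              # False: misses ion I_2
print(is_feasible(["N0", "N1", "C1", "C2", "C0"], k=2))  # False: uses both N_1 and C_1
```

The check deliberately ignores whether consecutive nodes are joined by an edge; it only tests the one-node-per-pair property described above.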
20 Problem Reformulation
- Input: an NC-spectrum graph G.
- Output: a feasible path from N_0 to C_0.
- Difficulty: a longest path does not always go through exactly one of each pair of nodes.
- This is an NP-hard problem if the graph is a general directed graph.
21 Renaming Nodes
- Rename the nodes from left to right as X_0, ..., X_k, Y_k, ..., Y_0.
- X_i and Y_i form a complementary pair of nodes for ion i.
[Figure: nodes N_0, N_2, C_1, N_1, C_2, C_0 relabeled as X_0, X_1, X_2, Y_2, Y_1, Y_0]
22 Problem Reformulation
- Let M(i, j) be a two-dimensional matrix with 0 ≤ i, j ≤ k.
- Let M(i, j) = 1 if there exists a path L from X_0 to X_i and a path R from Y_j to Y_0, such that L and R together contain exactly one of X_p and Y_p for each p in [0, max{i, j}].
[Figure: nodes X_0, X_1, X_2, ..., X_i, ..., Y_j, ..., Y_i, ..., Y_2, Y_1, Y_0 with path L on the left and path R on the right]
23 Problem Reformulation
- There is a feasible path if and only if
  - for some i and k, there is an edge e from X_i to Y_k and M(i, k) = 1, or
  - for some k and j, there is an edge e from X_k to Y_j and M(k, j) = 1.
[Figure: path L from X_0 to X_i, edge e to Y_k, path R to Y_0; and path L from X_0 to X_k, edge e to Y_j, path R to Y_0]
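One way to fill the matrix M and apply the final test is sketched below. The recurrence is derived here from the definition of M(i, j) given above and is not copied from Chen et al., so treat it as an assumption. Nodes are represented as tuples `("X", i)` and `("Y", i)`, edges as a set of node pairs, and the complementary-pair condition is applied to ions 1..k. The name `has_feasible_path` is hypothetical.

```python
def has_feasible_path(k, edges):
    """Decide whether a feasible path from X_0 to Y_0 exists.

    k:     number of ions (pairs X_p, Y_p for p = 1..k)
    edges: set of (u, v) pairs, nodes written as ("X", i) or ("Y", i)
    """
    def e(u, v):
        return (u, v) in edges

    if k == 0:
        return e(("X", 0), ("Y", 0))

    # M[i][j] is True when a left path X_0..X_i and a right path Y_j..Y_0
    # together use exactly one node of every pair 1..max(i, j).
    M = [[False] * (k + 1) for _ in range(k + 1)]
    M[0][0] = True
    for m in range(1, k + 1):          # m = max(i, j) of the entries being filled
        for i in range(m):
            if i < m - 1:
                # Pair m-1 is not on the other side, so it must be the direct
                # neighbour of node m on the same side.
                M[i][m] = M[i][m - 1] and e(("Y", m), ("Y", m - 1))
                M[m][i] = M[m - 1][i] and e(("X", m - 1), ("X", m))
            else:
                # Pair m-1 sits on the other side; node m links to an earlier
                # node q on its own side (q = 0 is the path's end point).
                prev = range(max(i, 1))
                M[i][m] = any(M[i][q] and e(("Y", m), ("Y", q)) for q in prev)
                M[m][i] = any(M[q][i] and e(("X", q), ("X", m)) for q in prev)

    # The left and right halves are joined by one edge that touches pair k.
    return any(
        (M[i][k] and e(("X", i), ("Y", k))) or (M[k][i] and e(("X", k), ("Y", i)))
        for i in range(k)
    )


# Hypothetical graph with k = 2: X_0 -> X_1 -> Y_2 -> Y_0 uses X_1 and Y_2.
example = {(("X", 0), ("X", 1)), (("X", 1), ("Y", 2)), (("Y", 2), ("Y", 0))}
print(has_feasible_path(2, example))  # True
```

The table has (k+1)^2 entries, and each entry needs at most k edge lookups, so the sketch runs in polynomial time even though the general problem is NP-hard.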
24 Candidate Peptide Generation
- Complicating factors:
  - Posttranslational modifications
  - Many-to-one equivalences, e.g., AG, GA, K, Q, and E are similar in mass
  - Noise peaks
  - Missing peaks
25 Candidate Peptide Generation: Missing Peaks
- Now a many-to-many combinatorial problem.
- Example: ATEEQLK
  - If the b_4 ion is missing, then b_3 represents ATE and b_5 represents ATEEQ.
  - The mass difference between them, which corresponds to EQ, is then unresolved.
  - Recall that AG, GA, K, Q, and E are similar in mass.
  - Thus EQ, QE, AGQ, GAQ, AGE, GAE, ... have similar mass.
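The combinatorial blow-up from a single missing peak can be made concrete by enumerating every short residue string whose mass fits the unresolved gap. This is an illustrative sketch only: `explain_gap`, the limit of three residues, and the tolerance values are assumptions, and the masses are standard monoisotopic residue masses.

```python
from itertools import combinations_with_replacement, permutations

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def explain_gap(gap, max_len=3, tol=0.05):
    """List all residue strings of length <= max_len whose mass is within tol of gap."""
    hits = set()
    for n in range(1, max_len + 1):
        for combo in combinations_with_replacement(sorted(RESIDUE_MASS), n):
            if abs(sum(RESIDUE_MASS[a] for a in combo) - gap) <= tol:
                # Every ordering of the same residues has the same mass.
                hits.update("".join(p) for p in permutations(combo))
    return sorted(hits)


# Gap left between b_3 (ATE) and b_5 (ATEEQ) when b_4 is missing.
gap = RESIDUE_MASS["E"] + RESIDUE_MASS["Q"]
candidates = explain_gap(gap)
print(len(candidates), "candidates, including", [c for c in candidates if c in ("EQ", "QE", "AGE", "GAE")])
```

With a tight tolerance the list already contains EQ, QE, and the orderings of A, G, and E; loosening the tolerance toward 1 Da also admits the near-equivalents such as AGQ and GAQ.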
26 Candidate Peptide Evaluation
- Model for candidate generation
- Traditional focus on the fragmentation model:
  - Increasing fragmentation model sophistication
  - Better posttranslational modification models
  - No model of peptide amino acid content
- QuasiNovo approach:
  - Unsophisticated fragmentation model
  - No posttranslational modification model
  - Uses information theory to model peptide amino acid content
27 Modeling Peptide Amino Acid Content
- Basic idea: examine actual proteins to characterize likely combinations of amino acids.
- Underlying hypothesis: amino acid content is not random.
- Analogy: model letter combinations in a language
  - examine documents in that language
  - compile profiles of letter combinations
  - predict missing letters from partial data
- Motivation:
  - Ability to distinguish between mass-equivalent combinations
  - Ability to deal with missing peaks
28 Amino Acid Distribution Data
- Tabulation of amino acid distributions:
  - Let <a_1 a_2 ... a_n> be a contiguous sequence of n amino acids.
  - There are n amino acids: <a_1>, <a_2>, ..., <a_n>.
  - There are n-1 ordered amino acid pairs: <a_1 a_2>, <a_2 a_3>, ..., <a_{n-1} a_n>.
  - etc.
- QuasiNovo has been evaluated with 3-, 4-, 5-, and 6-tuples.
- Tuple frequencies are then normalized.
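The tabulation step amounts to sliding a window over each peptide and counting. The slides do not say how the frequencies are normalized; the sketch below assumes the simplest choice, dividing each count by the total number of tuples of that length. The function name `tuple_frequencies` is hypothetical.

```python
from collections import Counter


def tuple_frequencies(peptides, n):
    """Count contiguous n-tuples over all peptides and normalize to frequencies.

    Assumption: normalization divides by the total n-tuple count.
    """
    counts = Counter()
    for peptide in peptides:
        # A peptide of length m contributes m - n + 1 overlapping n-tuples.
        for i in range(len(peptide) - n + 1):
            counts[peptide[i:i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: c / total for t, c in counts.items()}


pairs = tuple_frequencies(["FGLSLVR"], 2)
print(sorted(pairs))  # ['FG', 'GL', 'LS', 'LV', 'SL', 'VR']
```

In practice the input would be the tryptic peptides generated from whole genomes rather than a single example string.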
29 Amino Acid Distribution Data
Three amino acid profiles used:
1. Gammaproteobacteria: 206 complete genomes; 23,882,564 tryptic peptides
2. Actinobacteria: 58 complete genomes; 7,380,927 tryptic peptides generated
3. Mammalia: 4 complete genomes (Bovine, Human, Mouse, Rat); 9,835,585 tryptic peptides generated
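30 QuasiNovo's Use of Tuple-Profiles
- Score candidate peptides:
\[
\mathrm{score}(\mathrm{FGLSLVR}) = p(\mathrm{SLVR})\, p(\mathrm{L} \mid \mathrm{SLVR})\, p(\mathrm{G} \mid \mathrm{LSLV})\, p(\mathrm{F} \mid \mathrm{GLSL})
\]
- Discard poor scoring candidates.
- Handle missing peaks: find the set of a_i that maximize
\[
P(a_i \mid a_{i-4}\, a_{i-3}\, a_{i-2}\, a_{i-1})
\]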
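The factorization score(FGLSLVR) = p(SLVR) p(L | SLVR) p(G | LSLV) p(F | GLSL) can be sketched with 5-tuple counts. How QuasiNovo actually estimates these probabilities is not stated on the slide; the code below assumes relative frequencies with add-one smoothing and works in log space. The names `train` and `log_score`, the smoothing constant `alpha`, and the toy training peptides are all assumptions, and candidates are assumed to have at least four residues.

```python
import math
from collections import Counter


def train(peptides, n=5):
    """Collect the counts needed for the tuple model from training peptides."""
    full, context, head = Counter(), Counter(), Counter()
    for pep in peptides:
        for i in range(len(pep) - n + 1):
            t = pep[i:i + n]
            full[t] += 1          # residue followed by its (n-1)-residue context
            context[t[1:]] += 1   # the context alone, as seen inside n-tuples
        for i in range(len(pep) - n + 2):
            head[pep[i:i + n - 1]] += 1   # all (n-1)-tuples, for the first factor
    return full, context, head


def log_score(peptide, model, n=5, alpha=1.0, alphabet=20):
    """Log of p(last n-1 residues) times the product of p(a_i | next n-1 residues)."""
    full, context, head = model
    tail = peptide[-(n - 1):]
    score = math.log(
        (head[tail] + alpha) / (sum(head.values()) + alpha * alphabet ** (n - 1))
    )
    # Walk from the C-terminus toward the N-terminus, one residue at a time.
    for i in range(len(peptide) - n, -1, -1):
        t = peptide[i:i + n]
        score += math.log((full[t] + alpha) / (context[t[1:]] + alpha * alphabet))
    return score


model = train(["FGLSLVR", "AGLSLVR", "FGLSLVK"])   # toy data, for illustration only
print(log_score("FGLSLVR", model) > log_score("FGLSQVR", model))  # True
```

Candidates whose tuples were common in the training peptides receive higher scores, which is what lets the model prefer one mass-equivalent candidate over another.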
31 Test Data Set
- 280 spectra of peptides selected by Frank & Pevzner (2005)
  - molecular mass of up to 1400 Da
  - peptides with 7-16 amino acids (average length of 10.5)
  - source: ISB protein mixture data set and Open Proteomics Database
- Data set used to compare PepNovo with Sherenga, Peaks, and Lutefisk
- Later used to compare NovoHMM with PepNovo, Sherenga, Peaks, and Lutefisk
32 Results
- The contenders:
  - PepNovo v1.03
  - PepNovo+
  - NovoHMM
  - QuasiNovo
  - QuasiNovo Reranking
33 Results
[Figure: % correct versus number of incorrect residues for PepNovo+, PepNovo v1.03, NovoHMM, QuasiNovo, and QuasiNovo Reranking]
Results for the set of 280 MS/MS test spectra comparing PepNovo+, PepNovo, and NovoHMM with a QuasiNovo reranking and QuasiNovo.
34 Results
[Figure: % correct versus number of incorrect residues for PepNovo+, PepNovo v1.03, NovoHMM, and QuasiNovo with the Gammaproteobacteria, Actinobacteria, and Mammalia profiles]
Results for the set of 76 MS/MS test spectra for E. coli peptides comparing PepNovo+, PepNovo, and NovoHMM with three QuasiNovo scoring functions based on amino acid distributions in Gammaproteobacteria, Actinobacteria, and Mammalia.
35 Results
[Table: Comparison of Terminal Pair and Overall Accuracy, by algorithm (PepNovo+, NovoHMM, QuasiNovo Reranking) for: terminal ion pair, b2-ion, y2-ion, complete peptide]
36 Conclusions and Future Work
- The QuasiNovo peptide model
  - predicts peptide amino acid content
  - has limited understanding of fragmentation
  - outperforms PepNovo+ and NovoHMM
- QuasiNovo reranking
  - reranks PepNovo+ and NovoHMM results
  - is a proof of concept for combining peptide and fragmentation models
  - shows the best overall performance
- Future: combine the QuasiNovo amino acid model with a sophisticated fragmentation model
37 Acknowledgements
- Rose Lab: Jimmy Cleveland, Achraf Elallali, Amadeo Bellotti
- Fox Lab: Alvin Fox, Karen Fox, Jennifer Intelicato-Young
- Support: funding from the Alfred P. Sloan Foundation
- Experiments were conducted on a 128-core shared memory computer funded by NSF (CNS ).
38 Gammaproteobacteria
[Figure: two panels of cumulative results from 174 spectra at x = 3, ..., 12, where x = n is the number of correctly predicted amino acids; legend: QuasiNovo MM Reranking NovoHMM PepNovo+]
Note: a predicted amino acid is correct if it appears within 2.5 Da of its position in the actual peptide.
39 Actinobacteria
[Figure: two panels of cumulative results from 27 spectra at x = 3, ..., 12, where x = n is the number of correctly predicted amino acids; legend: QuasiNovo MM Reranking NovoHMM PepNovo+]
Note: a predicted amino acid is correct if it appears within 2.5 Da of its position in the actual peptide.
40 Results: Mammalia
[Figure: two panels of cumulative results from 79 spectra at x = 3, ..., 12, where x = n is the number of correctly predicted amino acids; legend: QuasiNovo MM Reranking NovoHMM PepNovo+]
Note: a predicted amino acid is correct if it appears within 2.5 Da of its position in the actual peptide.
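The correctness criterion in the notes above (a predicted amino acid counts as correct if it appears within 2.5 Da of its position in the actual peptide) can be sketched as follows. The position of a residue is taken here to be the summed mass of the residues before it, which is one reasonable reading of the note rather than a confirmed detail of the evaluation. The name `count_correct` is hypothetical, and the masses are standard monoisotopic residue masses.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def positions(peptide):
    """Pair every residue with the total mass of the residues preceding it."""
    out, offset = [], 0.0
    for aa in peptide:
        out.append((aa, offset))
        offset += RESIDUE_MASS[aa]
    return out


def count_correct(predicted, actual, tol=2.5):
    """Count predicted residues that match an actual residue within tol Da."""
    truth = positions(actual)
    return sum(
        any(aa == t_aa and abs(offset - t_offset) <= tol for t_aa, t_offset in truth)
        for aa, offset in positions(predicted)
    )


print(count_correct("FGLSLVR", "FGLSLVR"))  # 7
print(count_correct("GFLSLVR", "FGLSLVR"))  # 5: the swapped F and G are misplaced
```

Measuring position by mass rather than by index means that an error early in the peptide does not penalize later residues, as long as the prefix masses line up again.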
41 EF-Tu Protein
- DISTILLER/MASCOT identification: AIDKPFLLPIEDVFSISGR
- QuasiNovo identification: DSDKPFMMPVEDVFSITGR
- Score(AIDKPFLLPIEDVFSISGR) = e-38
- Score(DSDKPFMMPVEDVFSITGR) = e-36
- QuasiNovo result supported by microbiological data:
  - Gram stain
  - physiological tests
  - visual comparison of spectra of environmental isolates versus known S. aureus and interpretation of Distiller/Mascot sequence assignment
- Note: Distiller results based on 18 peaks vs 12 peaks for QuasiNovo
- Peptide displays loss of 3 water molecules
De Novo Peptide Identification Via Mixed-Integer Linear Optimization And Tandem Mass Spectrometry
17 th European Symposium on Computer Aided Process Engineering ESCAPE17 V. Plesu and P.S. Agachi (Editors) 2007 Elsevier B.V. All rights reserved. 1 De Novo Peptide Identification Via Mixed-Integer Linear
More informationCSE182-L8. Mass Spectrometry
CSE182-L8 Mass Spectrometry Project Notes Implement a few tools for proteomics C1:11/2/04 Answer MS questions to get started, select project partner, select a project. C2:11/15/04 (All but web-team) Plan
More informationMS-MS Analysis Programs
MS-MS Analysis Programs Basic Process Genome - Gives AA sequences of proteins Use this to predict spectra Compare data to prediction Determine degree of correctness Make assignment Did we see the protein?
More informationDe Novo Peptide Sequencing
De Novo Peptide Sequencing Outline A simple de novo sequencing algorithm PTM Other ion types Mass segment error De Novo Peptide Sequencing b 1 b 2 b 3 b 4 b 5 b 6 b 7 b 8 A NELLLNVK AN ELLLNVK ANE LLLNVK
More informationQuasiNovo: Algorithms for De Novo Peptide Sequencing
University of South Carolina Scholar Commons Theses and Dissertations 2013 QuasiNovo: Algorithms for De Novo Peptide Sequencing James Paul Cleveland University of South Carolina Follow this and additional
More informationA Dynamic Programming Approach to De Novo Peptide Sequencing via Tandem Mass Spectrometry
A Dynamic Programming Approach to De Novo Peptide Sequencing via Tandem Mass Spectrometry Ting Chen Department of Genetics arvard Medical School Boston, MA 02115, USA Ming-Yang Kao Department of Computer
More informationDe novo Protein Sequencing by Combining Top-Down and Bottom-Up Tandem Mass Spectra. Xiaowen Liu
De novo Protein Sequencing by Combining Top-Down and Bottom-Up Tandem Mass Spectra Xiaowen Liu Department of BioHealth Informatics, Department of Computer and Information Sciences, Indiana University-Purdue
More informationTandem Mass Spectrometry: Generating function, alignment and assembly
Tandem Mass Spectrometry: Generating function, alignment and assembly With slides from Sangtae Kim and from Jones & Pevzner 2004 Determining reliability of identifications Can we use Target/Decoy to estimate
More informationProteomics. November 13, 2007
Proteomics November 13, 2007 Acknowledgement Slides presented here have been borrowed from presentations by : Dr. Mark A. Knepper (LKEM, NHLBI, NIH) Dr. Nathan Edwards (Center for Bioinformatics and Computational
More informationProtein Identification Using Tandem Mass Spectrometry. Nathan Edwards Informatics Research Applied Biosystems
Protein Identification Using Tandem Mass Spectrometry Nathan Edwards Informatics Research Applied Biosystems Outline Proteomics context Tandem mass spectrometry Peptide fragmentation Peptide identification
More informationvia Tandem Mass Spectrometry and Propositional Satisfiability De Novo Peptide Sequencing Renato Bruni University of Perugia
De Novo Peptide Sequencing via Tandem Mass Spectrometry and Propositional Satisfiability Renato Bruni bruni@diei.unipg.it or bruni@dis.uniroma1.it University of Perugia I FIMA International Conference
More informationDe novo peptide sequencing methods for tandem mass. spectra
De novo peptide sequencing methods for tandem mass spectra A Thesis Submitted to the College of Graduate Studies and Research in Partial Fulfillment of the Requirements for the degree of Doctor of Philosophy
More informationNature Methods: doi: /nmeth Supplementary Figure 1. Fragment indexing allows efficient spectra similarity comparisons.
Supplementary Figure 1 Fragment indexing allows efficient spectra similarity comparisons. The cost and efficiency of spectra similarity calculations can be approximated by the number of fragment comparisons
More informationModeling Mass Spectrometry-Based Protein Analysis
Chapter 8 Jan Eriksson and David Fenyö Abstract The success of mass spectrometry based proteomics depends on efficient methods for data analysis. These methods require a detailed understanding of the information
More informationProtein Sequencing and Identification by Mass Spectrometry
Protein Sequencing and Identification by Mass Spectrometry Tandem Mass Spectrometry De Novo Peptide Sequencing Spectrum Graph Protein Identification via Database Search Identifying Post Translationally
More informationSPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE SEQUENCING FOR HCD AND ETD SPECTRA PAIRS
SPECTRA LIBRARY ASSISTED DE NOVO PEPTIDE SEQUENCING FOR HCD AND ETD SPECTRA PAIRS 1 Yan Yan Department of Computer Science University of Western Ontario, Canada OUTLINE Background Tandem mass spectrometry
More informationEffective Strategies for Improving Peptide Identification with Tandem Mass Spectrometry
Effective Strategies for Improving Peptide Identification with Tandem Mass Spectrometry by Xi Han A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree
More informationComputational Methods for Mass Spectrometry Proteomics
Computational Methods for Mass Spectrometry Proteomics Eidhammer, Ingvar ISBN-13: 9780470512975 Table of Contents Preface. Acknowledgements. 1 Protein, Proteome, and Proteomics. 1.1 Primary goals for studying
More informationNovoHMM: A Hidden Markov Model for de Novo Peptide Sequencing
Anal. Chem. 2005, 77, 7265-7273 NovoHMM: A Hidden Markov Model for de Novo Peptide Sequencing Bernd Fischer, Volker Roth, Franz Roos, Jonas Grossmann, Sacha Baginsky, Peter Widmayer, Wilhelm Gruissem,
More informationLecture 15: Realities of Genome Assembly Protein Sequencing
Lecture 15: Realities of Genome Assembly Protein Sequencing Study Chapter 8.10-8.15 1 Euler s Theorems A graph is balanced if for every vertex the number of incoming edges equals to the number of outgoing
More informationWorkflow concept. Data goes through the workflow. A Node contains an operation An edge represents data flow The results are brought together in tables
PROTEOME DISCOVERER Workflow concept Data goes through the workflow Spectra Peptides Quantitation A Node contains an operation An edge represents data flow The results are brought together in tables Protein
More informationA New Hybrid De Novo Sequencing Method For Protein Identification
A New Hybrid De Novo Sequencing Method For Protein Identification Penghao Wang 1*, Albert Zomaya 2, Susan Wilson 1,3 1. Prince of Wales Clinical School, University of New South Wales, Kensington NSW 2052,
More informationMass Spectrometry Based De Novo Peptide Sequencing Error Correction
Mass Spectrometry Based De Novo Peptide Sequencing Error Correction by Chenyu Yao A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics
More informationTutorial 1: Setting up your Skyline document
Tutorial 1: Setting up your Skyline document Caution! For using Skyline the number formats of your computer have to be set to English (United States). Open the Control Panel Clock, Language, and Region
More informationProtein Quantitation II: Multiple Reaction Monitoring. Kelly Ruggles New York University
Protein Quantitation II: Multiple Reaction Monitoring Kelly Ruggles kelly@fenyolab.org New York University Traditional Affinity-based proteomics Use antibodies to quantify proteins Western Blot Immunohistochemistry
More informationProtein Quantitation II: Multiple Reaction Monitoring. Kelly Ruggles New York University
Protein Quantitation II: Multiple Reaction Monitoring Kelly Ruggles kelly@fenyolab.org New York University Traditional Affinity-based proteomics Use antibodies to quantify proteins Western Blot RPPA Immunohistochemistry
More informationComputational Analysis of Mass Spectrometric Data for Whole Organism Proteomic Studies
University of Tennessee, Knoxville Trace: Tennessee Research and Creative Exchange Doctoral Dissertations Graduate School 5-2006 Computational Analysis of Mass Spectrometric Data for Whole Organism Proteomic
More informationDE NOVO PEPTIDE SEQUENCING FOR MASS SPECTRA BASED ON MULTI-CHARGE STRONG TAGS
DE NOVO PEPTIDE SEQUENCING FO MASS SPECTA BASED ON MULTI-CHAGE STONG TAGS KANG NING, KET FAH CHONG, HON WAI LEONG Department of Computer Science, National University of Singapore, 3 Science Drive 2, Singapore
More informationEfficiency of Database Search for Identification of Mutated and Modified Proteins via Mass Spectrometry
Methods Efficiency of Database Search for Identification of Mutated and Modified Proteins via Mass Spectrometry Pavel A. Pevzner, 1,3 Zufar Mulyukov, 1 Vlado Dancik, 2 and Chris L Tang 2 Department of
More informationMS2DB: An Algorithmic Approach to Determine Disulfide Linkage Patterns in Proteins by Utilizing Tandem Mass Spectrometric Data
MS2DB: An Algorithmic Approach to Determine Disulfide Linkage Patterns in Proteins by Utilizing Tandem Mass Spectrometric Data Timothy Lee 1, Rahul Singh 1, Ten-Yang Yen 2, and Bruce Macher 2 1 Department
More informationNPTEL VIDEO COURSE PROTEOMICS PROF. SANJEEVA SRIVASTAVA
LECTURE-25 Quantitative proteomics: itraq and TMT TRANSCRIPT Welcome to the proteomics course. Today we will talk about quantitative proteomics and discuss about itraq and TMT techniques. The quantitative
More informationParallel Algorithms For Real-Time Peptide-Spectrum Matching
Parallel Algorithms For Real-Time Peptide-Spectrum Matching A Thesis Submitted to the College of Graduate Studies and Research in Partial Fulfillment of the Requirements for the degree of Master of Science
More informationComputationally Analyzing Mass Spectra of Hydrogen Deuterium Exchange Experiments Kevin S. Drew University of Chicago May 21, 2005
Computationally Analyzing Mass Spectra of Hydrogen Deuterium Exchange Experiments Kevin S. Drew University of Chicago May 21, 2005 1 Abstract Hydrogen deuterium exchange (HDX) using Mass Spectrometers
More informationBiological Mass Spectrometry
Biochemistry 412 Biological Mass Spectrometry February 13 th, 2007 Proteomics The study of the complete complement of proteins found in an organism Degrees of Freedom for Protein Variability Covalent Modifications
More informationMass Spectrometry and Proteomics - Lecture 5 - Matthias Trost Newcastle University
Mass Spectrometry and Proteomics - Lecture 5 - Matthias Trost Newcastle University matthias.trost@ncl.ac.uk Previously Proteomics Sample prep 144 Lecture 5 Quantitation techniques Search Algorithms Proteomics
More informationDIA-Umpire: comprehensive computational framework for data independent acquisition proteomics
DIA-Umpire: comprehensive computational framework for data independent acquisition proteomics Chih-Chiang Tsou 1,2, Dmitry Avtonomov 2, Brett Larsen 3, Monika Tucholska 3, Hyungwon Choi 4 Anne-Claude Gingras
More informationPeptide Sequence Tags for Fast Database Search in Mass-Spectrometry
Peptide Sequence Tags for Fast Database Search in Mass-Spectrometry Ari Frank,*, Stephen Tanner, Vineet Bafna, and Pavel Pevzner Department of Computer Science & Engineering, University of California,
More informationQuality Assessment of Tandem Mass Spectra Based on Cumulative Intensity Normalization
Quality Assessment of Tandem Mass Spectra Based on Cumulative Intensity Normalization Seungjin Na and Eunok Paek* Department of Mechanical and Information Engineering, University of Seoul, Seoul, Korea
More informationJiří Novák and David Hoksza
ParametrisedHausdorff HausdorffDistance Distanceas asa Non-Metric a Non-Metric Similarity Similarity Model Model for Tandem for Tandem Mass Mass Spectrometry Spectrometry Jiří Novák and David Hoksza Jiří
More informationUC San Diego UC San Diego Electronic Theses and Dissertations
UC San Diego UC San Diego Electronic Theses and Dissertations Title Algorithms for tandem mass spectrometry-based proteomics Permalink https://escholarship.org/uc/item/89f7x81r Author Frank, Ari Michael
More informationOverview - MS Proteomics in One Slide. MS masses of peptides. MS/MS fragments of a peptide. Results! Match to sequence database
Overview - MS Proteomics in One Slide Obtain protein Digest into peptides Acquire spectra in mass spectrometer MS masses of peptides MS/MS fragments of a peptide Results! Match to sequence database 2 But
More informationPROTEIN SEQUENCING AND IDENTIFICATION USING TANDEM MASS SPECTROMETRY
PROTEIN SEQUENCING AND IDENTIFICATION USING TANDEM MASS SPECTROMETRY Michael Kinter Department of Cell Biology Lerner Research Institute Cleveland Clinic Foundation Nicholas E. Sherman Department of Microbiology
More informationComprehensive support for quantitation
Comprehensive support for quantitation One of the major new features in the current release of Mascot is support for quantitation. This is still work in progress. Our goal is to support all of the popular
More informationAplicació de la proteòmica a la cerca de Biomarcadors proteics Barcelona, 08 de Juny 2010
Aplicació de la proteòmica a la cerca de Biomarcadors proteics Barcelona, 8 de Juny 21 Eliandre de Oliveira Plataforma de Proteòmica Parc Científic de Barcelona Protein Chemistry Proteomics Hypothesis-free
More informationKey questions of proteomics. Bioinformatics 2. Proteomics. Foundation of proteomics. What proteins are there? Protein digestion
s s Key questions of proteomics What proteins are there? Bioinformatics 2 Lecture 2 roteomics How much is there of each of the proteins? - Absolute quantitation - Stoichiometry What (modification/splice)
More informationAn SVM Scorer for More Sensitive and Reliable Peptide Identification via Tandem Mass Spectrometry
An SVM Scorer for More Sensitive and Reliable Peptide Identification via Tandem Mass Spectrometry Haipeng Wang, Yan Fu, Ruixiang Sun, Simin He, Rong Zeng, and Wen Gao Pacific Symposium on Biocomputing
More informationA Suboptimal Algorithm for De Novo Peptide Sequencing via Tandem Mass Spectrometry. BINGWEN LU and TING CHEN ABSTRACT
JOURNAL OF COMPUTATIONAL BIOLOGY Volume 10, Number 1, 2003 Mary Ann Liebert, Inc. Pp. 1 12 A Suboptimal Algorithm for De Novo Peptide Sequencing via Tandem Mass Spectrometry BINGWEN LU and TING CHEN ABSTRACT
More informationMass spectrometry in proteomics
I519 Introduction to Bioinformatics, Fall, 2013 Mass spectrometry in proteomics Haixu Tang School of Informatics and Computing Indiana University, Bloomington Modified from: www.bioalgorithms.info Outline
More informationQuantitative Proteomics
Quantitative Proteomics Quantitation AND Mass Spectrometry Condition A Condition B Identify and quantify differently expressed proteins resulting from a change in the environment (stimulus, disease) Lyse
More informationSpectrum-to-Spectrum Searching Using a. Proteome-wide Spectral Library
MCP Papers in Press. Published on April 30, 2011 as Manuscript M111.007666 Spectrum-to-Spectrum Searching Using a Proteome-wide Spectral Library Chia-Yu Yen, Stephane Houel, Natalie G. Ahn, and William
More informationPeptideProphet: Validation of Peptide Assignments to MS/MS Spectra. Andrew Keller
PeptideProphet: Validation of Peptide Assignments to MS/MS Spectra Andrew Keller Outline Need to validate peptide assignments to MS/MS spectra Statistical approach to validation Running PeptideProphet
More informationWas T. rex Just a Big Chicken? Computational Proteomics
Was T. rex Just a Big Chicken? Computational Proteomics Phillip Compeau and Pavel Pevzner adjusted by Jovana Kovačević Bioinformatics Algorithms: an Active Learning Approach 215 by Compeau and Pevzner.
More informationHOWTO, example workflow and data files. (Version )
HOWTO, example workflow and data files. (Version 20 09 2017) 1 Introduction: SugarQb is a collection of software tools (Nodes) which enable the automated identification of intact glycopeptides from HCD
More informationPepHMM: A Hidden Markov Model Based Scoring Function for Mass Spectrometry Database Search
PepHMM: A Hidden Markov Model Based Scoring Function for Mass Spectrometry Database Search Yunhu Wan, Austin Yang, and Ting Chen*, Department of Mathematics, Department of Pharmaceutical Sciences, and
More informationTandem mass spectra were extracted from the Xcalibur data system format. (.RAW) and charge state assignment was performed using in house software
Supplementary Methods Software Interpretation of Tandem mass spectra Tandem mass spectra were extracted from the Xcalibur data system format (.RAW) and charge state assignment was performed using in house
More informationPredicting Protein Functions and Domain Interactions from Protein Interactions
Predicting Protein Functions and Domain Interactions from Protein Interactions Fengzhu Sun, PhD Center for Computational and Experimental Genomics University of Southern California Outline High-throughput
More informationBiological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor
Biological Networks:,, and via Relative Description Length By: Tamir Tuller & Benny Chor Presented by: Noga Grebla Content of the presentation Presenting the goals of the research Reviewing basic terms
More informationChapter 4. strategies for protein quantitation Ⅱ
Proteomics Chapter 4. strategies for protein quantitation Ⅱ 1 Multiplexed proteomics Multiplexed proteomics is the use of fluorescent stains or probes with different excitation and emission spectra to
More informationMS-based proteomics to investigate proteins and their modifications
MS-based proteomics to investigate proteins and their modifications Francis Impens VIB Proteomics Core October th 217 Overview Mass spectrometry-based proteomics: general workflow Identification of protein
More informationMASS SPECTROMETRY. Topics
MASS SPECTROMETRY MALDI-TOF AND ESI-MS Topics Principle of Mass Spectrometry MALDI-TOF Determination of Mw of Proteins Structural Information by MS: Primary Sequence of a Protein 1 A. Principles Ionization:
More informationTANDEM mass spectrometry (MS/MS) is an essential and
IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL. 2, NO. 3, JULY-SEPTEMBER 2005 217 Predicting Molecular Formulas of Fragment Ions with Isotope Patterns in Tandem Mass Spectra Jingfen
More informationIntroduction to spectral alignment
SI Appendix C. Introduction to spectral alignment Due to the complexity of the anti-symmetric spectral alignment algorithm described in Appendix A, this appendix provides an extended introduction to the
More informationProperties of Average Score Distributions of SEQUEST
Research Properties of Average Score Distributions of SEQUEST THE PROBABILITY RATIO METHOD* S Salvador Martínez-Bartolomé, Pedro Navarro, Fernando Martín-Maroto, Daniel López-Ferrer **, Antonio Ramos-Fernández,
More informationLearning Score Function Parameters for Improved Spectrum Identification in Tandem Mass Spectrometry Experiments
pubs.acs.org/jpr Learning Score Function Parameters for Improved Spectrum Identification in Tandem Mass Spectrometry Experiments Marina Spivak, Michael S. Bereman, Michael J. MacCoss, and William Stafford
More informationThe Pitfalls of Peaklist Generation Software Performance on Database Searches
Proceedings of the 56th ASMS Conference on Mass Spectrometry and Allied Topics, Denver, CO, June 1-5, 2008 The Pitfalls of Peaklist Generation Software Performance on Database Searches Aenoch J. Lynn,
More informationProteome-wide label-free quantification with MaxQuant. Jürgen Cox Max Planck Institute of Biochemistry July 2011
Proteome-wide label-free quantification with MaxQuant Jürgen Cox Max Planck Institute of Biochemistry July 2011 MaxQuant MaxQuant Feature detection Data acquisition Initial Andromeda search Statistics
More informationProtein identification problem from a Bayesian pointofview