De Novo Peptide Sequencing
- Berenice Ray
- 5 years ago
Transcription
1 De Novo Peptide Sequencing
2 Outline: a simple de novo sequencing algorithm; PTM; other ion types; mass segment error.
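3 De Novo Peptide Sequencing
[Figure: b- and y-ion ladder for the peptide ANELLLNVK. Each cleavage site splits the sequence into a b-ion prefix and a y-ion suffix.]
b1 A | NELLLNVK y8
b2 AN | ELLLNVK y7
b3 ANE | LLLNVK y6
b4 ANEL | LLNVK y5
b5 ANELL | LNVK y4
b6 ANELLL | NVK y3
b7 ANELLLN | VK y2
b8 ANELLLNV | K y1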
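The ladder above can be generated mechanically. The following is a minimal sketch; the function name `ion_ladder` is an illustrative choice and does not come from the slides.

```python
def ion_ladder(peptide):
    """Return (b_index, prefix, suffix, y_index) for every cleavage site."""
    n = len(peptide)
    # Cleaving after position i yields the b_i prefix and the y_(n-i) suffix.
    return [(i, peptide[:i], peptide[i:], n - i) for i in range(1, n)]


for i, prefix, suffix, j in ion_ladder("ANELLLNVK"):
    print(f"b{i} {prefix:<8} | {suffix:>8} y{j}")
```

A peptide of length n has n - 1 cleavage sites, so the nine-residue example yields eight b/y pairs.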
4 Score Function Implementation
[Figure: a peptide $a_1 a_2 \ldots a_i a_{i+1} \ldots a_k$ with total residue mass $M$, cleaved between $a_i$ and $a_{i+1}$ into a prefix of mass $m$ and a suffix of mass $M - m$.]
If the corresponding y and/or b ions are observed for prefix mass $m$ (and therefore suffix mass $M - m$), then the peptide is likely to have a prefix of mass $m$; let $f(m) > 0$. Otherwise, $f(m) \le 0$.
Note that $f(m)$ is usually the sum of the scores of several related ion types, and that $f(m)$ can be computed without knowing the actual sequence.
5 Score for a Peptide For a sequence $P$ with prefix masses $m_1, m_2, \ldots, m_k$, the peptide score is defined as $f(S, P) = f(m_1) + f(m_2) + \cdots + f(m_k)$. De novo sequencing: given a spectrum $S$, construct the peptide $P$ that maximizes $f(S, P)$.
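As a small illustration of the definition above, the sketch below sums f over the prefix masses of a candidate peptide. It assumes nominal (integer) residue masses; the names `NOMINAL_MASS`, `prefix_masses`, `peptide_score`, and the toy function `toy_f` are hypothetical and chosen only for this example.

```python
from itertools import accumulate

# Nominal residue masses (Da) of the 20 standard amino acids.
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def prefix_masses(peptide):
    """Prefix masses m_1, ..., m_k of a peptide a_1 ... a_k."""
    return list(accumulate(NOMINAL_MASS[a] for a in peptide))


def peptide_score(peptide, f):
    """f(S, P) = f(m_1) + ... + f(m_k); the spectrum S is hidden inside f."""
    return sum(f(m) for m in prefix_masses(peptide))


def toy_f(m):
    """Toy score (an assumption): reward prefix mass 57, penalize the rest."""
    return 1.0 if m == 57 else -0.5


print(prefix_masses("GA"), peptide_score("GA", toy_f))
```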
6 De Novo Sequencing De novo sequencing: given a spectrum $S$ and a total residue mass $M$, compute a peptide $P$ such that score($S$, $P$) is maximized. When score($S$, $P$) is a sum of $f$(prefix mass) terms, there is a simple algorithm. For simplicity we use nominal masses.
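7 Simple Model
[Figure: a row of cells indexed by mass from 0 to $M$; cell $m$ holds the score $f(m)$, and a step of length $m(a)$ moves from one cell to another.]
Find a path from 0 to $M$ in which each step equals the mass of some amino acid, maximizing the total score of the cells the path visits.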
8 Algorithm Idea
$D(m)$: the maximum score that can be achieved by a partial path reaching $m$.
If $P = P'a$ is the best path for $m$, then $P'$ must be the best path for $m - m(a)$. Thus
$D(m) = \max_a D(m - m(a)) + f(m)$.
The algorithm initializes $D(0) = 0$ and all other cells to $-\infty$, then computes $D(m)$ for $m$ from 1 to $M$ by the above formula.
9 Dynamic Programming
[Figure: the table $D(m)$ over masses 0 to $M$, with a step of length $m(a)$ ending at cell $m$.]
The best sequence can be retrieved by backtracking, repeatedly computing the last amino acid $a$. Time complexity?
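The recurrence and the backtracking step can be sketched as follows. This is a minimal illustration rather than the implementation of any particular tool: it assumes nominal masses, takes the score function f as a parameter, and uses the hypothetical names `de_novo` and `toy_f`. Residues with the same nominal mass (I/L, K/Q) cannot be distinguished, so the first one in the table is reported.

```python
import math

NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def de_novo(f, M, masses=NOMINAL_MASS):
    """Maximize the sum of f(prefix mass) over all peptides of total mass M."""
    D = [-math.inf] * (M + 1)   # D[m]: best score of a partial path reaching m
    back = [None] * (M + 1)     # back[m]: last residue on that best path
    D[0] = 0.0
    for m in range(1, M + 1):
        for a, w in masses.items():
            if w <= m and D[m - w] > -math.inf:
                cand = D[m - w] + f(m)
                if cand > D[m]:
                    D[m], back[m] = cand, a
    if back[M] is None:         # no residue combination adds up to M
        return None, -math.inf
    seq, m = [], M
    while m > 0:                # backtrack: peel off the last residue each time
        seq.append(back[m])
        m -= masses[back[m]]
    return "".join(reversed(seq)), D[M]


def toy_f(m):
    """Toy score (an assumption): reward the prefix masses of GAS."""
    return 1.0 if m in {57, 128, 215} else -1.0


print(de_novo(toy_f, 215))
```

In this sketch the table fill consists of two nested loops, over the M mass cells and over the residue table, and the backtracking visits at most M cells.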
10 High Resolution Data What if the mass values are not nominal?
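11 Phosphorylation Monoisotopic mass change: addition of PO3H. [Figure: structures of phosphoserine (pS), phosphothreonine (pT), and phosphotyrosine (pY), formed from S, T, and Y.]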
12 PTM and De Novo Sequencing Variable PTMs do not cause a major slowdown for de novo sequencing algorithms. Instead of trying only the 20 regular amino acids in the maximization, the algorithm simply tries all modified amino acids too, so the time complexity increases by a constant factor. (Compare this with the exponential growth in the database search approach.) However, since the solution space is larger when many variable PTMs are allowed, the accuracy of the algorithm is reduced.
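To make the larger solution space concrete, the sketch below counts how many residue sequences add up to a given nominal mass, with and without phosphorylated residues. The nominal shift of 80 Da for PO3H (1 + 31 + 3 × 16) and the names `PHOSPHO`, `EXTENDED`, and `count_sequences` are assumptions made for this illustration.

```python
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

# Phosphorylated S, T, Y: residue mass plus a nominal 80 Da (assumed shift).
PHOSPHO = {"pS": 87 + 80, "pT": 101 + 80, "pY": 163 + 80}
EXTENDED = {**NOMINAL_MASS, **PHOSPHO}


def count_sequences(M, masses):
    """Number of residue sequences whose masses sum to exactly M."""
    N = [0] * (M + 1)
    N[0] = 1                    # the empty sequence
    for m in range(1, M + 1):
        # The last residue can be any entry of the table that still fits.
        N[m] = sum(N[m - w] for w in masses.values() if w <= m)
    return N[M]


for M in (167, 500):
    print(M, count_sequences(M, NOMINAL_MASS), count_sequences(M, EXTENDED))
```

The inner loop grows from 20 to 23 table entries, a constant factor, while the number of candidate sequences for a fixed mass can only increase.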
13 Other Fragment Ions
[Figure: a four-residue peptide backbone (side chains R1 to R4) with the N-terminal a, b, c and C-terminal x, y, z cleavage sites marked between adjacent residues.]
Between two adjacent residues there are 3 fragmentation possibilities, giving 6 fragment ion types.
Each ion type has a mass offset: a: -27, b: +1, c: +18, x: +45, y: +19, z: +2.
b and y ions are complementary: at charge one, b + y = total residue mass + 20.
The y ion is usually the most abundant.
There are also neutral-loss ions such as y-H2O.
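The offsets listed above translate directly into expected ion masses for a given prefix mass. The following sketch assumes singly charged ions and nominal masses; `ION_OFFSET`, `N_TERMINAL`, and `ion_masses` are illustrative names.

```python
# Mass offsets from the slide, for singly charged ions at nominal masses.
ION_OFFSET = {"a": -27, "b": 1, "c": 18, "x": 45, "y": 19, "z": 2}
N_TERMINAL = ("a", "b", "c")    # these ions contain the prefix


def ion_masses(m, M):
    """Expected mass of each ion type for prefix mass m and total residue mass M."""
    result = {}
    for ion, offset in ION_OFFSET.items():
        # N-terminal ions carry the prefix (m); C-terminal ions the suffix (M - m).
        fragment = m if ion in N_TERMINAL else M - m
        result[ion] = fragment + offset
    return result


# Prefix AN (mass 185) of ANELLLNVK (total nominal residue mass 994).
ions = ion_masses(185, 994)
print(ions)
print(ions["b"] + ions["y"] == 994 + 20)
```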
14 Calculating f(m) with Other Ions For example, if b and y ions are considered, then for prefix mass $m$ the corresponding ion masses are $b = m + 1$ and $y = M - m + 19$. Calculate the log-likelihood ratio for each ion type and add them up as $f(m)$.
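One way to turn this into code is sketched below. The probabilities are placeholders invented for the example (the slides give no values), the spectrum is reduced to a set of nominal peak masses, and `make_f` is a hypothetical helper name.

```python
import math


def make_f(peaks, M, p_ion=None, p_noise=0.05):
    """Build f(m) as a sum of per-ion log-likelihood ratios.

    peaks   : set of observed nominal peak masses
    p_ion   : assumed probability that a true ion of each type is observed
    p_noise : assumed probability of a random peak at any given mass
    """
    if p_ion is None:
        p_ion = {"b": 0.6, "y": 0.8}    # placeholder values, not fitted

    def f(m):
        expected = {"b": m + 1, "y": M - m + 19}
        total = 0.0
        for ion, mass in expected.items():
            if mass in peaks:
                # Observed peak: evidence for prefix mass m.
                total += math.log(p_ion[ion] / p_noise)
            else:
                # Missing peak: evidence against prefix mass m.
                total += math.log((1 - p_ion[ion]) / (1 - p_noise))
        return total

    return f


f = make_f(peaks={186, 828}, M=994)
print(f(185), f(300))
```

With these placeholder values f(m) is positive when both ions are observed and negative when neither is, matching the sign convention of the score function defined earlier.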
15 Double Count
16 Solution Pretend it does not exist? It is a rare event in real sequences, anyway. No: the algorithm is encouraged to find a peptide that reuses many significant peaks as both y and b ions, and the results it finds that way will be wrong. Heuristic solution: detect the error, then re-run the algorithm with the overlapped peaks discounted.
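The detection step of the heuristic can be sketched as follows, using the b and y offsets from the earlier slides (b = m + 1, y = M - m + 19). The function name `double_counted_peaks` and the toy masses are assumptions for illustration; how the overlapped peaks are then discounted is left open, as in the slides.

```python
def double_counted_peaks(prefix_masses, M, peaks):
    """Observed peaks that a candidate peptide explains as both a b and a y ion."""
    b_ions = {m + 1 for m in prefix_masses}
    y_ions = {M - m + 19 for m in prefix_masses}
    # A peak is reused when b(m_i) == y(m_j), i.e. m_i + m_j == M + 18.
    return sorted(b_ions & y_ions & set(peaks))


# Toy masses: 57 + 161 == 200 + 18, so both peaks are counted twice.
print(double_counted_peaks([57, 161], 200, {58, 162, 300}))
# No pair of prefix masses sums to M + 18 here, so nothing is reused.
print(double_counted_peaks([57, 100], 200, {58, 162, 300}))
```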
17 Mass Segment Error Most errors are due to incomplete ion ladders in the spectrum, so a segment of amino acids cannot be determined. However, the total mass of the segment is fixed. E.g. [242]VLSLLVESK, where 242 = N+Q, N+K, or L+E. The first two or three residues often have low confidence because of a lack of fragment ions. Most de novo sequencing software uses the precursor mass as a constraint (thus the mass of the derived sequence is usually correct).
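The ambiguity of a mass segment such as [242] can be enumerated directly. This sketch lists unordered residue pairs at nominal masses; `residue_pairs` is an illustrative name, and longer combinations are ignored for brevity.

```python
from itertools import combinations_with_replacement

NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def residue_pairs(gap, masses=NOMINAL_MASS):
    """Unordered residue pairs whose nominal masses sum to the gap."""
    return sorted(
        a + b
        for a, b in combinations_with_replacement(sorted(masses), 2)
        if masses[a] + masses[b] == gap
    )


print(residue_pairs(242))
```

Because I and L share a nominal mass, the pair I+E appears alongside L+E; each unordered pair also stands for both residue orders.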
18 Solution Match against a protein sequence database to correct some of the errors. Use machine learning to learn which combination is more frequent when the peaks are missing.
19 Automated De Novo Sequencing Many de novo sequencing programs: Sherenga (1999), Lutefisk (2001), PEAKS (2003), PepNovo (2005), Novor (2015). Two main models: spectrum graph; PEAKS.