BST 226 Statistical Methods for Bioinformatics David M. Rocke. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 1

Similar documents
Fundamentals of Mass Spectrometry. Fundamentals of Mass Spectrometry. Learning Objective. Proteomics

(Refer Slide Time 00:09) (Refer Slide Time 00:13)

LC-MS Based Metabolomics

TANDEM MASS SPECTROSCOPY

Atomic masses. Atomic masses of elements. Atomic masses of isotopes. Nominal and exact atomic masses. Example: CO, N 2 ja C 2 H 4

CHROMATOGRAPHY AND MASS SPECTROMETER

Lecture 8: Mass Spectrometry

for the Novice Mass Spectrometry (^>, John Greaves and John Roboz yc**' CRC Press J Taylor & Francis Group Boca Raton London New York

Lecture 8: Mass Spectrometry

Mass spectrometry has been used a lot in biology since the late 1950 s. However it really came into play in the late 1980 s once methods were

Computational Methods for Mass Spectrometry Proteomics

Chemistry 311: Topic 3 - Mass Spectrometry

Chemistry Instrumental Analysis Lecture 34. Chem 4631

MASS SPECTROMETRY. Topics

Qualitative Organic Analysis CH 351 Mass Spectrometry

Molecular weight of polymers. Molecular weight of polymers. Molecular weight of polymers. Molecular weight of polymers. H i

LECTURE-13. Peptide Mass Fingerprinting HANDOUT. Mass spectrometry is an indispensable tool for qualitative and quantitative analysis of

Analysis of Polar Metabolites using Mass Spectrometry

MASS ANALYSER. Mass analysers - separate the ions according to their mass-to-charge ratio. sample. Vacuum pumps

Biological Mass Spectrometry

Choosing the metabolomics platform

1. Prepare the MALDI sample plate by spotting an angiotensin standard and the test sample(s).

Macromolecular Chemistry

Chemistry Instrumental Analysis Lecture 37. Chem 4631

LIQUID CHROMATOGRAPHY-MASS SPECTROMETRY (LC/MS) Presented by: Dr. T. Nageswara Rao M.Pharm PhD KTPC

Mass Spectrometry and Proteomics - Lecture 2 - Matthias Trost Newcastle University

This is the total charge on an ion divided by the elementary charge (e).

Chapter 14. Molar Mass Distribution.

Mass Spectrometry. What is Mass Spectrometry?

Proteomics. November 13, 2007

Mass Spectrometry. General Principles

Lecture 15: Introduction to mass spectrometry-i

BIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher Summer Systems Biology Exp. Methods

Introduction to Mass Spectrometry

Molecular Mass Spectrometry

Analytical Technologies in Biotechnology Prof. Dr. Ashwani K. Sharma Department of Biotechnology Indian Institute of Technology, Roorkee

20.2 Ion Sources. ions electrospray uses evaporation of a charged liquid stream to transfer high molecular mass compounds into the gas phase as MH n

Mass spectrometry gas phase transfer and instrumentation

BIOINF 4399B Computational Proteomics and Metabolomics

Nanoporous GaN-Ag Composite Materials Prepared by Metal-Assisted Electroless Etching

Cryodetectors in Life Science Mass Spectrometry. Damian Twerenbold

Study of Non-Covalent Complexes by ESI-MS. By Quinn Tays

MS Goals and Applications. MS Goals and Applications

MS Goals and Applications. MS Goals and Applications

Mass Spectrometry. Hyphenated Techniques GC-MS LC-MS and MS-MS

Instrumental Analysis. Mass Spectrometry. Lecturer:! Somsak Sirichai

Mass Spectrometry Course

LECTURE-11. Hybrid MS Configurations HANDOUT. As discussed in our previous lecture, mass spectrometry is by far the most versatile

Mass spectrometry: forming ions, to identifying proteins and their modifications Stephen Barnes, PhD

1. The range of frequencies that a measurement is sensitive to is called the frequency

Ionization Techniques Part IV

Harris: Quantitative Chemical Analysis, Eight Edition

Molecular Mass Spectrometry

Mass spectrometry: forming ions, to identifying proteins and their modifications Stephen Barnes, PhD

CEE 772 Lecture #27 12/10/2014. CEE 772: Instrumental Methods in Environmental Analysis

Mass Analyzers. mass measurement accuracy/reproducibility. % of ions allowed through the analyzer. Highest m/z that can be analyzed

CEE 772: Instrumental Methods in Environmental Analysis

Mass Analyzers. Ion Trap, FTICR, Orbitrap. CU- Boulder CHEM 5181: Mass Spectrometry & Chromatography. Prof. Jose-Luis Jimenez

Welcome to Organic Chemistry II

15.04.jpg. Mass spectrometry. Electron impact Mass spectrometry

12. Structure Determination: Mass Spectrometry and Infrared Spectroscopy

Introduction to Mass Spectrometry (MS)

Ionization Methods in Mass Spectrometry at the SCS Mass Spectrometry Laboratory

M M e M M H M M H. Ion Sources

Types of Analyzers: Quadrupole: mass filter -part1

Mass Spectrometry (MS)

Overview. Introduction. André Schreiber AB SCIEX Concord, Ontario (Canada)

Protein analysis using mass spectrometry

MS-based proteomics to investigate proteins and their modifications

Tandem MS = MS / MS. ESI-MS give information on the mass of a molecule but none on the structure

Mass Spectrometry. A truly interdisciplinary and versatile analytical method

Other Methods for Generating Ions 1. MALDI matrix assisted laser desorption ionization MS 2. Spray ionization techniques 3. Fast atom bombardment 4.

Isotopic-Labeling and Mass Spectrometry-Based Quantitative Proteomics

BENG 183 Trey Ideker. Protein Sequencing

Mass spectrometry of proteins, peptides and other analytes: principles and principal methods. Matt Renfrow January 11, 2008

Where are we? Check-In

Mass spectrometry and elemental analysis

Workflow concept. Data goes through the workflow. A Node contains an operation An edge represents data flow The results are brought together in tables

2. Separate the ions based on their mass to charge (m/e) ratio. 3. Measure the relative abundance of the ions that are produced

Protein Quantitation II: Multiple Reaction Monitoring. Kelly Ruggles New York University

Mass Spectrometry for Chemists and Biochemists

MS/MS .LQGVRI0606([SHULPHQWV

Mass Spectrometry (MS)

Università degli Studi di Bari CHIMICA ANALITICA STRUMENTALE

Proudly serving laboratories worldwide since 1979 CALL for Refurbished & Certified Lab Equipment

Welcome! Course 7: Concepts for LC-MS

Protein Structure Analysis and Verification. Course S Basics for Biosystems of the Cell exercise work. Maija Nevala, BIO, 67485U 16.1.

Overview - MS Proteomics in One Slide. MS masses of peptides. MS/MS fragments of a peptide. Results! Match to sequence database

Tryptic Digest of Bovine Serum Albumin

What is Tandem Mass Spectrometry? (MS/MS)

WADA Technical Document TD2003IDCR

Traditional Herbal Medicine Structural Elucidation using SYNAPT HDMS

NIH Public Access Author Manuscript Anal Chim Acta. Author manuscript; available in PMC 2010 August 26.

Chapter 5. Complexation of Tholins by 18-crown-6:

i. This is the best evidence for the fact that electrons in an atom surround the nucleus in certain allowed energy levels or orbitals ii.

Protein Quantitation II: Multiple Reaction Monitoring. Kelly Ruggles New York University

4. How can fragmentation be useful in identifying compounds? Permits identification of branching not observed in soft ionization.

High-Throughput Protein Quantitation Using Multiple Reaction Monitoring

Propose a structure for an alcohol, C4H10O, that has the following

Fereshteh Zandkarimi, Samanthi Wickramasekara, Jeff Morre, Jan F. Stevens and Claudia S. Maier

Transcription:

BST 226 Statistical Methods for Bioinformatics David M. Rocke January 22, 2014 BST 226 Statistical Methods for Bioinformatics 1

Mass Spectrometry Mass spectrometry (mass spec, MS) comprises a set of instrumental methods that can identify compounds by molecular weight and can potentially quantify by using the peak height or peak area. If the substance to be analyzed is not already a gas, it needs to be vaporized and also ionized so that the molecules are charged, most commonly by elimination of an electron. The behavior of the ion can be measured in terms of the mass-to-charge ratio (m/z) because a 100 Dalton + molecule behaves in a magnetic field like a 200 Dalton ++ molecule. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 2

Physics One method is to detect the amount of deflection as the ions traverse an evacuated tube in a magnetic field. The amount of deflection at the end of the tube is smaller when m/z is larger. Varying the magnetic field yields a spectrum of total ion current at each field strength = at each m/z Time-of-flight MS is used for larger molecules. The ions are accelerated at the beginning of the tube, and the velocity larger for smaller m/z, so the time of flight is larger. The spectrum is collected over time and the x-axis converted to m/z. Varieties of ion trap MS circulate the ions in a magnetic field and generate a signal that can be deconvolved with a Fourier transform. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 3

January 22, 2014 BST 226 Statistical Methods for Bioinformatics 4

January 22, 2014 BST 226 Statistical Methods for Bioinformatics 5

MALDI Matrix Assisted Laser Desorption Ionization Sample is embedded in matrix on a slide Laser energy absorbed by matrix, ionizes and vaporizes matrix and sample. Can fragment compounds Can doubly or triply ionize compound Look for peak at half or a third of the expected m/z Results of different shots can be very different multiple shots for each sample January 22, 2014 BST 226 Statistical Methods for Bioinformatics 6

January 22, 2014 BST 226 Statistical Methods for Bioinformatics 7

Electrospray Ionization Electrospray ionization uses electricity to disperse a liquid High voltage is applied to a liquid supplied through an emitter (usually a glass or metallic capillary). This leads to the formation of small and highly charged liquid droplets, which are radially dispersed due to Coulomb repulsion. This is often used between a separation stage (liquid chromatography) and the mass spec analyzer. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 8

Separation Technologies Various chemical methods can be used to produce a sample that contains mostly the class of analyte of interest. These can be small molecule organic compounds, lipids, oxy-lipids, sugars, proteins, etc. Gas or liquid chromatography will separate the sample by various chemical properties and then aliquots will be serially analyzed by mass spec. This can help differentiate molecules that have the same weight because they elute at different times. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 9

Metabolomics Mass spectrometry (e.g., LC/ToF-MS can detect and measure relative amounts of small molecules much more easily than with proteins Lipids, saccharides, and others For example we can speciate 500 lipids including diand tri-glycerides species For example, we can measure over a hundred compounds in the arachidonic acid pathway including prostaglandins, COX1, COX2, and LOX products, and CyP450 pathways eicosanoids. Can measure enzymes, substrate, and product January 22, 2014 BST 226 Statistical Methods for Bioinformatics 10

Proteomics Proteins can be very large molecules, up to several thousand amino acids or hundreds of thousands of Daltons. These must be broken up into much smaller peptides, usually chemically with a protease, before they can possibly be ionized. These peptides are then usually further fragmented in a second chamber and the spectrum of fragment sizes recorded. This is called tandem mass spec or MS/MS January 22, 2014 BST 226 Statistical Methods for Bioinformatics 11

January 22, 2014 BST 226 Statistical Methods for Bioinformatics 12

Processing Spectra Baseline estimation Peak identification Mass calibration Data transformation Normalizing across spectra Peak quantitation Identification of isotope and shoulder peaks FTICRMS R package January 22, 2014 BST 226 Statistical Methods for Bioinformatics 13

Baseline Correction In regions of the spectrum where no analyte is present, the signal should fluctuate around a baseline, which should be set to zero In peak regions, the estimated baseline forms the base for peak heights or area. For some technologies, the baseline tends to be relatively flat, but for others, such as NMR and FT-ICR MS, the baseline is curved or wavy. It is important to get the baseline right to find peaks and accurately measure them. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 14

Peak Identification In some cases, a peak is just a point on the spectrum where the points to the left and right have lower magnitudes. But this can result in false multiple peaks from jagged signal. For FT-ICR MS, the peaks look quadratic on the log scale which allows accurate identification. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 15

January 22, 2014 BST 226 Statistical Methods for Bioinformatics 16

Mass Calibration This is supposedly done by in-machine software using known large peaks In principle, the mass accuracy of FTICRMS at 1000 Daltons should be 0.001 Daltons or better In practice, as shown on the previous slide, the same compound can be as much as 0.1 Daltons apart Because the theoretical mass accuracy of FT-ICR is so high, this mass calibration process can be done very effectively. The accuracy of ToF MS and NMR is much worse. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 17

Data Transformation Data from MS and most other technologies needs to be transformed, usually to a log scale to make analysis effective. Care needs to be taken at the low end. Logs of zero and negative numbers are not defined Signal fluctuating around a zero baseline will often be negative. Shifted log (add a constant), generalized log, and other methods can be used. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 18

Statistical Analysis Use only peaks that are present in a minimum number of samples, impute peaks for samples in which peaks are not detected Appropriate statistical analysis per peak depending on the design (e.g., one-way ANOVA, two-way ANOVA, linear regression). Correct for multiple comparisons using Benjamini Hochberg False Discovery Rate control methods Determine signatures by a variety of methods: logistic regression, PAM, SVM January 22, 2014 BST 226 Statistical Methods for Bioinformatics 19

Proteomics Quantitative proteomics by MS/MS is much more difficult since we never actually see the signal from a whole protein (in contrast to ELISA or Luminex where we do). The process of analysis is to take each peptide whose mass is measured in stage 1 and assign it to a protein, if possible, by the fragment spectrum in stage 2 For each protein (and for humans we have a kind of a list in advance) we count the unique peptides that map to that protein and use that as the score for the protein. This can then be analyzed with (for example) overdispersed Poisson regression. January 22, 2014 BST 226 Statistical Methods for Bioinformatics 20