Mass spectrometry has been widely used in biology since the late 1950s. However, it really came into its own in the late 1980s, once methods were developed to allow the analysis of large, intact molecules (bigger than 1,000 Daltons). Two soft ionization techniques, electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI), led to a huge jump in popularity, as did the development of much more compact mass spectrometers (bench top rather than whole laboratory).

Mass spectrometry depends on the ability to turn the analyte of interest into individual, intact, charged molecules in the gas phase. These are released into a low-pressure region (a vacuum of 10^-3 to 10^-10 torr) where they can be manipulated by electrostatic and/or magnetic fields and separated. These fields act only on charged molecules; neutral molecules cannot be manipulated and are lost from the system.

Here the spectrum of a small molecule, caffeine, shows essentially one main peak and a few peaks at higher mass but much lower intensity.

Isotope distributions in nature:

Carbon: C12 (98.9%), C13 (1.1%), C14 (small)
Hydrogen: H1 (99.98%), deuterium (0.015%), tritium (small)
Oxygen: O16 (99.8%), O17 (0.04%), O18 (0.2%)
Sulphur: S32 (95.0%), S33 (0.8%), S34 (4.2%)

These atoms are common in biological systems. Since the heavy isotopes are rare, this is not significant for most small molecules. Once one starts to look at biological molecules like peptides, the mass changes can be significant. For example, insulin has a mass of nearly 6,000, and thus a shift of around 1% by each of carbon, oxygen, hydrogen, etc. can spread the observed masses, starting from the lightest molecule (all C12, H1, O16, etc.) and moving towards heavier species containing C13, H2, O18, etc., over a range of 20-30 mass units. In contrast, some elements such as bromine have almost equally distributed isotopes (Br79 50.7% and Br81 49.3%), which gives rise to spectra with all peaks appearing as doublets. The effect of the isotope distribution on the shape of the spectrum (sometimes called the isotope envelope) becomes much more pronounced when analysing larger biomolecules. Here the various masses are shown for the peptide hormone glucagon.
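The growth of the isotope envelope with molecular size can be illustrated with a minimal sketch. It considers carbon only and assumes each carbon atom is independently C13 with probability 0.011 (the abundance listed above). The function name is hypothetical, and the carbon counts (8 for caffeine, C8H10N4O2; 153 for glucagon, C153H225N43O49S) are supplied for illustration. Contributions from H, N, O and S are ignored, so this is a simplification rather than a full isotope calculation.

```python
from math import comb

P_C13 = 0.011  # natural abundance of carbon-13, as listed above


def c13_distribution(n_carbons, max_heavy=5):
    """Probability of k carbon-13 atoms (k = 0..max_heavy) among n_carbons.

    Assumption: carbon is the only element considered (binomial model).
    """
    return [
        comb(n_carbons, k) * P_C13**k * (1 - P_C13) ** (n_carbons - k)
        for k in range(max_heavy + 1)
    ]


# Carbon counts are illustrative inputs taken from the molecular formulas.
for name, n in [("caffeine", 8), ("glucagon", 153)]:
    print(name, [round(p, 3) for p in c13_distribution(n)])
```

Under this model the all-C12 species dominates for caffeine, whereas for glucagon the species carrying one C13 is already more abundant than the all-C12 species.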

Here is a spectrum of pure substance P, a peptide. Since a mass spectrometer always measures m/z, the mass-to-charge ratio, more than one peak can be found for a single substance. Here we see two main peaks, the doubly and singly charged ions.

To find out what you are looking at, i.e. whether an ion is singly, doubly, etc. charged, one looks at the details of the isotope distribution. Since isotopes are always one mass unit apart, if the peaks are one m/z unit apart the ion is singly charged, since the mass difference (1 mass unit) divided by the charge (1+) is 1.

If the peaks are 0.5 m/z units apart, the ion is doubly charged. Remember that the instrument measures m/z: the mass difference between isotopes is 1 unit, so if the charge is 2 the observed spacing is 1/2 = 0.5.
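The relationship between charge state, m/z and isotope spacing can be sketched as follows. This is a minimal illustration that assumes ions are formed by adding protons. The function names are hypothetical, and the neutral mass of 1346.7 is an illustrative value roughly corresponding to substance P, not a measured one.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(neutral_mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    return (neutral_mass + charge * PROTON) / charge


def charge_from_spacing(spacing):
    """Charge state inferred from the m/z spacing of adjacent isotope peaks."""
    return round(1.0 / spacing)


mass = 1346.7  # illustrative neutral mass (assumption), roughly substance P
for z in (1, 2):
    print(z, round(mz(mass, z), 2))

# Spacing of 1.0 -> singly charged, spacing of 0.5 -> doubly charged.
print(charge_from_spacing(1.0), charge_from_spacing(0.5))
```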

An electric field accelerates the ions to a high speed. They are then directed into a magnetic field, which applies a force to each ion perpendicular to the plane defined by the particle's direction of travel and the magnetic field lines. This force deflects the ions (makes them curve instead of traveling in a straight line) to varying degrees depending on their mass-to-charge ratio. Lighter ions are deflected more than heavier ions of the same charge. This follows from Newton's second law of motion: for a given force, the acceleration of a particle is inversely proportional to its mass. The detector measures the deflection of each resulting ion beam. From this measurement, the mass-to-charge ratios of all the ions produced in the source can be determined.
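The dependence of the deflection on mass-to-charge ratio can be made concrete with a short sketch. It assumes an idealized instrument in which an ion starts at rest, is accelerated through a potential V (so z·e·V = m·v²/2), and then moves on a circle of radius r = m·v/(z·e·B) in a uniform field B. The function name and the example values (3000 V, 0.5 T) are arbitrary choices for illustration.

```python
from math import sqrt

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs
DALTON = 1.66053906660e-27  # one dalton in kilograms


def radius_m(mass_da, charge, accel_volts, b_tesla):
    """Radius (m) of the circular ion path in a uniform magnetic field.

    Idealized model (assumption): the ion starts at rest and gains
    kinetic energy z*e*V before entering the field.
    """
    m = mass_da * DALTON
    q = charge * E_CHARGE
    v = sqrt(2 * q * accel_volts / m)  # speed after acceleration
    return m * v / (q * b_tesla)       # balance of magnetic and centripetal force


# Illustrative settings: 3000 V acceleration, 0.5 T field.
for mass in (100, 400):
    print(mass, round(radius_m(mass, 1, 3000.0, 0.5), 4))
```

A smaller radius means a tighter curve, i.e. a larger deflection. In this model the radius scales with the square root of m/z, so an ion of mass 200 carrying two charges follows the same path as an ion of mass 100 carrying one.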

All of these mass spectrometers have many things in common. They possess an ion source that produces ions, an analyzer that sorts them in some way by their masses, and a detector that measures the relative intensities of the different masses. The underlying principle of all mass spectrometers is that the paths of gas-phase ions in electric and magnetic fields depend on their mass-to-charge ratios, which the analyzer uses to distinguish the ions from one another.

The simplest type of mass spectrometer involves a single mass separation stage. The ions that are passed from the source into the mass analyzer give a simple readout of the intact molecular ions (assuming the ionization method is soft enough).

As interest grew in analyzing the structure of molecules, more complex mass spectrometers were developed with two mass separation stages. The first stage allows the selection of a single molecular species by creating a narrow mass window that filters away all others. The isolated molecule can then be broken into smaller components by a variety of techniques, and the resultant fragment ions can be analysed in the second mass separation stage. This is called tandem mass spectrometry or MS/MS, since it was originally carried out using two mass spectrometers joined together in tandem.

Fragmentation of gas-phase ions is essential to tandem mass spectrometry and occurs between the different stages of mass analysis. Many methods are used to fragment the ions; they can result in different types of fragmentation and thus give different information about the structure and composition of the molecule. There are a number of different tandem MS experiments, each with its own applications and each offering its own information. An instrument equipped for tandem MS can still be used to run MS experiments.

Tandem MS can be done in either time or space. Tandem MS in space involves the physical separation of the instrument components (QqQ or QTOF); tandem MS in time involves the use of an ion trap.

Post-source fragmentation is what is most often used in a tandem mass spectrometry experiment. Energy can be added to the ions, which are usually already vibrationally excited, through post-source collisions with neutral atoms or molecules, the absorption of radiation, or the transfer or capture of an electron by a multiply charged ion.

Collision-induced dissociation (CID), also called collisionally activated dissociation (CAD), is a mechanism for fragmenting molecular ions in the gas phase. The molecular ions are usually accelerated by an electrical potential to high kinetic energy in the vacuum of the mass spectrometer and then allowed to collide with neutral gas molecules (often helium, nitrogen or argon). In the collision some of the kinetic energy is converted into internal energy, which results in bond breakage and the fragmentation of the molecular ion into smaller fragments. These fragment ions can then be analyzed by a mass spectrometer. In peptide analysis, CID cleaves randomly along the peptide backbone, producing b and y ions (see later section).
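The b and y ions produced by backbone cleavage can be enumerated with a short sketch. It assumes singly charged fragments, unmodified residues and monoisotopic masses. The function name is hypothetical and the peptide "PEPTIDE" is a made-up example sequence.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def b_y_ions(peptide):
    """Singly charged b and y ion m/z values for every backbone cleavage.

    Assumptions: no modifications, charge 1+, monoisotopic masses.
    """
    masses = [RESIDUE[aa] for aa in peptide]
    n = len(peptide)
    # b ions contain the N-terminal residues; y ions the C-terminal ones.
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[i:]) + WATER + PROTON for i in range(1, n)]
    return b, y[::-1]  # y is returned in the order y1, y2, ...


b, y = b_y_ions("PEPTIDE")  # made-up example peptide
for i, (bi, yi) in enumerate(zip(b, y), start=1):
    print(f"b{i} {bi:9.4f}   y{i} {yi:9.4f}")
```

Each b ion and its complementary y ion together account for the whole peptide, which is why both series carry sequence information.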

Two types of MS/MS experiments can be carried out, depending on the instrument type being used. The first approach developed was tandem in space, in which the parent molecule of interest is fragmented in one part of the instrument before being moved to a second part for the analysis of the daughter (fragment) ions.

The alternative to tandem in space is the type of experiment carried out in an ion trap: tandem in time. Here the isolation of the parent molecule and the analysis of the daughter ions produced by fragmentation occur in the same part of the instrument. The processes are merely separated in time: the parent isolation occurs first, then the fragmentation, and finally the daughter analysis, all in the same part of the trap.

Genomics began with the goal of sequencing entire genomes. To accomplish this task, two different sequencing approaches were developed. These methods can be thought of in the following way: imagine that you have the complete works of an author, written in a language that you studied in school but never became fluent in. Moreover, the books are in such bad shape that if you open them, they disintegrate. You have two alternatives. You can remove one page at a time, preserve it and decipher it. Or you can open all the books at once and then pick up the fragments of paper and use the words on them to figure out how they fit together.

The page-by-page approach to sequencing the human genome was used by the public genome-sequencing consortium. This group first figured out how all the pages fit together and then deciphered all the words on each page. Finally, it assembled the pages back together to produce the whole genome. The advantage of this approach is that it is very precise. The disadvantage is that it takes a long time.

The biotechnology company Celera used the other method, called whole genome shotgun sequencing, in its competing effort to sequence the human genome. This method is equivalent to figuring out what's written on all the fragments of paper from all of the volumes and then figuring out how they piece together. Doing this effectively requires starting with several copies of each volume so that overlaps among the fragments can be found. The number of original copies is referred to as coverage. Producing a high-quality sequence by this method usually requires eight- to tenfold coverage. The disadvantage of this method is that you rarely get the whole sequence to line up. The advantage is that the portion of the sequence that does line up is acquired much more rapidly than via the page-by-page method.

Proteins can be identified in simple mixtures by digesting them with an enzyme and then measuring the masses of the peptides formed. The set of masses is called the peptide fingerprint. A database is made containing all the proteins encoded in the species' genome, and the masses of all the peptides that a given enzyme would produce from each protein are calculated. Thus each protein has a theoretical peptide fingerprint. The experimental fingerprint is then compared to all the theoretical fingerprints and the best match is calculated. This should be the correct identity of the unknown protein.

Fingerprints are generated by using specific proteases. These cut after known amino acids, so one can predict theoretically which peptides will be formed. Trypsin is the most commonly used protease in proteomics studies since it cuts after arginine and lysine and generates peptides that are on average around 12 amino acids long. This is ideal for ESI MS/MS analysis.
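The predictability of tryptic cleavage can be shown with a minimal in silico digest. The function name and the input sequence are made up for illustration. The sketch assumes complete digestion (no missed cleavages) and deliberately ignores the usual exception that trypsin tends not to cut when the following residue is proline.

```python
def tryptic_digest(protein):
    """Split a protein sequence after every K or R.

    Simplifications (assumptions): no missed cleavages, and the
    proline exception of real trypsin is not modeled.
    """
    peptides, current = [], ""
    for aa in protein:
        current += aa
        if aa in "KR":  # cleavage site: cut after lysine or arginine
            peptides.append(current)
            current = ""
    if current:  # C-terminal peptide without a trailing K/R
        peptides.append(current)
    return peptides


# Hypothetical sequence, used only to show the cleavage pattern.
print(tryptic_digest("MAGKLTRPEVKSSR"))
```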

A specific enzyme, here trypsin, is used to cut the protein into peptides. Trypsin cuts after arginine (R) and lysine (K), and the masses of the peptides can then be calculated. The experimentally determined masses are then searched against the theoretical masses in the fingerprint database to find the best match between the two sets of masses. The result of the search is returned as a list of matches ranked according to the probability that each match did not occur at random.
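The comparison of an experimental fingerprint with theoretical fingerprints can be sketched as below. This is a deliberately simple stand-in: it ranks proteins by the number of experimental masses matched within a tolerance, whereas real search engines use probability-based scoring. The function names, the 0.2 Da tolerance, and all masses and protein names are hypothetical.

```python
def count_matches(experimental, theoretical, tol=0.2):
    """Number of experimental masses within tol (Da) of any theoretical mass."""
    return sum(
        any(abs(e - t) <= tol for t in theoretical) for e in experimental
    )


def rank_proteins(experimental, database, tol=0.2):
    """Rank proteins by how many experimental peptide masses they explain.

    Assumption: a plain match count replaces probability-based scoring.
    """
    scores = {
        name: count_matches(experimental, masses, tol)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical theoretical fingerprints (peptide masses in Da).
database = {
    "protein_A": [842.5, 1045.6, 1479.8, 2211.1],
    "protein_B": [927.5, 1045.6, 1638.9],
}
experimental = [842.4, 1479.9, 2211.0, 1300.0]  # made-up measurements
print(rank_proteins(experimental, database))
```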

Here the output of a database search using the popular Mascot program is shown. A graph is shown to aid visualization. The green box marks the region where a hit is not significant, i.e. where the probability that the match is a random event exceeds the significance threshold, usually 0.05. The red bars outside the box indicate proteins that are likely hits.

The first time a peptide match to a query (one spectrum) appears in the report, it is shown in bold face. Whenever the top-ranking peptide match appears, it is shown in red. This means that protein hits with peptide matches that are both bold and red are the most likely assignments. These hits represent the highest-scoring protein that contains one or more top-ranking peptide matches.

The concept of shotgun proteomics is shown above. Instead of separating the proteins, the entire cell extract is digested with proteases and the resulting complex peptide mixture is then separated. The peptides are eluted from the final separation step, usually reversed-phase chromatography, directly into the mass spectrometer, where they are automatically subjected to MS/MS analysis. The peptides are identified in a similar way to how proteins are identified. At any moment perhaps 10 peptides are entering the mass spectrometer. The instrument automatically picks the most intense one, isolates it (throwing away the other 9 peptides) and then smashes it into pieces. The mass of the peptide is used to search the database for all peptides with the same mass. The fragmentation spectra of all these candidate peptides are then calculated and compared to the experimentally observed fragments. The best-matching peptide sequence is then selected.

Here an automatic RP-HPLC-MS/MS run is shown. The mass spectrometer first accumulates a normal MS scan and finds the 10 most intense peaks. It uses a mass window of around 10 to avoid picking all the isotopes in an intense peak envelope. The mass spectrometer then sequentially performs MS/MS on each of the ten peaks and returns to MS mode. The ten peaks are then placed in an exclusion list, which tells the mass spectrometer to ignore these masses for the next 5 minutes to ensure that they have all eluted and are not repeatedly analysed. The next ten most intense peaks are then determined and scheduled for MS/MS.
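The acquisition cycle described above can be sketched as a selection function with an exclusion list. This is an illustration under stated assumptions, not instrument control code: the function name and data layout are hypothetical, the mass window is interpreted as a plus-or-minus distance in m/z, and the exclusion time defaults to 300 seconds (the 5 minutes mentioned above).

```python
def select_precursors(peaks, excluded, now, top_n=10,
                      window=10.0, exclusion_s=300.0):
    """Pick up to top_n precursor m/z values from one MS scan.

    peaks: list of (mz, intensity); excluded: dict mz -> expiry time (s).
    Assumption: a peak is skipped if it lies within `window` of an
    already chosen peak (same isotope envelope) or of an excluded m/z.
    """
    active = [m for m, expiry in excluded.items() if expiry > now]
    chosen = []
    # Walk through the peaks from most to least intense.
    for mz, _ in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == top_n:
            break
        if any(abs(mz - other) < window for other in chosen + active):
            continue
        chosen.append(mz)
    # Newly fragmented precursors are ignored until the exclusion expires.
    for mz in chosen:
        excluded[mz] = now + exclusion_s
    return chosen


# Made-up scan: 500.5 belongs to the isotope envelope of 500.0.
scan = [(500.0, 100), (500.5, 80), (700.0, 50), (900.0, 10)]
exclusion_list = {}
print(select_precursors(scan, exclusion_list, now=0.0, top_n=2))
print(select_precursors(scan, exclusion_list, now=60.0, top_n=2))
```

In the second call the two precursors from the first scan are still excluded, so the next most intense remaining peak is chosen instead.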

The experimental data are generated by the automatic accumulation of MS/MS spectra of tryptic peptides from the multi-dimensional peptide separation. This produces a list of intact peptide masses, each with a list of its fragment masses. In a manner analogous to protein fingerprinting, a theoretical in silico list is generated containing the masses of all the tryptic peptides predicted for a specific genome, together with their predicted fragment ions. In a first pass, a list of the 1000 theoretical peptides whose intact masses best match is generated for each experimental parent mass. Then a cross-correlation analysis is done between the experimental MS/MS spectrum and every theoretical spectrum. The cross-correlation indicates which is the best-matching spectrum, and again the probability of the match not occurring at random is calculated.
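A cross-correlation between an experimental and a theoretical fragment spectrum can be sketched as follows. This is a simplified illustration, not the scoring function of any particular search engine. The function names, the unit-width binning, the binary peak intensities and the offset range of 75 bins used to estimate the background are all assumptions.

```python
def bin_spectrum(peaks, size=2000):
    """Binary unit-width m/z bins: 1.0 where a fragment peak is present."""
    bins = [0.0] * size
    for mz in peaks:
        idx = int(round(mz))
        if 0 <= idx < size:
            bins[idx] = 1.0
    return bins


def shifted_dot(a, b, shift):
    """Dot product of a with b displaced by `shift` bins."""
    n = len(a)
    lo, hi = max(0, -shift), min(n, n - shift)
    return sum(a[i] * b[i + shift] for i in range(lo, hi))


def xcorr(experimental, theoretical, max_shift=75):
    """Correlation at zero offset minus the mean correlation at other offsets.

    Assumption: the offsets -max_shift..max_shift (excluding 0) serve
    as an estimate of the random background.
    """
    a, b = bin_spectrum(experimental), bin_spectrum(theoretical)
    background = [
        shifted_dot(a, b, s)
        for s in range(-max_shift, max_shift + 1)
        if s != 0
    ]
    return shifted_dot(a, b, 0) - sum(background) / len(background)


# Made-up fragment lists: one candidate matches, the other does not.
observed = [98.06, 148.06, 227.10, 263.09, 324.16]
candidates = {
    "matching": [98.06, 148.06, 227.10, 263.09, 324.16],
    "unrelated": [120.0, 250.0, 400.0, 510.0],
}
for name, fragments in candidates.items():
    print(name, round(xcorr(observed, fragments), 3))
```

Subtracting the background rewards peaks that line up exactly and penalizes spectra that only overlap by chance at arbitrary offsets.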

This shows the MASCOT output for such a search. The green area shows insignificant matches and the red boxes indicate significant protein identifications.

Nominal values (important): we work in a discrete space with a unit of 1 m/z.

In bioinformatics, the Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences. A BLAST search enables a researcher to compare a query sequence with a library or database of sequences, and to identify library sequences that resemble the query sequence above a certain threshold. BLAST searches for high-scoring sequence alignments between the query sequence and sequences in the database using a heuristic approach that approximates the Smith-Waterman algorithm. The exhaustive Smith-Waterman approach is too slow for searching large genomic databases such as GenBank. Therefore, the BLAST algorithm uses a heuristic approach that is less accurate than Smith-Waterman but over 50 times faster. The speed and relatively good accuracy of BLAST are the key technical innovations of the BLAST programs.

The BLAST algorithm can be conceptually divided into three stages. In the first stage, BLAST searches for exact matches of a small fixed length W between the query and sequences in the database. For example, given the sequences AGTTAC and ACTTAG and a word length W = 3, BLAST would identify the matching substring TTA that is common to both sequences. These exact matches are known as seeds. By default, W = 11 is used for nucleotide seeds. In the second stage, BLAST tries to extend the match in both directions, starting at the seed. The ungapped alignment process extends the initial seed match of length W in each direction in an attempt to boost the alignment score. If a high-scoring ungapped alignment is found, the database sequence passes on to the third stage.
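The first two stages can be sketched in a few lines. This is a toy illustration and not the BLAST implementation. The function names are hypothetical, and the scoring (+1 match, -1 mismatch) and the rule of stopping once the score falls more than `drop` below the best score seen are simplifying assumptions.

```python
def find_seeds(query, subject, w=3):
    """All (query_pos, subject_pos) pairs where a word of length w matches."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def extend_ungapped(query, subject, i, j, w=3, match=1, mismatch=-1, drop=2):
    """Extend a seed in both directions without gaps.

    Returns (best_score, query_start, query_end). The scoring values
    and the drop-off rule are assumptions made for this sketch.
    """
    score = best = w * match
    start, end = i, i + w  # best alignment so far, in query coordinates
    # Extend to the right of the seed.
    qi, sj = i + w, j + w
    while qi < len(query) and sj < len(subject) and best - score <= drop:
        score += match if query[qi] == subject[sj] else mismatch
        qi, sj = qi + 1, sj + 1
        if score > best:
            best, end = score, qi
    # Extend to the left, starting again from the best score.
    score = best
    qi, sj = i - 1, j - 1
    while qi >= 0 and sj >= 0 and best - score <= drop:
        score += match if query[qi] == subject[sj] else mismatch
        if score > best:
            best, start = score, qi
        qi, sj = qi - 1, sj - 1
    return best, start, end


# The example from the text: the only shared 3-letter word is TTA.
seeds = find_seeds("AGTTAC", "ACTTAG", w=3)
print(seeds)
for qi, sj in seeds:
    print(extend_ungapped("AGTTAC", "ACTTAG", qi, sj))
```

Indexing the words of one sequence in a dictionary is what makes the seed search fast: each word of the query is looked up directly instead of being compared against every position.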
