Advances in quantitative proteomics using stable isotope tags

Similar documents
Biological Mass Spectrometry

NPTEL VIDEO COURSE PROTEOMICS PROF. SANJEEVA SRIVASTAVA

Computational Methods for Mass Spectrometry Proteomics

Reagents. Affinity Tag (Biotin) Acid Cleavage Site. Figure 1. Cleavable ICAT Reagent Structure.

Isotopic-Labeling and Mass Spectrometry-Based Quantitative Proteomics

Modeling Mass Spectrometry-Based Protein Analysis

PC235: 2008 Lecture 5: Quantitation. Arnold Falick

Proteomics. Yeast two hybrid. Proteomics - PAGE techniques. Data obtained. What is it?

PeptideProphet: Validation of Peptide Assignments to MS/MS Spectra. Andrew Keller

PROTEIN SEQUENCING AND IDENTIFICATION USING TANDEM MASS SPECTROMETRY

Identification of proteins by enzyme digestion, mass

Bioinformatics 2. Yeast two hybrid. Proteomics. Proteomics

Chapter 4. strategies for protein quantitation Ⅱ

Quantitative Proteomics

UCD Conway Institute of Biomolecular & Biomedical Research Graduate Education 2009/2010

LECTURE-13. Peptide Mass Fingerprinting HANDOUT. Mass spectrometry is an indispensable tool for qualitative and quantitative analysis of

Mass Spectrometry and Proteomics - Lecture 5 - Matthias Trost Newcastle University

The Power of LC MALDI: Identification of Proteins by LC MALDI MS/MS Using the Applied Biosystems 4700 Proteomics Analyzer with TOF/TOF Optics

BIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher Summer Systems Biology Exp. Methods

Designed for Accuracy. Innovation with Integrity. High resolution quantitative proteomics LC-MS

Aplicació de la proteòmica a la cerca de Biomarcadors proteics Barcelona, 08 de Juny 2010

Quantitation of a target protein in crude samples using targeted peptide quantification by Mass Spectrometry

Key questions of proteomics. Bioinformatics 2. Proteomics. Foundation of proteomics. What proteins are there? Protein digestion

Overview - MS Proteomics in One Slide. MS masses of peptides. MS/MS fragments of a peptide. Results! Match to sequence database

Improved 6- Plex TMT Quantification Throughput Using a Linear Ion Trap HCD MS 3 Scan Jane M. Liu, 1,2 * Michael J. Sweredoski, 2 Sonja Hess 2 *

Tandem mass spectra were extracted from the Xcalibur data system format. (.RAW) and charge state assignment was performed using in house software

MS-based proteomics to investigate proteins and their modifications

Protein Quantitation II: Multiple Reaction Monitoring. Kelly Ruggles New York University

Proteomics. Areas of Interest

Amine specific Labeling Reagents for Multiplexed Relative and Absolute Protein Quantitation

Protein Quantitation II: Multiple Reaction Monitoring. Kelly Ruggles New York University

Translational Biomarker Core

Tiffany Samaroo MB&B 452a December 8, Take Home Final. Topic 1

TANDEM MASS SPECTROSCOPY

Mass spectrometry has been used a lot in biology since the late 1950 s. However it really came into play in the late 1980 s once methods were

Guide to Peptide Quantitation. Agilent clinical research

A Software Suite for the Generation and Comparison of Peptide Arrays from Sets. of Data Collected by Liquid Chromatography-Mass Spectrometry

Proteomics: the first decade and beyond. (2003) Patterson and Aebersold Nat Genet 33 Suppl: from

Introduction to Proteomics & Bottom-up Proteomics

Bruker Daltonics. EASY-nLC. Tailored HPLC for nano-lc-ms Proteomics. Nano-HPLC. think forward

Yifei Bao. Beatrix. Manor Askenazi

Workshop: SILAC and Alternative Labeling Strategies in Quantitative Proteomics

for the Novice Mass Spectrometry (^>, John Greaves and John Roboz yc**' CRC Press J Taylor & Francis Group Boca Raton London New York

Comprehensive support for quantitation

What is Systems Biology

Analysis of Peptide MS/MS Spectra from Large-Scale Proteomics Experiments Using Spectrum Libraries

COMBINATORIAL CHEMISTRY: CURRENT APPROACH

Quan%ta%on with XPRESS. and. ASAPRa%o

Chem 250 Unit 1 Proteomics by Mass Spectrometry

Atomic masses. Atomic masses of elements. Atomic masses of isotopes. Nominal and exact atomic masses. Example: CO, N 2 ja C 2 H 4

Developing Algorithms for the Determination of Relative Abundances of Peptides from LC/MS Data

Methods for proteome analysis of obesity (Adipose tissue)

Three types of RNA polymerase in eukaryotic nuclei

High-Throughput Protein Quantitation Using Multiple Reaction Monitoring

Powerful Scan Modes of QTRAP System Technology

High-Field Orbitrap Creating new possibilities

LECTURE-11. Hybrid MS Configurations HANDOUT. As discussed in our previous lecture, mass spectrometry is by far the most versatile

CHROMATOGRAPHY AND MASS SPECTROMETER

What is the central dogma of biology?

Qualitative Proteomics (how to obtain high-confidence high-throughput protein identification!)

Quantitative analysis of the proteome

HOWTO, example workflow and data files. (Version )

Overview. Introduction. André Schreiber AB SCIEX Concord, Ontario (Canada)

PesticideScreener. Innovation with Integrity. Comprehensive Pesticide Screening and Quantitation UHR-TOF MS

Multi-residue analysis of pesticides by GC-HRMS

1. Prepare the MALDI sample plate by spotting an angiotensin standard and the test sample(s).

FORENSIC TOXICOLOGY SCREENING APPLICATION SOLUTION

Proteomics. November 13, 2007

Towards the Prediction of Protein Abundance from Tandem Mass Spectrometry Data

PeptideProphet: Validation of Peptide Assignments to MS/MS Spectra

Mass spectrometry and proteomics Steven P Gygi* and Ruedi Aebersold

WADA Technical Document TD2015IDCR

Empirical Statistical Model To Estimate the Accuracy of Peptide Identifications Made by MS/MS and Database Search

WADA Technical Document TD2003IDCR

Identification of Human Hemoglobin Protein Variants Using Electrospray Ionization-Electron Transfer Dissociation Mass Spectrometry

De novo Protein Sequencing by Combining Top-Down and Bottom-Up Tandem Mass Spectra. Xiaowen Liu

Simplified Approaches to Impurity Identification using Accurate Mass UPLC/MS

A Rapid Approach to the Confirmation of Drug Metabolites in Preclinical and Clinical Bioanalysis Studies

Structure of the α-helix

Workflow concept. Data goes through the workflow. A Node contains an operation An edge represents data flow The results are brought together in tables

DIA-Umpire: comprehensive computational framework for data independent acquisition proteomics

Protein Identification Using Tandem Mass Spectrometry. Nathan Edwards Informatics Research Applied Biosystems

Measuring Drug-to-Antibody Ratio (DAR) for Antibody-Drug Conjugates (ADCs) with UHPLC/Q-TOF

Overview. Introduction. André Schreiber 1 and Yun Yun Zou 1 1 AB SCIEX, Concord, Ontario, Canada

Mass Spectrometry. Hyphenated Techniques GC-MS LC-MS and MS-MS

Effective Strategies for Improving Peptide Identification with Tandem Mass Spectrometry

Dr. A. Peter Snyder and Dr. Rabih E. Jabbour Private Citizens June 26, 2013

Protein analysis using mass spectrometry

PosterREPRINT COMPARISON OF PEAK PARKING VERSUS AUTOMATED FRACTION ANALYSIS OF A COMPLEX PROTEIN MIXTURE. Introduction

Stephen McDonald, Mark D. Wrona, Jeff Goshawk Waters Corporation, Milford, MA, USA INTRODUCTION

WALKUP LC/MS FOR PHARMACEUTICAL R&D

Identification and Characterization of an Isolated Impurity Fraction: Analysis of an Unknown Degradant Found in Quetiapine Fumarate

Cryodetectors in Life Science Mass Spectrometry. Damian Twerenbold

Study of Non-Covalent Complexes by ESI-MS. By Quinn Tays

Agilent MassHunter Profinder: Solving the Challenge of Isotopologue Extraction for Qualitative Flux Analysis

2 The Proteome. The Proteome 15

Quantitative Proteomics

Effective desalting and concentration of in-gel digest samples with Vivapure C18 Micro spin columns prior to MALDI-TOF analysis.

Computational Analysis of Mass Spectrometric Data for Whole Organism Proteomic Studies

Improved Throughput and Reproducibility for Targeted Protein Quantification Using a New High-Performance Triple Quadrupole Mass Spectrometer

Transcription:

Advances in quantitative proteomics using stable isotope tags Mark R. Flory, Timothy J. Griffin, Daniel Martin and Ruedi Aebersold A great deal of current biological and clinical research is directed at the interpretation of the information contained in the human genome sequence in terms of the structure, function and control of biological systems and processes. Proteomics, the systematic analysis of proteins, is becoming a critical component in this endeavor because proteomic measurements are carried out directly on proteins the catalysts and effectors of essentially all biological functions. To detect changes in protein profiles that might provide important diagnostic or functional insights, proteomic analyses necessarily have to be quantitative. This article summarizes recent technological advances in quantitative proteomics. The complete genome sequence produced by the Human Genome Project (HGP) [1,2] contains, in principle, the information required for the development and the physiology of a human being. However, much of the knowledge about how the sequence information relates to physiology in health and disease is currently lacking. The genome project has also catalyzed the emergence of new approaches and technologies that attempt to generate and interpret the data required for a comprehensive understanding of biological processes. Among these, systematic and quantitative profiling technologies have been broadly applied to identify diagnostic expression patterns specific to pathological cellular states, such as cancer [3], and to provide clues to the genes and gene products that are responsible for, or correlate with, a specific state of a system.this might, in turn, reveal fundamental mechanisms of biological control that are potential therapeutic targets in the case of pathologically perturbed systems. The most mature of these profiling technologies are those that measure expression changes at the level of mature mrna products [4 6]. However, despite their widespread use, mrna-based analyses of gene expression have shortcomings.their use as tools for the comprehensive investigation of biological processes is limited by the observation that quantitative measurements at the mrna level cannot provide accurate reflections of either the absolute amount or the induced changes in abundance at the protein level [7 9]. Measurements made at the mrna level also fail to detect the post-translational modifications, cellular localizations and activities of proteins or the macromolecular interactions that cooperatively affect their function. The use of mrna profiling techniques as tools for the detection of prognostic or diagnostic patterns is limited by the frequently difficult access to the target tissue or the inability to identify the target tissue in the first place. Quantitative proteomics has recently emerged as a complementary technology to mrna profiling with the ability to characterize comprehensively the structural and functional aspects of protein products [10]. Quantitative proteomics based on stable isotope tagging A key aspect to the comprehensive characterization of protein products is the quantitative analysis of protein profiles. For this, two alternative approaches have been developed. The first and most widely used method is based on high resolution two-dimensional electrophoresis (2DE) and mass spectrometry, the second on quantitative mass spectrometry via stable isotope tagging of proteins and peptides. In the 2DE-based method, complex protein mixtures are initially separated electrophoretically and stained. Specific proteins are then selected for mass spectrometric identification based on quantitative comparison of the 2DE staining patterns of suitable experimental and control protein samples. Whereas the technique is mature and robust, several conceptual and technical considerations limit its general utility [11]. Most significantly, a study using unfractionated soluble proteins from a wholecell yeast (Saccharomyces cerevisiae) lysate demonstrated that even with maximal sample loading and extended electrophoretic separation, low-abundance proteins, which constitute nearly half the yeast proteome, were systematically excluded [12]. The most significant recent advances in quantitative proteomics have been catalyzed by quantitative mass spectrometry, the subject of this review. This method consists of the following four steps: (1) differential isotopic labeling of separate protein mixtures; (2) digestion of the Mark R. Flory Timothy J. Griffin Daniel Martin Ruedi Aebersold* Institute for Systems Biology, 1441 North 34th Street, Seattle, WA 98103-8904, USA. *e-mail: raebersold@ systemsbiology.org 0167-7799/02/$ see front matter 2002 Elsevier Science Ltd. All rights reserved. PII: S0167-7799(02)02088-7 S23

Review A TRENDS Guide to Proteomics Trends in Biotechnology Vol. 20 No. 12 (Suppl.), 2002 Protein from cell state 1 Protein from cell state 2 Reduce free sulfhydryls and label with heavy or light ICAT reagent Combine samples and enzymatically convert labeled protein to peptides using trypsin Multidimensional chromatographic separation of peptides Relative quantification of labeled peptide pairs (reconstructed ion chromatogram analyzed with XPRESS software) H H OD 214 L TRENDS in Biotechnology combined, labeled protein mixtures followed by separation of the resulting peptides by multidimensional liquid chromatography (LC); (3) analysis of the separated peptides by automated tandem mass spectrometry (MS MS); and (4) automated database searching to identify the peptide sequences (and hence the proteins from which they were derived) and determination of relative protein abundance from the mass spectral data. L [Ion exchange trace] Microcapillary reversed phase chromatography and tandem mass spectrometry Peptide sequence identification (results of CID determined using INTERACT software) Figure 1. Strategy for isotope-coded affinity tag (ICAT TM ) reagent-based analysis of protein targets Proteins from two cell states are denatured, reduced and labeled with either isotopically heavy or isotopically light ICAT TM reagent. Following digestion with trypsin, ICAT TM -reagent-labeled peptides are separated by multidimensional liquid chromatography before analysis by reversed-phase tandem mass spectrometry. Both the identity and relative abundances of peptide peak pairs are determined. The most commonly used and now commercially supported isotope labeling method for quantitative proteomics has been the isotope-coded affinity tag (ICAT TM ) reagent methodology [13]. It is based upon the covalent labeling of cysteine residues in separate polypeptide isolates using chemically identical but isotopically different reagents, followed by proteolysis, multidimensional LC and mass spectral analysis to quantify and determine the sequence of the isolated peptides. The approach is shown in Fig. 1. The multidimensional LC method that has seen the most widespread use couples strong-cation exchange (SCX) LC with reversed-phase (RP) microcapillary LC (µlc) [14 17]. Peptides are fractionated by electrostatic charge by SCX LC and the generated fractions are then sequentially further separated by RP µlc either with the two columns linked on-line [14,16] or applied sequentially in an off-line configuration [15,17]. Multidimensional LC has been shown to overcome the sensitivity limitations of commonly employed 2DE methods to separate complex mixtures before mass spectral analysis. This was illustrated by a study in which tryptic peptides generated from a yeast lysate were extensively fractionated and analyzed by LC MS MS. Proteins with codon-bias values indicating very low protein abundance were detected conclusively if three dimensions of chromatographic separation were combined with MS sequencing protocols that maximized the number of sequencing operations [15].The identification in that study of ICAT TM reagent-labeled low-abundance proteins also indicates that the labeling reaction proceeds significantly toward completion for protein mixtures with a predicted dynamic range of five to six orders of magnitude. The sequence of the isotopically labeled (i.e. ICAT TM - reagent-labeled) peptides eluting from the RP µlc column are identified by a highly automated process most commonly using on-line electrospray ionization (ESI) operated in data-dependent MS MS mode. The acquired MS MS spectra are then searched against a peptide sequence database using SEQUEST [18] or other analogous software routines such as MASCOT (http://www.matrixscience. com/cgi/index.pl?page=../home.html) and ProFound (http://prowl.rockefeller.edu/cgi-bin/profound?intro) to determine the sequence of the detected peptide. The mass spectral signals of the identified, differentially isotopically labeled peptide sequences are then reconstructed and the relative intensities of these signals are measured to determine the relative abundance of the peptide (and hence the protein from which it is derived) between the two starting mixtures [13,17]. Although the majority of applications of the ICAT TM reagent have been carried out in this gel free format using multidimensional LC, the approach is versatile and has also been shown to be S24

effective in conjunction with initial separation of protein mixtures by 2DE followed by mass spectral analysis [19]. The separation by 2DE and mass spectrometric analysis of differentially isotopically labeled proteins provides accurate quantification of each protein contained in protein spots of comigrating proteins and thus addresses a common and important problem in protein profiling based on gel electrophoresis patterns. Recent advances of isotope-tagging based proteomics technology Since the description of the principles of the method, variations and improvements in isotope incorporation, mass spectrometry and data processing and analysis have been reported. These advances demonstrate that the combination of selective chemistries, tandem mass spectrometry and data processing algorithms provides a generic and increasingly useful approach to quantitative proteomics. Advances in tagging chemistries and isotope incorporation The general approach to quantitative proteomics involving selective tagging, stable isotope dilution and mass spectrometry has catalyzed the proliferation of a variety of different isotopic labeling methods. Stable isotope incorporation has been achieved in a variety of ways, including metabolic labeling [20,21,30], enzymatically directed methods [22 24] and chemical labeling using externally introduced tags such as that facilitated by the ICAT TM reagent [13]. More recently, a solid-phase method for the isolation and isotopic labeling of cysteine-containing peptides in complex mixtures has been developed that involves tag addition at the end of the labeling procedure after trypsinization and sample preparation [25]. Despite the potential for biases resulting from differences in sample digestion and manipulation that might occur before addition of the tag, the solid-phase method is simpler, more efficient and more sensitive in side-by-side comparisons with the solution-based ICAT TM labeling protocol [25]. Another technological advance is a secondgeneration ICAT reagent containing a cleavable linker (Applied Biosystems, Foster City, CA, USA) that promises to increase further the sensitivity and versatility of quantitative proteomic studies. Alternative strategies for the specific isotopic labeling of amino-acid residues other than cysteine have also been developed, such as per-methyl esterification of carboxylic acid groups [26], specific labeling of lysine residues [27] and peptide N-termini [28]. In addition, an alternative strategy employing differential chemical labeling of lysine residues that does not require stable-isotope incorporation has also been described [29]. The combined use of a variety of labeling reagents reactive with different functionalities should increase sample coverage and should also help to reduce the confound of non-unique peptides when quantitating proteins on the peptide level isolated from complex mammalian samples. The use of chemical reactions to introduce tags detectable by MS into specific sites in proteins and peptides has also been extended to probe functional aspects of proteins contained in complex mixtures. Several methods that enable differential isotopic labeling and isolation of phosphorylated peptides have been described [30 33] as demonstrating or suggesting the potential for quantitative profiling by MS of protein phosphorylation on a proteome-wide scale. Selective chemistries have also been used to investigate the activity of specific classes of enzymes. This was achieved by the synthesis of reagents that bind selectively to proteins in an activity-dependent manner [34]. Incorporation of stable isotopes into such reagents holds great potential for the profiling of protein activities by MS. The ability to quantify relative abundances, post-translational modifications and activities of proteins across whole proteomes and in purified biological complexes will undoubtedly revolutionize our understanding of biological mechanisms acting in both normal and diseased cellular states. Advances in MS instrumentation and methods for quantitative proteome analysis Although it is robust, sensitive and automated, the analysis of complex, isotope-tagged peptide mixtures by on-line LC ESI MS MS suffers from the demand for continual sample consumption and the requirement for untargeted, or on the fly, selection of precursor ions for sequencing. The coupling of matrix-assisted laser desorption ionization (MALDI) to a tandem mass spectrometer promises to overcome these limitations and greatly increase the efficiency of global proteomic comparisons of biological cell states. In this instance, peptides are prepared as described above except that they are eluted and spotted directly from a microcapillary reversed-phase liquid chromatography column on a MALDI sample plate. A mass spectrum is first obtained for each spot to quantify heavy and light ICAT TM reagent-labeled pairs of proteins. Tandem mass spectrometry can then be focused directly on those spots containing ICAT TM reagent-labeled peptide pairs showing interesting changes in relative abundance (Fig. 2). Because the samples are physically separated into spots on the sample plate, this method avoids needless collisioninduced dissociation (CID) sequence analysis of ICAT TM reagent-labeled peptide pairs showing no change in abundance. Therefore, the method significantly reduces S25

Review A TRENDS Guide to Proteomics Trends in Biotechnology Vol. 20 No. 12 (Suppl.), 2002 Chromatographically purified heavy and light ICAT reagent-labeled peptides from two cell states Reversed-phase microcapillary liquid chromatography instrumentation time during data collection and subsequent data analysis by the researcher, making it extremely attractive for analysis of global proteome states. This method has been pioneered [35] using a tandem quadrupole time-of-flight (QTOF) mass spectrometer [36] but could conceivably also be implemented on MALDI ion trap [37] and time-of-flight (TOF) instruments [38]. 90 80 70 60 50 40 30 20 10 (MALDI sample target plate) Light Heavy 0 1216 1218 1220 1222 1224 1226 1228 1230 1232 1234 1236 m/z, amu Quantification of relative abundances of ICAT reagent-labeled peptide pairs (MS scan) Sequence analysis on selected ICAT reagent-labeled peptide pairs (MS MS) TRENDS in Biotechnology Figure 2. Implementation of matrix-assisted laser desorption ionization (MALDI) for global proteomic comparisons Purified isotope-coded affinity tag (ICAT TM )-reagent-labeled peptides are eluted and spotted directly from a microcapillary reversed-phase liquid chromatography column onto a MALDI sample plate. A mass spectrum is obtained for each spot to quantify relative levels of heavy and light ICAT TM -reagent-labeled peptide pairs. Tandem mass spectrometry can then be targeted to spots containing ICAT TM -reagent-labeled peptide pairs exhibiting interesting changes in relative abundance. Advances in data processing and management The automation of proteomic data collection has created an imbalance between the ability to generate proteomic data and the capacity to interpret these data. The analysis of a single experiment requires an assessment of each selected or identified sequence, the culling of incorrect and redundant assignments, the analysis of single-ion chromatograms to quantify each peptide and the integration of all of these into a multidimensional database linking all peptides to the proteins from which they originated. Subsequently, proteomic data are related to other types of data, such as those determined using gene expression arrays. Each one of these tasks demands the development and refinement of open source software tools to prevent bottlenecks during data analysis. Several software solutions have been implemented to reduce the amount of spectra that require manual curation.the first of these approaches removes spectra below preset quality thresholds before database searching, saving both curation and computational time. A second approach integrates the scoring features of SEQUEST and the properties of the identified peptide into a single probability score of a correct identification.this function has excellent operating characteristics and allows the elimination of the bulk of the false identifications with only minimal loss of correct identifications, thereby vastly reducing the number of individual spectra that must be examined [39]. Quantification software such as XPRESS [17] currently provides a sophisticated level of automation to the process of quantitatively analyzing the elution profile of all identified peptides. Analogous commercially available software packages, such as ProICAT (Applied Biosystems, Foster City, CA, USA), have also been developed to provide analytical tools specifically tailored to specific brands of chromatographic and mass spectrometric instruments. The XPRESS software (described in more detail online at: http://www.systemsbiology.org/research/software/ proteomics/) fits a curve to an elution profile based on a reconstructed single-ion trace chromatogram and subsequently calculates a ratio between a pair of peptides. The calculated expression ratio from all peptides from a single protein can be assembled into a composite ratio for that protein, which includes mean and error measurements. S26

Whereas the current generation of software requires manual curation of each peptide elution, future generations will be able to assess the fit of the curve assigned to a chromatographic peak, calling the attention of the user to only those proteins that either demonstrate a significant change in abundance between the experimental and control samples or that contain outliers that require human attention. In this way, the quantification process and subsequent data analysis are targeted to only the relevant, biologically interesting, proteins in the experiment. Finally, once the data are culled using quantitative software such as XPRESS or ProICAT, composite data from multiple experiments those collected from a timecourse analysis, for example can be analyzed collectively using INTERACT software. INTERACT permits multiple datasets to be analyzed using a form-based internet-accessible interface. INTERACT allows the researcher to quickly filter and sort selected MS MS database search results according to various parameters as dictated by the user. Internet links are included for each peptide to allow the researcher to view the mass spectrum and the reconstructed chromatographic peaks, thus facilitating analysis of even the extremely large datasets that arise from proteomic profiling studies [17]. Collectively, these advances in stable isotope incorporation, mass spectrometry and data processing contribute to the further development of a robust and automated platform for quantitative proteomic studies. Representative applications of stable isotope tagging Proteomic profiling cellular extracts One of the earliest descriptions of quantitative proteomic profiling involved the detection of protein abundance changes in two different yeast cultures (e.g. mutant and wild-type) grown on either 14 N- or 15 N-enriched cell growth medium. Following an appropriate growth interval, the two cell cultures were combined and isolated proteins were separated by 2DE. Gel spots were excised and analyzed by MS, revealing the identity of the protein and the relative abundance of the protein in the two cultures based on the measured nitrogen isotope ratio. These measurements allowed not only determination of protein abundance differences between the two yeast cultures but also revealed in vivo phosphorylation sites in a biologically interesting target protein [30]. Pioneering proteomic profiling methods such as these have been improved upon by implementing post-isolation isotopic labeling methods that can be applied to virtually any protein sample and, as discussed below, by adding multidimensional chromatographic separation steps before mass spectral analysis. The multidimensional protein identification technology (MudPIT) method, which involves two liquid chromatographic separation steps, tandem mass spectrometry and database searching, was used to detect and identify 1484 proteins in whole-cell protein extracts from the budding yeast proteome [16]. This protein set included regulatory proteins, such as members of the MAP kinase pathway; transcription factors, including those in the SWI SNF cell cycle regulatory complex; and proteins associated with the membranes of various organelles, including the plasma membrane (e.g. Pma1p), endoplasmic reticulum, the Golgi apparatus, the nucleus, and other membrane-bound structures. Proteins with pis outside the narrow range afforded by 2DE were detected, including 12 proteins with pis <4.3 and 29 proteins with pis >11.A large range of protein masses was seen, including proteins with masses <10 000 and >190 000 Da. Importantly, more than 50% of the predicted low-abundance proteins with a codon index below 0.2 were also represented [16]. Although successful for analysis of yeast proteins, improvements in chromatographic separation and mass spectrometric instrumentation will undoubtedly be needed for complete coverage of complex eukaryotic protein samples that are predicted to span greater than six orders of magnitude in abundance. However, preliminary studies indicate that multidimensional LC- and affinitybased chromatographic separation steps before mass spectrometry allow for a more complete description of a cellular proteome, including low-abundance and recalcitrant proteins [15,16]. This method will undoubtedly provide a platform upon which even more robust strategies applicable to the analysis of higher eukaryotes that demonstrate extremely large dynamic ranges in protein abundance. Proteomic profiling organelles and cellular fractions The implementation of multiple dimensions of chromatographic separation and isotopic labeling strategies has enabled the proteomic profiling of complex cellular mixtures such as those isolated from cellular organelles and cellular fractions. Extended proteome coverage has also been facilitated through the use of gas-phase fractionation (GPF), the iterative mass spectral analysis of a sample over multiple unique m/z ranges, or windows, within a larger mass spectrum survey scan m/z range (e.g. 400 2000 m/z) [40]. Implementation of GPF for proteins enriched in yeast peroxisomal membranes demonstrated significant increases in both sensitivity and reproducibility in identifying proteins known to be associated with this organelle [41]. S27

Review A TRENDS Guide to Proteomics Trends in Biotechnology Vol. 20 No. 12 (Suppl.), 2002 A combination of multidimensional LC separation and ICAT TM tagging has also been used to characterize proteins from higher eukaryotes, including humans. Specifically, the method has been successfully used to determine the ratios of abundance of nearly 500 proteins associated with the microsomal fractions of control and in vitro-differentiated human myeloid leukemia (HL-60) cells [17]. This study represents the most comprehensive analysis to date of membrane-associated proteins and the first study in which the relative abundance of identified proteins was determined directly and systematically. This study indicates this approach and the supporting software tools are broadly applicable to the analysis of large sets of proteins, notably including those refractory proteins that have previously not been accessible using standard proteomic technologies. Conclusions The technical advances and applications described here indicate that quantitative proteomics based on stable isotope tagging and automated mass spectrometry is developing into a mature technology that can interface with a multitude of biological and clinical research questions. We expect that the performance, sample throughput, precision of measurement and level of automation will continue to increase.we furthermore expect that the basis of the method, the combination of selective chemical reactions and mass spectrometry will be applied to probe proteomic properties that indicate, directly or indirectly, the function and activity of proteins. Such measurements include the systematic analysis of protein phosphorylation, the analysis of protein linkage maps [42,43] and enzyme activity profiles. Improved performance and utility of proteomics technology will therefore increasingly impact biology and medicine. However, it is possible that lack of access to suitable research infrastructure and trained personnel will limit the penetrance of quantitative proteomics as a general research tool. Technologic complexity and the high costs of required equipment and trained personnel make proteomics prohibitively expensive for many research groups. A suitable solution might be the establishment of high-throughput proteomic centers, analogous to the successful structural genomics data collections centers, the national beamlines. These publicly funded facilities would allow for the assembly of a critical mass of experts in protein biochemistry, separation sciences, mass spectrometry, computation and informatics, making proteomics accessible to other publicly funded academic researchers. In addition, such centers would pioneer the development of the technologies and software necessary to propel the field of global quantitative proteomics forward and train scientists in the current technologies. Realization of the full potential of this promising technology, therefore, requires coordinated plans for technology development, dissemination and application. Acknowledgements This work was supported by NIH Grants No. P41-RR11823 03, 1-R33-CA89807, 1-R33-CA84698, and 1-R33-CA93302 to R.A., and NIH Genome Postdoctoral Genome Training Grant fellowships to M.R.F. and T.J.G.We also acknowledge generous support from Merck Genome Research Institute and Oxford Glycosciences. References 1 Lander, E.S. et al. (2001) Initial sequencing and analysis of the human genome. Nature 409, 860 921 2 Venter, J.C. et al. (2001) The sequence of the human genome. Science 291, 1304 1351 3 Shipp, M.A. et al. (2002) Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat. Med. 8, 68 74 4 Diehn, M. et al. (2000) Examining the living genome in health and disease with DNA microarrays. JAMA 283, 2298 2299 5 Hughes,T.R. et al. (2001) Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat. Biotechnol. 19, 342 347 6 Lockhart, D.J. and Winzeler, E.A. (2000) Genomics, gene expression and DNA arrays. Nature 405, 827 836 7 Gygi, S.P. et al. (1999) Correlation between protein and mrna abundance in yeast. Mol. Cell. Biol. 19, 1720 1730 8 Ideker,T. et al. (2001) Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292, 929 934 9 Griffin,T.J. et al. (2002) Complementary profiling of gene expression at the transcriptome and proteome levels in S. cerevisiae. Mol. Cell. Proteomics 1, 323 333 10 Fields, S. (2001) Proteomics in genomeland. Science 291, 1221 1224 11 Griffin,T.J. and Aebersold, R. (2001) Advances in proteome analysis by mass spectrometry. J. Biol. Chem. 276, 45497 45500 12 Gygi, S.P. et al. (2000) Evaluation of two-dimensional gel electrophoresis-based proteome analysis technology. Proc. Natl.Acad. Sci. U. S.A. 97, 9390 9395 13 Gygi, S.P. et al. (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 17, 994 999 14 Link, A.J. et al. (1999) Direct analysis of protein complexes using mass spectrometry. Nat. Biotechnol. 17, 676 682 15 Gygi, S.P. et al. (2002) Proteome analysis of low-abundance proteins using multidimensional chromatography and isotope-coded affinity tags. J. Proteome Res. 1, 47 54 16 Washburn, M.P. et al. (2001) Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 19, 242 247 17 Han, D.K. et al. (2001) Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry. Nat. Biotechnol. 19, 946 951 18 Eng, J. et al. (1994) An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J.Am. Soc. Mass Spectrom. 5, 976 989 19 Smolka, M. et al. (2002) Quantitative protein profiling using two-dimensional gel electrophoresis, isotope-coded affinity tag labeling, and mass spectrometry. Mol. Cell. Proteomics 1, 19 29 20 Washburn, M.P. et al. (2002) Analysis of quantitative proteomic data generated via multidimensional protein identification technology. Anal. Chem. 74, 1650 1657 21 Ong, S.E. et al. (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 1, 376 386 S28

22 Shevchenko, A. et al. (1997) Rapid de novo peptide sequencing by a combination of nanoelectrospray, isotopic labeling and a quadrupole/time-of-flight mass spectrometer. Rapid Commun. Mass Spectrom. 11, 1015 1024 23 Mirgorodskaya, O.A. et al. (2000) Quantitation of peptides and proteins by matrix-assisted laser desorption/ionization mass spectrometry using (18)O-labeled internal standards. Rapid Commun. Mass Spectrom. 14, 1226 1232 24 Yao, X. et al. (2001) Proteolytic 18O labeling for comparative proteomics: model studies with two serotypes of adenovirus. Anal. Chem. 73, 2836 2842 25 Zhou, H. et al. (2002) Quantitative proteome analysis by solid phase isotope tagging and mass spectrometry. Nat. Biotechnol. 20, 512 515 26 Goodlett, D.R. et al. (2001) Differential stable isotope labeling of peptides for quantitation and de novo sequence derivation. Rapid Commun. Mass Spectrom. 15, 1214 1221 27 Peters, E.C. et al. (2001) A novel multifunctional labeling reagent for enhanced protein characterization with mass spectrometry. Rapid Commun. Mass Spectrom. 15, 2387 2392 28 Munchbach, M. et al. (2000) Quantitation and facilitated de novo sequencing of proteins by isotopic N-terminal labeling of peptides with a fragmentation-directing moiety. Anal. Chem. 72, 4047 4057 29 Cagney, G. and Emili, A. (2002) De novo peptide sequencing and quantitative profiling of complex protein mixtures using mass-coded abundance tagging. Nat. Biotechnol. 20, 163 170 30 Oda,Y. et al. (1999) Accurate quantitation of protein expression and site-specific phosphorylation. Proc. Natl.Acad. Sci. U. S.A. 96, 6591 6596 31 Zhou, H. et al. (2001) A systematic approach to the analysis of protein phosphorylation. Nat. Biotechnol. 19, 375 378 32 Goshe, M.B. et al. (2001) Phosphoprotein isotope-coded affinity tag approach for isolating and quantitating phosphopeptides in proteome-wide analyses. Anal. Chem. 73, 2578 2586 33 Ficarro, S.B. et al. (2002) Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 20, 301 305 34 Cravatt, B.F. and Sorensen, E.J. (2000) Chemical strategies for the global analysis of protein function. Curr. Opin. Chem. Biol. 4, 663 668 35 Griffin,T.J. et al. (2001) Quantitative proteomic analysis using a MALDI quadrupole time-of-flight mass spectrometer. Anal. Chem. 73, 978 986 36 Loboda, A.V. et al. (2000) A tandem quadrupole/time-of-flight mass spectrometer with a matrix-assisted laser desorption/ionization source: design and performance. Rapid Commun. Mass Spectrom. 14, 1047 1057 37 Krutchinsky, A.N. et al. (2001) Automatic identification of proteins with a MALDI-quadrupole ion trap mass spectrometer. Anal. Chem. 73, 5066 5077 38 Bienvenut,W.V. et al. (2002) Matrix-assisted laser desorption/ionization-tandem mass spectrometry with high resolution and sensitivity for identification and characterization of proteins. Proteomics 2, 868 876 39 Keller, A. et al. Empirical statistical method to estimate the accuracy of peptide identifications made by MS MS and database search. Anal. Chem. (in press) 40 Spahr, C.S. et al. (2001) Towards defining the urinary proteome using liquid chromatography-tandem mass spectrometry. I. Profiling an unfractionated tryptic digest. Proteomics 1, 93 107 41 Yi, E.C. et al. (2002) Approaching complete peroxisome characterization by gas-phase fractionation. Electrophoresis 18, 3205 3216 42 Gavin, A.C. et al. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415, 141 147 43 Ho,Y. et al. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415, 180 183 Peer reviewed for authority Reaching the key decision makers Meeting your needs across the drug discovery business FREE copies available at www.drugdiscoverytoday.com 1 August 2002 Volume 7 No. 15 ISSN 1359-6446 Most highly cited journal for drug discovery Peer reviewed for authoritative coverage Chemoproteomics in post-genomic drug discovery Protein microarray technology Drug delivery in veterinary medicine Samir Hanash on the globalization of proteomics July 2002 Volume 1 No. 1 ISSN 1477-3627 TARGETS Innovations in Genomics & Proteomics High-throughput antibody development Protein-domain knockouts RNA expression profiling in cardiac tissue and bacteria 1 January 2003 Volume 1 No. 1 ISSN 1359-6446 BIOSILICO Information Technology in Drug Discovery Visualization of complex databases High-performance computing Information automation and drug development www.drugdiscoverytoday.com COMING IN EARLY 2003 From lead identification to clinical trials Most highly cited journal in drug discovery Your strategic overview 24 issues per year Innovations in genomics & proteomics New high-quality review journal Tactical issues that impact your work 6 issues per year Information technology in drug discovery New high-quality review journal Covers key tactical issues 6 issues per year S29