Reviews/Perspectives

Roles for the Two-hybrid System in Exploration of the Yeast Protein Interactome*

Takashi Ito, Kazuhisa Ota, Hiroyuki Kubota, Yoshihiro Yamaguchi, Tomoko Chiba, Kazumi Sakuraba, and Mikio Yoshida**

Comprehensive analysis of protein-protein interactions is a challenging endeavor of functional proteomics and has been best explored in the budding yeast. The yeast protein interactome analysis was achieved first by using the yeast two-hybrid system on a proteome-wide scale and next by large-scale mass spectrometric analysis of affinity-purified protein complexes. While these interaction data have led to a number of novel findings and the emergence of a single huge network containing thousands of proteins, they suffer from many false signals and fall short of grasping the entire interactome. Thus, continuous efforts are necessary in both bioinformatics and experimentation to fully exploit these data and to take another step toward the goal. Computational tools to integrate existing biological knowledge buried in literature and various functional genomic data with the interactome data are required for biological interpretation of the huge protein interaction network. Novel experimental methods have to be developed to detect weak, transient interactions involving low abundance proteins as well as to obtain clues to the biological role of each interaction. Since the yeast two-hybrid system can be used for the mapping of the interaction domains and the isolation of interaction-defective mutants, it would serve as a technical basis for the latter purpose, thereby playing another important role in the next phase of protein interactome research. Molecular & Cellular Proteomics 1:561-566, 2002.

WHY PROTEIN INTERACTOME?

Proteins rarely work by themselves. They almost always interact with other biomolecules to execute their functions.
Networks of such biomolecular interactions constitute the basis for life, and those occurring between proteins play extremely important roles. Thus deciphering entire protein interaction networks, or the protein interactome, is vital to our understanding of life as a system of molecules. Although identification of novel protein interactions is an integral part of conventional or target-oriented studies, these are severely skewed to the neighbors of proteins with current popularity. It is thus necessary to perform a complementary, hypothesis-free approach in protein interactome analysis to obtain a more unbiased representation. Such an analysis would also help us guess the functions of numerous novel proteins revealed by the genome projects, which currently lack any clue as to their specific functions. If a novel protein is found to bind a well characterized one, the former is likely involved in the same functional category as the latter (i.e. guilt by association). At the same time, the novel protein indicates a previously unrecognized aspect of the molecular pathways involving the known one, thereby expanding our knowledge of that pathway. Pioneering works toward this goal were first undertaken using the two-hybrid system to analyze the yeast protein interactome.

From the Division of Genome Biology, Cancer Research Institute, Kanazawa University, Kanazawa 920-0934, Japan, the Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Corporation (JST), Tokyo 102-0081, Japan, and **INTEC Web and Genome Informatics Corporation, Tokyo 136-0075, Japan

Received, September 3, 2002, and in revised form, September 16, 2002. Published, MCP Papers in Press, September 16, 2002, DOI 10.1074/mcp.R200005-MCP200
THE PRINCIPLE OF THE YEAST TWO-HYBRID SYSTEM

The yeast two-hybrid (Y2H)1 system was developed by Stanley Fields (1) on the basis of the modular domain structure of the transcription factor GAL4, which comprises a DNA binding domain and a transcription activation domain. In the Y2H system, one of the proteins of one's interest, termed X, is expressed as a hybrid protein with the GAL4 DNA binding domain, whereas the other, termed Y, is expressed with the activation domain. If X and Y interact, the two hybrid proteins, often termed bait and prey, respectively, are assembled onto GAL4 binding sites in the yeast genome. The assembly functionally reconstitutes the GAL4 transcription factor and induces the expression of reporter genes integrated in the region downstream of the GAL4 binding sites. The Y2H system enables highly sensitive detection of protein-protein interactions in vivo without handling any protein molecules. It also allows one to screen a library of activation domain fusions or preys for the binding partners of one's favorite protein expressed as a DNA binding domain fusion or bait, and it can be used to pinpoint protein regions mediating the interactions.

On the other hand, the Y2H system has limitations. First, in principle, it cannot detect interactions requiring three or more proteins and those depending on posttranslational modifications. However, note that, when applied to the budding yeast itself, it can occasionally detect interactions involving three proteins or posttranslational modification with the aid of endogenous third proteins or modifying enzymes. Second, the Y2H system is not suitable for the detection of interactions involving membrane proteins, although a substantial number of such interactions have been detected via an unexplained mechanism. Finally, a Y2H interaction does not guarantee that the inferred interaction is of physiological relevance. Despite these and other limitations, the power of the Y2H system is so tremendous that it is now established as a standard technique in molecular biology.

The Y2H system has been successfully used to examine an interaction between two proteins of one's interest and also to screen for unknown binding partners of one's favorite protein. It can be, in principle, used in a more comprehensive fashion to examine all possible binary combinations between the proteins encoded by any single genome. Three groups (ours, CuraGen's, and Fields's) launched such ambitious projects using the budding yeast as the target.

1 The abbreviations used are: Y2H, yeast two-hybrid; ORF, open reading frame; IST, interaction sequence tag; MS, mass spectrometric or mass spectrometry.

2002 by The American Society for Biochemistry and Molecular Biology, Inc. Molecular & Cellular Proteomics 1.8. This paper is available on line at http://www.mcponline.org

GENOME-WIDE Y2H ANALYSIS OF THE BUDDING YEAST

We amplified all open reading frames (ORFs) of the budding yeast by means of PCR and cloned them into two types of vectors, one expressing each ORF as bait and the other as prey (2). The bait and prey plasmids were introduced into Y2H host strains one by one: the former and the latter were transformed into Y2H hosts bearing mating types a and α, respectively. Bearing opposite mating types, a bait clone and a prey clone can mate to form diploid cells. Consequently each diploid cell has a unique combination of bait and prey. If they interact, the reporter genes are activated to allow the cells to survive the selection.
In other words, each survivor should bear a pair of mutually interacting bait and prey, which can be revealed by tag sequencing of the cohabiting plasmids to generate an interaction sequence tag (IST). These ISTs can then be used for a database search to decode the inferred protein-protein interactions. We prepared pools for screening, each containing 96 bait or prey clones, performed the mating-based screening described above in all possible combinations between the pools, and finally revealed 4,549 independent two-hybrid interactions (3). Of these, 841 were detected more than three times and were assumed to be of high relevance. Hence we call these interactions our core data. Notably more than 80% of these interactions had never been described before. A similar IST project was conducted by CuraGen (4), who screened a pool of 6,000 preys with each unique bait. They revealed 691 interactions in total, most of which were also novel. Comparison between the two data sets revealed an unexpectedly small overlap: they share 141 interactions, which correspond to 10% of the total independent interactions (Fig. 1) (3).

FIG. 1. Overlap between interactome data. The upper and lower Venn diagrams indicate the overlap between the two data sets of high-throughput two-hybrid analysis and the overlap between the two large-scale co-precipitation/MS analyses, respectively. The two-hybrid data were taken from the high-throughput approach of Uetz et al. (4) and the core data of our analysis (3). The MS data were taken from supplementary Table S1 of Gavin et al. (7) and Table S2 of Ho et al. (8).

There would be a number of plausible reasons for the small overlap. The systems used by the two groups were different: we used multicopy vectors in a host bearing multiple reporter genes, whereas they used single-copy vectors but only a single reporter gene. Since both groups PCR-amplified the ORFs, some would inevitably bear mutations that affect interactions.
Although both groups pooled clones, the screens do not seem saturated: two-thirds of our 4,549 interactions (3) and one-third of CuraGen's 691 interactions were identified only once (4). Of course, any two-hybrid screen contains false signals (see below). These and other unidentified factors are assumed to contribute to the small overlap observed between the two IST projects.

The group led by Fields (4) took a different approach in which an array of 6,000 prey clones was mated with each unique bait strain, and the diploid cells formed were replica-plated onto the selection medium to decode interactions from the coordinates of the survivors. This approach is rather slow and tedious but is highly sensitive and free from the problem of unsaturated screening. They examined 142 baits to reveal 281 interactions, which again failed to largely overlap with those found by the IST approaches.

FALSE POSITIVES

One of the major concerns for the Y2H system is so-called false positives, which actually include two different categories, namely technical and biological ones. The technical false positive is an apparent two-hybrid interaction that is not based on the assembly of two hybrid proteins: expression of some baits or preys seems to induce unexplained events leading to artificial induction of reporter genes. Use of multiple reporter genes driven by different GAL4-responsive promoters, as we did in our project (3), was reported to minimize such technical false positives. The biological false positive means a bona fide two-hybrid interaction with no physiological relevance and is discussed below.

What fraction of the genome-wide Y2H data is biologically relevant? To estimate the reliability of our data, we inspected a subset of our core data composed of 415 interactions because these interactions occur between two known proteins and hence can be, more or less, evaluated for their biological relevance. This analysis indicated that 50% of the interactions can be assumed to be biologically relevant (3). More recently an interesting method was developed to evaluate the validity of interaction data based on the similarity of the gene expression profiles between the genes for the bait and prey displaying a two-hybrid interaction (5). The analysis of our data by this method indicates that interactions with more than three IST hits, or our core data, are expected to be 60% reliable (5). While these two independent estimates may illustrate the overall quality of genome-wide two-hybrid data, users of these data still have to evaluate each interaction of their interest. Even in our non-core data, one does find a number of intriguing interactions. On the other hand, those with high IST hits may well contain a substantial number of biologically meaningless interactions. Therefore, bioinformatics tools to assist such evaluation are critical to fully exploit these genome-wide data (see below).

FALSE NEGATIVES

It should also be noted that the genome-wide Y2H projects missed most (as much as 90%) of known interactions (i.e. false negatives) (3).
Recently an interesting result was reported by Vidal's group (6), who tried to recapitulate two-hybrid interactions reported by the three groups (i.e. Ito, CuraGen, and Fields). They amplified yeast ORFs by themselves, cloned them into their own two-hybrid vectors, and examined the interactions in their own Y2H system. Of the 72 interactions examined, 19 (26%) were recapitulated in their study. We further analyzed their data by examining the origin of each interaction examined in their study. The analysis revealed that 9 of the 19 interactions reproduced were originally detected by at least two of the three groups, whereas more than 90% of the interactions that they failed to recapitulate were those detected only by a single group. Although such irreproducible interactions may well be the technical false positives discussed above, some interactions seem to be sensitive to subtle differences in the constructs and Y2H system used, whereas others are largely insensitive and easily reproduced by anyone. Such a tendency may become more prominent when using full-length ORFs in the Y2H system because it is known that full-length proteins often show much weaker signals than appropriately trimmed protein regions containing the interaction domains. These features are inherent to the Y2H system and seem to have contributed to the small overlap observed between the different genome-wide screen data.

Y2H AND OTHER INTERACTOME DATA

Two impressive studies were published to report large-scale mass spectrometric (MS) analyses of affinity-purified yeast protein complexes to demonstrate the power of proteomics (7, 8). They, however, also illustrate the difficulty and limitations of the approach. For instance, Gavin et al. (7) and Ho et al. (8) purified 589 and 493 complexes, respectively, of which 93 were purified by both groups using the same proteins as baits.
Comparison of the MS analyses of these 93 complexes between the two groups revealed that 48 complexes (52%) contain at least one protein detected by both groups, whereas the other 45 (48%) failed to share any. With respect to the entire set of proteins detected in these complexes, Gavin et al. (7) and Ho et al. (8) revealed 577 and 877, respectively. The overlap between these proteins was only 133, thereby comprising 10% of the 1,321 proteins collectively reported by the two groups (Fig. 1). Even in the 48 complexes described above, the proteins detected in both studies comprise 14% of the total. Thus the rate of overlap is similar to the one observed in the two-hybrid projects. Although the strategies of the two groups are different and the comparison at the level of protein nexuses revealed by several different baits improves the overlap, it should be noted that even these proteomic studies contain substantial false signals.

It is also interesting to note that the interactions revealed by these approaches are somewhat complementary to those revealed by the Y2H system. The Y2H projects essentially detect binary interactions including those of a rather weak or transient nature. On the other hand, the MS studies reveal more complex interactions, which are inevitably biased toward those with high abundance and stability (9). Novel analytical platforms are thus required for the detection of weak or fast interactions by means of MS. One of the promising approaches would be an integration of MS with biomolecular interaction analysis based on the principle of surface plasmon resonance (10).

Intriguing features of these data sets are also revealed by the integration of gene expression data (11). The data set of Gavin et al. (7), based on genomic integration of tandem affinity purification tags, displays strong co-expression among the genes encoding the identified proteins. In contrast, that of Ho et al. (8), based on episomal overexpression of epitope-tagged proteins, shows rather weak co-expression patterns similar to those of the Y2H projects. Recent analysis of accumulated protein interaction data provides further detail on the various aspects of both Y2H and MS data sets (9).
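The overlap figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes that "overlap" means the shared items as a fraction of the union of the two data sets, which is consistent with the 1,321 proteins quoted for the MS comparison; the function name is illustrative only.

```python
def overlap_fraction(size_a, size_b, shared):
    """Shared items as a fraction of the union of two data sets."""
    union = size_a + size_b - shared
    return shared / union


# Two-hybrid data: our core data (841) vs. CuraGen's data (691), 141 shared.
y2h_overlap = overlap_fraction(841, 691, 141)

# MS data: 577 vs. 877 proteins in the commonly purified complexes, 133 shared.
ms_union = 577 + 877 - 133
ms_overlap = overlap_fraction(577, 877, 133)

print(f"Y2H overlap: {y2h_overlap:.1%}")
print(f"MS union: {ms_union} proteins, overlap: {ms_overlap:.1%}")
```

Both fractions come out at roughly one-tenth, which is the sense in which the two kinds of data sets show a similar rate of overlap.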

HUGE PROTEIN INTERACTION NETWORK

The Y2H projects led to an explosion of yeast protein interaction data, and integration of binary interaction data in silico has generated a single huge nexus including up to 4,000 proteins (3, 12, 13). The additional interaction data from co-precipitation/MS studies would further expand the largest network. The entire network is obviously too complex for the human brain to understand. We need a method to extract biologically meaningful clusters or subnetworks from the huge nexus to formulate novel hypotheses for further experimentation. However, one should note that the network has become too complex partly due to the lack of spatial and temporal resolution. For instance, while RNA polymerases I, II, and III are distinct entities, they would be linked into a single huge nexus in silico because of common subunits shared by the three. Thus, we have to integrate existing knowledge on yeast proteins with the massive interactome data.

In addition, we should evaluate the relevance of each interaction provided by any large-scale project. Ideally each edge of the complex graph should be weighted to help the evaluation of each interaction. As discussed above, the number of IST hits may serve as a good measure for the reliability of Y2H data (3, 5). Independent lines of evidence for the interaction, such as coincidence between Y2H and MS data, presence of genetic interaction, similar mutant phenotype, shared subcellular localization, and co-expression of the genes, would be more important. Even provided with these valuable data, construction of a protein interaction network model is still a tedious task that requires much trial and error. We thus developed a bioinformatics tool to visualize and help one estimate the structure of networks by referring to existing knowledge and other data (Fig. 2) (3, 14). Such tools would become critical to fully exploit interactome data as well as a plethora of other functional genomic data.
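How shared subunits fuse distinct complexes into one nexus can be illustrated with a toy graph. The following is a minimal sketch, not the tool described above: the protein labels are hypothetical placeholders rather than real gene names, and the edge weights are invented stand-ins for a reliability score such as the number of IST hits.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected adjacency map from (protein_a, protein_b, weight) triples."""
    graph = defaultdict(dict)
    for a, b, weight in edges:
        graph[a][b] = weight
        graph[b][a] = weight
    return graph


def connected_components(graph):
    """Return the sets of proteins that are linked directly or indirectly."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(n for n in graph[node] if n not in component)
        seen |= component
        components.append(component)
    return components


# Hypothetical toy data: three complexes that share one common subunit.
edges = [
    ("PolI_a", "PolI_b", 5),
    ("PolII_a", "PolII_b", 4),
    ("PolIII_a", "PolIII_b", 3),
    ("PolI_a", "shared", 3),
    ("PolII_a", "shared", 3),
    ("PolIII_a", "shared", 3),
    ("orphan_x", "orphan_y", 1),
]

all_components = connected_components(build_graph(edges))
largest = max(all_components, key=len)

# Ignoring the shared subunit separates the three complexes again.
without_shared = [e for e in edges if "shared" not in e[:2]]
split_components = connected_components(build_graph(without_shared))

print(len(largest), len(all_components), len(split_components))
```

In this toy case the shared subunit alone is enough to merge three otherwise separate complexes into a single component, which is why prior knowledge about the proteins is needed to interpret such a nexus.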
TOP-DOWN APPROACH TO INTERPRET HUGE INTERACTION NETWORK

The approach described above is a knowledge-based, bottom-up approach to construct biologically meaningful subnetworks around the protein of one's interest. However, a totally different approach might be undertaken to interpret the huge network. Recent studies have revealed a scale-free nature of various complex networks of both artificial and natural origins (15). These networks are composed of a small number of highly connected nodes and numerous poorly connected ones, and the number of interaction partners per node follows a power-law distribution (15). The yeast protein interaction network revealed by the interactome analysis shares a similar structure (16). Intriguingly, a highly biased distribution of connectivity has proved to ensure robustness of the network against random perturbations. On the other hand, such networks are extremely vulnerable to targeted attacks on the highly connected nodes or hubs. Consistent with this, such hubs of the huge protein interaction network tend to be the products of essential genes (16). Thus, the protein interactome seems to share a basic design principle with other complex networks. It is also conceivable that highly connected nodes lacking apparent homologs in mammals may serve as good targets for antifungal drug development.

The complex network has a heterogeneous organization with local clustering or communities, which may represent functional modules to be identified in the case of the protein interactome. Recently an intriguing approach was developed to dissect a complex network into communities based solely on the topology of the network and was successfully applied to biological and social networks (17). This approach calculates the shortest paths between every combination of the nodes in the complex graph and identifies the most frequently used edge.
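The shortest-path counting just described can be sketched in plain Python. This is an illustration under stated assumptions, not the published implementation: the graph is unweighted, the node names are hypothetical, the quantity computed is what is commonly called edge betweenness, and only the first cut is shown, whereas a full dissection would recompute the scores after every removal.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Count the shortest paths that run along each edge of an unweighted graph."""
    score = defaultdict(float)
    for source in adj:
        # Breadth-first search that records path counts and predecessors.
        dist = {source: 0}
        paths = defaultdict(float)
        paths[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing credit among predecessors.
        credit = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + credit[w])
                score[frozenset((v, w))] += share
                credit[v] += share
    # Every unordered pair of nodes was visited from both of its ends.
    return {edge: total / 2.0 for edge, total in score.items()}


def remove_edge(adj, edge):
    """Return a copy of the graph without the given edge."""
    a, b = tuple(edge)
    trimmed = {node: set(neighbors) for node, neighbors in adj.items()}
    trimmed[a].discard(b)
    trimmed[b].discard(a)
    return trimmed


def components(adj):
    """Return the connected components as a list of node sets."""
    seen, found = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        found.append(comp)
    return found


# Hypothetical toy network: two triangles joined by a single bridge C-D.
adj = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
scores = edge_betweenness(adj)
busiest = max(scores, key=scores.get)
communities = components(remove_edge(adj, busiest))
print(sorted(busiest), scores[busiest], [sorted(c) for c in communities])
```

In the toy network the bridge carries all nine shortest paths between the two triangles, so it receives the highest score and removing it leaves the two triangles as separate communities.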
The edge with the highest usage is just like the main road connecting one town to another, and cutting the network at that edge can properly split the network into two communities. This method can be applied to the dissection of protein interaction networks. Indeed an application of a similar idea to a complex protein network, which was constructed from the conserved co-occurrence of genes in operons, was reported to successfully split a single huge network into clusters composed of functionally more homogeneous proteins (18).

FROM CATALOGING TO FUNCTIONAL INTERPRETATION

The catalog of protein-protein interactions is still growing, increasing the complexity of the largest network, which would be split by various bioinformatics approaches into clusters or individual functional modules (see above). Each cluster is represented as a group of proteins mutually connected by interaction arrows. However, each arrow has a unique biological meaning. Some activate the binding partners, and others suppress them. Some are stable, and others are transient. Thus, to biologically understand these clusters, we have to assign additional information to each edge. Although such information can be extracted from literature or existing knowledge, most interactions are revealed by recent high-throughput methods and hence are not associated with functional data. Therefore, one of the next challenges in protein interactome analysis would be to develop strategies for systematic collection of functional data on each of the cataloged interactions.

One key would be profiling of the interaction. It is quite informative to learn when and where an interaction occurs. Combinatorial use of recent proteomic techniques for protein complex purification and expression profiling will play a major role for this purpose (19-21). The set of yeast strains bearing tandem affinity purification-tagged ORFs would serve as a valuable resource to perform these analyses.
Although the interaction profiling is of particular importance, it provides us with nothing more than an indirect hint for the role of an interaction. To unequivocally uncover it, one has to examine what happens if the interaction is specifically disrupted. To perform such an interaction targeting, one has to know the protein regions that mediate the interaction. Once such interaction domains are pinpointed, they can be overexpressed as dominant negatives to disrupt the cognate interaction between the endogenous proteins. Furthermore, they can be used for the isolation of interaction-defective mutants. The most versatile method for the mapping of interaction domains is obviously the Y2H system. Notably it can also be used as so-called reverse two-hybrid selection to select against interaction (22, 23). Using the reverse Y2H system, one can select interaction-defective alleles from a randomly mutagenized population. Once identified, responsible mutations can be easily introduced into the genome using a standard technique of yeast molecular genetics, and phenotypes of such interaction-defective mutants are expected to reveal the biology of the interaction.

FIG. 2. Tools for analyzing protein interactome data. Analysis of protein interaction networks is substantially facilitated by the use of specialized bioinformatics tools such as WebGenNet (genome.c.kanazawa-u.ac.jp/webgen/webgen.html). In this system, one can select and display the proteins of one's interest with their interaction partners (left windows). Note that interactions are indicated by arrows with different colors and thickness according to their origins and reliability (e.g. numbers of IST hits), respectively (left upper window). One can also retrieve information on each protein or node of the graph (right window). The system can be used for integrative analysis with other functional genomic data. For instance, red and green circles surrounding each node in this figure indicate the expression profile of the gene encoding each protein at a particular time point of the cell cycle.
A pitfall of the reverse Y2H system is the lack of discrimination between missense and nonsense mutations, the latter of which abolish all of the functions borne by the regions C-terminal to the mutations and hence have to be avoided in interaction targeting. Similarly, missense mutations destabilizing the protein should be eliminated. To achieve this goal, we applied the dual bait Y2H method (24, 25) to guarantee that the introduced mutations induce neither truncation nor destabilization of the protein to be analyzed (26). This guaranteed reverse Y2H system is ideal for the identification of missense mutations suitable for interaction targeting to clarify the biological role of the interaction per se.
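The selection logic implied above can be written down as a small decision rule. This is only a schematic sketch under simplifying assumptions: it supposes that each mutagenized allele yields two yes/no reporter readouts, one for the interaction under study and one for a second, control bait whose signal indicates that the protein is still full-length and stable. The class, field, and allele names are hypothetical and do not describe the actual constructs.

```python
from dataclasses import dataclass


@dataclass
class AlleleReadout:
    name: str
    target_reporter_on: bool   # interaction with the partner under study
    control_reporter_on: bool  # interaction with the second (control) bait


def classify(allele):
    """Sort a mutagenized allele by its two reporter readouts."""
    if allele.target_reporter_on:
        return "interaction retained"
    if allele.control_reporter_on:
        # Target interaction lost while the control signal is intact.
        return "candidate missense allele"
    # Both signals lost: consistent with truncation or destabilization.
    return "discard"


# Hypothetical readouts for three alleles.
alleles = [
    AlleleReadout("allele-1", True, True),
    AlleleReadout("allele-2", False, True),
    AlleleReadout("allele-3", False, False),
]
results = {a.name: classify(a) for a in alleles}
print(results)
```

Only alleles of the second kind, which lose the interaction of interest while keeping the control signal, would be carried forward for interaction targeting.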

The applications described above exemplify the potential of the Y2H system as a tool for the functional characterization of protein-protein interactions. It should be noted that even interactions originally detected by other means can be analyzed in the same way, provided they are successfully recapitulated in the Y2H system. However, the current Y2H system is prone to false negatives (see above). Hence the development of a Y2H system with a low false-negative rate would be of particular significance in accelerating the functional analysis of cataloged protein-protein interactions and in moving protein interactome analysis into its next stage.

CONCLUSION

The Y2H system has been a major player in the cataloging phase of protein interactome analysis. If proteomics is not to remain a mere cataloging effort but is to proceed into biology and functional analysis, the Y2H system will again play a key role in the near future.

* This work was supported in part by research grants from the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT), the Japan Society for the Promotion of Science (JSPS), and the New Energy and Industrial Technology Development Organization (NEDO). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

To whom correspondence should be addressed: Division of Genome Biology, Cancer Research Inst., Kanazawa University, 13-1 Takaramachi, Kanazawa 920-0934, Japan. Tel.: 81-76-265-2726; Fax: 81-76-234-4508; E-mail: titolab@kenroku.kanazawa-u.ac.jp.

Recipient of the postdoctoral fellowship from JSPS.

REFERENCES

1. Fields, S., and Song, O. (1989) A novel genetic system to detect protein-protein interactions. Nature 340, 245-246
2. Ito, T., Tashiro, K., Muta, S., Ozawa, R., Chiba, T., Nishizawa, M., Yamamoto, K., Kuhara, S., and Sakaki, Y. (2000) Toward a protein-protein interaction map of the budding yeast: a comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. Proc. Natl. Acad. Sci. U. S. A. 97, 1143-1147
3. Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., and Sakaki, Y. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. U. S. A. 98, 4569-4574
4. Uetz, P., Giot, L., Cagney, G., Mansfield, T. A., Judson, R. S., Knight, J. R., Lockshon, D., Narayan, V., Srinivasan, M., Pochart, P., Qureshi-Emili, A., Li, Y., Godwin, B., Conover, D., Kalbfleisch, T., Vijayadamodar, G., Yang, M., Johnston, M., Fields, S., and Rothberg, J. M. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623-627
5. Deane, C. M., Salwinski, L., Xenarios, I., and Eisenberg, D. (2002) Protein interactions: two methods for assessment of the reliability of high throughput observations. Mol. Cell. Proteomics 1, 349-356
6. Matthews, L. R., Vaglio, P., Reboul, J., Ge, H., Davis, B. P., Garrels, J., Vincent, S., and Vidal, M. (2001) Identification of potential interaction networks using sequence-based searches for conserved protein-protein interactions or interologs. Genome Res. 11, 1771-1775
7. Gavin, A.-C., Bosche, M., Krause, R., Grandi, P., Marzioch, M., Bauer, A., Schultz, J., Rick, J. M., Michon, A. M., Cruciat, C. M., Remor, M., Hofert, C., Schelder, M., Brajenovic, M., Ruffner, H., Merino, A., Klein, K., Hudak, M., Dickson, D., Rudi, T., Gnau, V., Bauch, A., Bastuck, S., Huhse, B., Leutwein, C., Heurtier, M. A., Copley, R. R., Edelmann, A., Querfurth, E., Rybin, V., Drewes, G., Raida, M., Bouwmeester, T., Bork, P., Seraphin, B., Kuster, B., Neubauer, G., and Superti-Furga, G. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415, 141-147
8. Ho, Y., Gruhler, A., Heilbut, A., Bader, G. D., Moore, L., Adams, S. L., Millar, A., Taylor, P., Bennett, K., Boutilier, K., Yang, L., Wolting, C., Donaldson, I., Schandorff, S., Shewnarane, J., Vo, M., Taggart, J., Goudreault, M., Muskat, B., Alfarano, C., Dewar, D., Lin, Z., Michalickova, K., Willems, A. R., Sassi, H., Nielsen, P. A., Rasmussen, K. J., Andersen, J. R., Johansen, L. E., Hansen, L. H., Jespersen, H., Podtelejnikov, A., Nielsen, E., Crawford, J., Poulsen, V., Sorensen, B. D., Matthiesen, J., Hendrickson, R. C., Gleeson, F., Pawson, T., Moran, M. F., Durocher, D., Mann, M., Hogue, C. W., Figeys, D., and Tyers, M. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415, 180-183
9. von Mering, C., Krause, R., Snel, B., Cornell, M., Oliver, S. G., Fields, S., and Bork, P. (2002) Comparative assessment of large-scale data sets of protein-protein interactions. Nature 417, 399-403
10. Natsume, T., Nakayama, H., and Isobe, T. (2001) BIA-MS-MS: biomolecular interaction analysis for functional proteomics. Trends Biotechnol. 19, S28-S33
11. Kemmeren, P., van Berkum, N. L., Vilo, J., Bijma, T., Donders, R., Brazma, A., and Holstege, F. C. (2002) Protein interaction verification and functional annotation by integrated analysis of genome-scale data. Mol. Cell 9, 1133-1143
12. Fellenberg, M., Albermann, K., Zollner, A., Mewes, H. W., and Hani, J. (2000) Integrative analysis of protein interaction data. Proc. Int. Conf. Intell. Syst. Mol. Biol. 8, 152-161
13. Schwikowski, B., Uetz, P., and Fields, S. (2000) A network of protein-protein interactions in yeast. Nat. Biotechnol. 18, 1257-1261
14. Ito, T., Chiba, T., and Yoshida, M. (2001) Exploring the protein interactome using comprehensive two-hybrid projects. Trends Biotechnol. 19, S23-S27
15. Strogatz, S. H. (2001) Exploring complex networks. Nature 410, 268-276
16. Jeong, H., Mason, S. P., Barabási, A.-L., and Oltvai, Z. N. (2001) Lethality and centrality in protein networks. Nature 411, 41-42
17. Girvan, M., and Newman, M. E. (2002) Community structure in social and biological networks. Proc. Natl. Acad. Sci. U. S. A. 99, 7821-7826
18. Snel, B., Bork, P., and Huynen, M. A. (2002) The identification of functional modules from the genomic association of genes. Proc. Natl. Acad. Sci. U. S. A. 99, 5890-5895
19. Rigaut, G., Shevchenko, A., Rutz, B., Wilm, M., Mann, M., and Seraphin, B. (1999) A generic protein purification method for protein complex characterization and proteome exploration. Nat. Biotechnol. 17, 1030-1032
20. Oda, Y., Huang, K., Cross, F. R., Cowburn, D., and Chait, B. T. (1999) Accurate quantitation of protein expression and site-specific phosphorylation. Proc. Natl. Acad. Sci. U. S. A. 96, 6591-6596
21. Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H., and Aebersold, R. (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 17, 994-999
22. Le Douarin, B., Pierrat, B., vom Baur, E., Chambon, P., and Losson, R. (1995) A new version of the two-hybrid assay for detection of protein-protein interactions. Nucleic Acids Res. 23, 876-878
23. Vidal, M., Brachmann, R. K., Fattaey, A., Harlow, E., and Boeke, J. D. (1996) Reverse two-hybrid and one-hybrid systems to detect dissociation of protein-protein and DNA-protein interactions. Proc. Natl. Acad. Sci. U. S. A. 93, 10315-10320
24. Inouye, C., Dhillon, N., Durfee, T., Zambryski, P. C., and Thorner, J. (1997) Mutational analysis of STE5 in the yeast Saccharomyces cerevisiae: application of a differential interaction trap assay for examining protein-protein interactions. Genetics 147, 479-492
25. Serebriiskii, I. G., Mitina, O., Chernoff, J., and Golemis, E. (2001) Two-hybrid dual bait system to discriminate specificity of protein interactions in small GTPases. Methods Enzymol. 332, 277-300
26. Kubota, H., Ota, K., Sakaki, Y., and Ito, T. (2001) Budding yeast GCN1 binds the GI domain to activate the eIF2α kinase GCN2. J. Biol. Chem. 276, 17591-17596

566 Molecular & Cellular Proteomics 1.8