Analysis and modelling of protein interaction networks


Analysis and modelling of protein interaction networks - A study of the two-hybrid experiment Karin Stibius Jensen

Master's thesis by Karin Stibius Jensen, 10th May 2004

The front page shows a network representation of the protein-protein interactions in yeast [22]. Each point represents a different protein, and each line indicates that two proteins are capable of binding to one another. Only the largest cluster, which contains 78 % of all proteins, is shown. The color of a node signifies the phenotypic effect of removing the corresponding protein (red, lethal; green, non-lethal; orange, slow growth; yellow, unknown).

Acknowledgments

This thesis was done at the Niels Bohr Institute. I would like to thank my supervisor, Kim Sneppen, for suggesting the subject and making it possible for me to work on the project. I would also like to thank Ph.D. student Jacob Bock Axelsen for comments and discussions, and cand.scient. Mette Rasmussen for great support when things did not go my way. A special thanks to NOVO, whose scholarship made everyday life a little easier. Last but not least, to my family and my boyfriend Thomas: thank you for putting up with me through periods of frustration and for supporting me throughout the whole process.

10th May 2004
Karin Bagger Stibius Jensen

Abstract

In this thesis I have studied protein interaction networks. Most of the work was done on a large-scale data set from a two-hybrid experiment carried out by Ito et al. in Japan [1]. The two-hybrid experiment finds protein-protein interactions between manufactured hybrid proteins in the yeast S. cerevisiae. Each test involves two hybrid proteins. One hybrid is the DNA-binding domain of the transcriptional activator GAL4 fused with a protein (the bait). The other hybrid is the RNA-polymerase II activation domain of GAL4 fused with another protein (the prey). Because of this bait-prey setup, the resulting network is examined as a directed network (bait → prey).

We find a clear asymmetry in this network: proteins working as bait do not interact with the same proteins as when they work as prey. Furthermore, the asymmetry can be quantified as a systematic tendency for proteins acting as bait to have larger connectivities than proteins acting as prey. I have investigated two possible scenarios for the asymmetry by developing a biochemical model of the protein-DNA and protein-protein binding inside the living yeast. One scenario assumes a background activity of bait proteins acting even without the prey. The other scenario explores the asymmetry in the chemistry that arises because the bait is automatically located in the right position on the DNA. We conclude that the latter model gives the best description of the observed asymmetry.
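The bait/prey asymmetry described in the abstract can be made concrete by counting, for each protein, how many distinct partners it has when acting as bait and how many when acting as prey. The following Python lines are a minimal sketch of that bookkeeping, not the analysis code used in the thesis; the function name, the protein labels and the interaction list are invented for illustration.

```python
from collections import defaultdict

def bait_prey_connectivity(interactions):
    """Return (k_bait, k_prey): distinct partners of each protein in each role.

    `interactions` is a list of directed links (bait, prey).
    """
    preys_of = defaultdict(set)   # bait -> preys it was found to bind
    baits_of = defaultdict(set)   # prey -> baits that were found to bind it
    for bait, prey in interactions:
        preys_of[bait].add(prey)
        baits_of[prey].add(bait)
    proteins = set(preys_of) | set(baits_of)
    k_bait = {p: len(preys_of[p]) for p in proteins}
    k_prey = {p: len(baits_of[p]) for p in proteins}
    return k_bait, k_prey

# Hypothetical toy data: directed links written as (bait, prey).
interactions = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "A"), ("C", "D")]
k_bait, k_prey = bait_prey_connectivity(interactions)
print(k_bait["A"], k_prey["A"])
```

In this toy network protein A has three partners as bait but only one as prey, which is the kind of imbalance the thesis quantifies on the real data.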

Contents summary

The subject of this thesis is protein-protein interactions from a biophysical point of view. I have divided the thesis into three parts.

Part I: Introduction

Chapter 1: Yeast. First I give an introduction to the model organism, the yeast Saccharomyces cerevisiae, from which the protein-protein interaction data we analyze are taken. I review basic cell physiology, DNA, proteins, transcription, translation and regulation. As background for understanding the experimental setup I also look at the different yeast cell types (haploid and diploid), the mating of yeast, different yeast strains, nutritional markers and the insertion of plasmids.

Chapter 2: The two-hybrid experiment. The main experimental data we have examined come from one large-scale two-hybrid experiment [1]. In this chapter I explain the two-hybrid technique and the underlying ideas involving bait and prey hybrid proteins, as well as selection by nutritional markers. Two labs in the world, [1] and [2], have created large-scale data sets of protein-protein interactions in Saccharomyces cerevisiae using the two-hybrid technique. I describe their approach and give an introduction to the data they find. As a comparison to the two-hybrid system I introduce another technique able to give large-scale protein interaction data. This technique uses mass spectrometry for the identification of protein complexes.

Chapter 3: Networks. The data give us a network where the proteins are nodes and an interaction between two proteins is a link. I introduce some techniques for analyzing a network: finding connectivity, power-law distributions, randomizing, and finding specific structures and properties of the network.

Part II: Data analysis

Chapter 4: Two-hybrid data. Here I describe and analyze the two-hybrid data from the lab of Takashi Ito et al. in Japan [1]. Particular emphasis is placed on describing the asymmetry between bait and prey in the experiment and the triangle connections found in the data set.

Chapter 5: Mass spectrometry data. Here the data set from Yuen Ho et al. is presented and analyzed, and I compare the results to those of the two-hybrid data.

Part III: Modelling the two-hybrid system

Chapter 6: Introduction to the models. I briefly discuss why it is interesting to look at a model. I describe how I generated scale-free networks and present one of these. I also present how we use a scale-free network to simulate protein concentrations in a cell.

Chapter 7: The models. I present three models, all based upon the chemistry of the binding of molecules to DNA. The first model is symmetric in bait and prey and takes into account the chemistry of the binding of the bait-prey complex to DNA. In the second model, asymmetry is created by adding a random firing term to the equations of the symmetric model. In the third model, asymmetry is created from the chemistry of individually binding bait to DNA and binding prey to bait. All models are analyzed following the same principles as for the two-hybrid data and are finally compared to the results from these data.

Chapter 8: Discussion and future prospects. I sum up the results and discuss the reliability of the two-hybrid data. The different hypotheses we have for the asymmetry seen in the two-hybrid experiment are compared to the data.

Chapter 9: Conclusion. The main results of the thesis are presented in a final conclusion.

Appendix A: Change of parameters in the models. Throughout the modelling in chapters 6 and 7 I have used the same few parameters. Here I show a few graphs using different parameters for the real and model networks.

Appendix B: Matlab programs. A few remarks about the Matlab programs I have written.

Appendix C: Glossary. A short explanation of the words marked in a sans serif font in the text.


Contents

1 Yeast
  1.1 The central dogma
  1.2 Regulation of transcription
  1.3 The yeast cell
    1.3.1 Yeast strains
    1.3.2 Markers
    1.3.3 Replication and mating
  1.4 Plasmid
2 The two-hybrid experiment
  2.1 The two-hybrid method
    2.1.1 Creating the hybrid proteins
    2.1.2 Hybrid proteins in yeast
  2.2 The high-throughput approach
    2.2.1 The experiment
    2.2.2 Summation
  2.3 Other experiments on protein interactions
    2.3.1 Mass spectrometry
3 Networks
  3.1 Random networks
    3.1.1 Naturally occurring networks
  3.2 Analyzing networks
    3.2.1 Connectivity
    3.2.2 Connectivity distribution and scaling
    3.2.3 Correlation between bait and prey
  3.3 Structure in networks
    3.3.1 Motifs
  3.4 Randomization of networks
    3.4.1 Keeping connectivity
    3.4.2 Table of values
4 Two hybrid data
  4.1 The Ito data
  4.2 Data analysis
    4.2.1 Bait and prey connectivity
    4.2.2 Connectivity distribution
  4.3 The triangle motif
5 Mass spectrometry data
  5.1 Data analysis
    5.1.1 Bait and prey connectivity
    5.1.2 Connectivity distribution
    5.1.3 Triangles in the mass spectrometry network
6 Introduction to the models
  6.1 Simulating the network
    6.1.1 Triangles in simulated networks
  6.2 Creating a protein interaction network
7 The models
  7.1 Effect from DNA binding
  7.2 The symmetric approach (Model 1)
    7.2.1 Applying the symmetric model
  7.3 Asymmetry from random firing (Model 2)
    7.3.1 Applying the random firing model
  7.4 Asymmetry from DNA binding (Model 3)
    7.4.1 Applying the DNA binding model
8 Discussion and future prospects
9 Conclusion
A Change of parameters in the models
  A.1 Random firing
    A.1.1 γ = 2.5, changing the threshold values
    A.1.2 γ = 1.5
    A.1.3 γ = 2.1
    A.1.4 γ = 3
  A.2 DNA binding
    A.2.1 γ = 2.5
    A.2.2 γ = 1.5
    A.2.3 γ = 2.1
    A.2.4 γ = 3
B Matlab programs
C Glossary


Part I: Introduction


Introduction

Experimental techniques arising in the late 1970s have made it possible to sequence DNA. The genomes of several organisms, among them humans, have been sequenced. Many genes in a genome encode proteins of unknown function. Knowing the function of all proteins in a specific biological process is essential in order to understand the details of the process. Often it is not only the protein's function in a particular process that is unknown, but also which process the protein participates in. The function of a protein is more easily determined if it is known where to look for it. Here the two-hybrid method can be useful. If a novel protein is found to interact with known proteins in a specific process, there is good reason to look for the novel protein in that process by other means, in the hope of discovering its function. It is therefore important to find criteria for selecting the interactions most likely to be of biological importance.


Chapter 1 Yeast

Yeast is a eukaryotic organism, see figure 1.1. This means that the cell structure of yeast is largely the same as that of human cells. There are many advantages in studying yeast compared to studying a human cell. For one, the yeast genome is much smaller than the human genome. Both genomes have been sequenced. The yeast genome was sequenced in 1996, and over 6000 open reading frames (ORFs) coding for approximately 6000 proteins were found. This can be compared to the human genome project, which found approximately 30,000 genes in a human, coding for an estimated 60,000-100,000 different proteins.

Besides the advantage of a smaller genome, yeast is also well suited for study because:

- It grows rapidly, with a doubling time of approximately 90-140 min depending on nutrition.
- Replica plating is easy.
- Mutant strains can easily be identified.
- S. cerevisiae is viable with many different markers.
- Yeast is nonpathogenic and can therefore be handled without special precautions.
- Unlike many other microorganisms, Saccharomyces cerevisiae has both a stable haploid and a stable diploid state that can be used for genetic manipulation.

Figure 1.1: Modified from [3]. On the left is a schematic representation of the central dogma in a prokaryotic cell. The prokaryotic cell has no nucleus, which means that the transcription from DNA to mRNA, the translation from mRNA to protein and the subsequent folding of the protein all take place in the same medium within the cell. On the right is a schematic representation of a eukaryotic cell. Here the transcription from DNA to pre-mRNA takes place inside the cell nucleus. Inside the nucleus the pre-mRNA is processed to yield mRNA ready for transport out of the nucleus, where it is translated into protein. The presence of pre-mRNA allows for variation of the produced proteins at the RNA level, since the same pre-mRNA strand can result in different processed mRNAs.

1.1 The central dogma

The basic structure of a eukaryotic cell compared to a prokaryotic cell can be seen in figure 1.1. Unlike the prokaryotic cell, the eukaryotic cell is divided into compartments. Most important for our study is the nucleus, where the chromosomes are located. The transcription from DNA to RNA takes place inside the nucleus, see figure 1.1.

The process described in figure 1.1 is called the central dogma of molecular biology. It shows the path from DNA to RNA to protein. First RNA-polymerase needs to bind to the DNA strand in order to transcribe the DNA to mRNA. The site on the DNA strand where RNA-polymerase binds is called the promoter site. Once RNA-polymerase is bound to DNA, it transcribes the bases A, T, G and C on the DNA strand into the bases A, U, G and C on an mRNA strand. In eukaryotic cells the pre-mRNA is processed by splicing before it reaches the final stage, ready for transport and translation.

The process of transcription can be regulated by proteins that bind to the DNA and either help RNA-polymerase bind to the promoter site (activators) or hinder RNA-polymerase from binding to the promoter site (repressors). The site where the activator or repressor proteins bind is called the promoter-proximal element. The process will be explained in greater detail later.

The processed mRNA strand is transported out of the nucleus for translation into protein. The translation is done by the ribosomes. A set of three bases (a codon) is translated into one of 20 amino acids. The resulting chain of amino acids is the protein encoded by the gene in question. Finally the protein folds into a 3-D structure and can then carry out its intended function in the cell.

1.2 Regulation of transcription

Proteins are essential to life. They are part of the structure that builds the cells, they function as pores in the membranes, they catalyze reactions (enzymes) and they regulate functions like transcription. They also function in big complexes like the ribosomes and RNA-polymerase.

As mentioned, some proteins are able to bind to the DNA strand and regulate the transcription of a particular gene. These proteins are called transcriptional regulators. They can act so that the gene is transcribed either in larger amounts (activator) or in smaller amounts (repressor), see figure 1.2. Sometimes a protein actually needs to bind to the DNA for the transcription process to work at all.

Such transcriptional activators can, for example, function by having one domain that binds strongly to a specific sequence on the DNA and another domain that helps RNA-polymerase bind to the DNA promoter sequence and start transcription. Without the transcriptional activator the specific gene will not be transcribed. An example of such a transcriptional activator is Gal4 in yeast, which we shall look at more closely when discussing the two-hybrid system.
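The two steps of the central dogma described in section 1.1, transcription (T replaced by U) and translation (one codon of three bases per amino acid), can be sketched in a few lines of Python. This is an illustrative toy only: the input is assumed to be the coding strand of a gene, splicing and regulation are ignored, the codon table is deliberately partial, and the example sequence is made up.

```python
# Partial codon table (mRNA codon -> amino acid); only the codons used below.
CODON_TABLE = {
    "AUG": "Met", "AAA": "Lys", "GGC": "Gly", "UUU": "Phe",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def transcribe(coding_strand):
    """The mRNA carries the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start < 0:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "STOP":
            break
        protein.append(residue)
    return protein

# Made-up toy gene: ATG AAA GGC TTT TAA
mrna = transcribe("ATGAAAGGCTTTTAA")
protein = translate(mrna)
print(mrna, protein)
```

A full implementation would need all 64 codons; the partial table keeps the sketch short and raises a KeyError for any codon it does not know.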

Figure 1.2: Examples of how proteins binding to DNA can change the transcription rate of a gene. Transcriptional activators can function by binding to the promoter-proximal element on the DNA and helping RNA-polymerase bind to a specific promoter site, thereby enhancing the rate of transcription of that particular gene. Transcriptional repressors can function by binding to the DNA in a way that blocks the binding of RNA-polymerase and thus the transcription of the particular gene.

1.3 The yeast cell

The laboratory yeast strain S288C of Saccharomyces cerevisiae was the first eukaryote to have its entire genome sequenced. The project started in 1989, finished in the spring of 1996, and was the work of more than 100 laboratories from Europe, USA, Canada and Japan. The result was that the total nuclear genome of S. cerevisiae contains 16 different, well-characterized chromosomes comprising more than 13 million bases. In the nuclear genome over 6000 ORFs have been identified, of which approximately 3300 code for known proteins. The size and composition of the haploid and diploid cells can be seen in table 1.1.

Characteristic      Haploid cell    Diploid cell
Size                4 µm            5 × 6 µm
Shape               spheroid        ellipsoid
Volume              70 µm³          120 µm³
Composition:
  Wet weight        60 pg           80 pg
  Dry weight        15 pg           20 pg
  DNA               0.017 pg        0.034 pg
  RNA               1.2 pg          1.9 pg
  Protein           6 pg            8 pg

Table 1.1: From [4]. Values of some physical characteristics of the haploid and diploid cells of S. cerevisiae.

1.3.1 Yeast strains

There are many different strains of S. cerevisiae. No real wild type strain of S. cerevisiae is used in genetic studies, but the strain S288C is often used as the normal standard because its genome has been sequenced. Other strains contain mutations in one or several genes compared to S288C. Characteristic traits of a specific strain can, for example, be temperature sensitivity, mating type, dependence on a specific nutrient or immunity to some toxin. There are databases that characterize deletion strains, where one of the nonessential ORFs is deleted and the phenotype of the new strain is characterized. In the Saccharomyces Genome Deletion Project, 95 % of the approximately 6200 ORFs have been deleted, which means that more than 20,000 different deletion strains are available. Important for this thesis are strains with a specific mating type, which we describe later, and the presence of nutritional markers.

1.3.2 Markers

Markers are very useful for making sure that only a specific strain, or a number of strains with a specific property, is present in the experiment. Markers can be very different. Nutritional markers are useful since there are several different markers that only require the presence or absence of a single gene. Examples of nutritional markers are strains lacking functional TRP1 or LEU2. Some markers allow both negative and positive selection. This means that the mutant carrying the marker will survive under some external condition where a strain without the marker will die, while under other conditions the strain with the marker will die and the strain without the marker will survive. Other markers distinguish themselves by showing a specific phenotype (color, growth pattern etc.) different from that of strains without the marker.
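The logic of selection by nutritional markers can be illustrated with a small sketch. It assumes a strongly simplified picture in which a strain grows on a medium only if it carries a functional gene for every nutrient the medium lacks, with TRP1 needed to make tryptophan and LEU2 needed to make leucine. Real biosynthetic pathways involve many more genes, and the strain definitions below are hypothetical.

```python
# Simplified, assumed mapping: nutrient the cell must make itself -> marker gene.
SYNTHESIS_GENE = {"tryptophan": "TRP1", "leucine": "LEU2"}

def grows(functional_genes, nutrients_absent_from_medium):
    """A strain grows only if it can synthesize every nutrient the medium lacks."""
    return all(SYNTHESIS_GENE[n] in functional_genes
               for n in nutrients_absent_from_medium)

# Hypothetical strains, described by the marker genes that are functional.
strain_without_leu2 = {"TRP1"}
strain_without_trp1 = {"LEU2"}

print(grows(strain_without_leu2, {"leucine"}))      # medium lacking leucine
print(grows(strain_without_trp1, {"leucine"}))
```

Plating on a medium that lacks one nutrient therefore selects for exactly those strains that carry the corresponding functional gene.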

1.3 The yeast cell Chapter 1 Beside using markers it is also useful to look at reporter genes. These are genes that are only expressed if a certain biological process is functioning. This will often be a gene inserted in a place in the genome regulated by a certain transcription factor. ADE2, HIS3 or URA3 can be used as reporter genes. Such strains will be unable to grow in media laking adenine, histidine and uracil respectively unless the reporter genes are activated. Sometimes the reporter genes are genes from other organisms. It could for example be a gene that codes for the green fluorescent protein, that, as the name implies, will make the cell green and fluorescent. 1.3.3 Replication and mating S. Cerevisiae replicates by budding in which the daughter cell is initiated as an outgrowths from the mother cell, see figure 1.3. Both the haploid and Figure 1.3: From [5]. Seen is an electron microscope wiew of a diploid cell of S. Cerevisiae on the verge of replication by budding. the diploid state of S. Cerevisiae can be maintained through cell cycles and replication. The bud on haploid cells appears adjacent to the previous one, whereas the bud on diploid cells appears on the opposite pole. Each mother cell, whether haploid or diploid, usually form no more than 2-3 buds in it s lifetime. This also means that the age of a particular cell can be counted by the number of scars left on the cell wall. The haploid yeast cell have one of two mating types, either a or α. Which is decided by which of the genes, MATa or MATα, is expressed in the cell. Haploid yeast produce mating factors and pheromone receptors. See figure 1.4. The a cells produce a factor and α receptors, and the α cells produce α factor and a receptors. When mating factors bind to the receptors on the 2

1.3 The yeast cell Chapter 1 surface of the haploid cells, mating will be induced between haploid cells of opposite type, and this mating will result in the formation of a diploid cell. With enough nutrition the diploid cell will undergo mitotic growth by budding. Under starvation conditions the diploid cell will undergo meiosis and form an ascus with four haploid ascospores, two of each mating type. These can either mate to form two diploid cells or, if separated from each other, each of the four ascospores can grow as a haploid strain. Vegetative growth Haploid MATa a a factor α receptor Haploid MATα Mixing and mating α factor Vegetative growth α a receptor Diploid MATa/MATα Vegetative growth a/α Non mating Ascospore Upon stavation Meiosis sporulation Ascospore Separation Separation Acsus Figure 1.4: Modified from [6]. Seen is the mating scheme of S. Cerevisiae where two haploid cells of different mating types can be mated and yield a diploid cell. Under good nutritional conditions the diploid cells will replicate into a colony of diploid cells. Upon starvation however the cells will sporulate. Physical separation of ascospores of different mating type makes it possible to grow haploid colonies of each mating type thus returning to the top of the figure. It should be mentioned that maintaining a specific mating type under replication requires a mutation in the HO allele leaving the cells homothallic. In homothallic strains, like wild type yeast cells and other strains that contain the HO allele, the haploid cells will switch mating type with some frequency depending on the strain. This means that the haploid cells can only be maintained for a certain period before some switch mating type and mate. Most strains used in the laboratory do not contain a working HO allele and are thereby heterothallic strains that do not switch mating type. These 21

1.4 Plasmid Chapter 1 strains can be maintained stable as both haploid and diploid cells, and the haploid cells can mate only if an a cell is brought in contact with an α cell. Controlled crosses of MATa and MATα haploid strains can be carried out by mixing similar amounts of each strain. To test for only the diploid cells one often choose to mate haploid strains with different nutritional markers so that only the diploid cells that contain markers from both strains are selected. 1.4 Plasmid A Plasmid is a small circular double strand of DNA. Plasmids are present inside the cells but replicate independently of the DNA in the chromosomes. Often plasmids are found inside bacteria cells from where they can be extracted and transferred to another cell. Techniques of molecular biology have made it possible to manipulate these plasmids to produce any gene by inserting the gene sequence into the plasmid DNA. These manipulated plasmids can be introduced into a host organism like yeast and use the yeast s own translation and transcription system to produce the genes encoded in the plasmid DNA. Some plasmids are able to incorporate themselves into the hosts genome, while other types of plasmids will stay as circular strands of DNA in the cell. Plasmids are very useful. This is for example the approach used when producing insulin for humans with diabetes. A plasmid encoding the gene for insulin is incorporated into an E-coli bacterial cell thereby making it possible for the E-coli cell to produce insulin. In yeast plasmids can be used to make a protein that the cell needs but are unable to produce for example because of some mutation in the genome. If a yeast strain has a mutation that makes it impossible for it to make a specific protein needed for survival under certain conditions, we can incorporate a plasmid encoding the gene for the protein making the cell able to survive. Also it is possible to test if a different protein is able to work instead of the original one. 
Another useful procedure involving plasmids is the making of hybrid proteins, see figure 1.5. Here a hybrid protein is produced by expressing a composite gene from a plasmid. The gene in figure 1.5 is composed of a part of the gene for the GAL4 protein from yeast and a gene for another protein of interest. The two genes are placed next to each other without any termination sequence between them, so the hybrid protein is produced as one long polypeptide chain.

Figure 1.5: From [7]. Schematic representation of the creation of a plasmid bearing the gene sequence of a hybrid protein. The gene sequence of interest from an animal cell (or another cell) and the Gal4 binding domain sequence are placed adjacent to each other in the plasmid. The plasmid contains a promoter sequence on one side of the composite gene and a termination sequence on the other side. Thus, when the plasmid is transfected into a cell, the cell will express the composite gene as one protein (the bait).


Chapter 2 The two-hybrid experiment

The two-hybrid experiment is used for studying protein-protein interactions. By protein-protein interactions we refer to the physical binding of two proteins, typically by hydrogen bonding and electrostatic forces, although van der Waals forces and hydrophobic forces can also play a role. Studying this is interesting because protein binding is very important for the functions of the cell. Many proteins need to bind in complexes in order to perform their task, which could be the regulation of transcription or the opening or closing of a transport channel in the membrane. Protein binding is also used to mediate signals through the cell. When one protein binds to another there may be a change of conformation, so that the protein after binding is able to bind to a site it was not able to bind to before. A change of conformation can come about either by changing the folding of the protein or by adding or removing a functional group.

Some proteins perform only one specific task in the cell, whereas others mediate signals between different parts of the cell, for example to synchronize two biological processes. There are many proteins with functions unknown to us. In order to study these proteins it may be very useful to look for interactions between them and proteins of known function. When screening for protein-protein interactions one occasionally finds that a protein previously associated with a specific biological process also interacts with proteins not associated with this process. This might indicate a relation between the different biological processes that is necessary for the cell or organism to work as a whole.

The two-hybrid method for detecting protein-protein interactions was first developed in 1989 [8], [9]. Since then the method has been modified so that it can now be used in a high-throughput automated experiment, where interactions of thousands of proteins are tested.

2.1 The two-hybrid method

The yeast two-hybrid system builds on the nature of the transcriptional activator GAL4. The GAL4 protein of Saccharomyces cerevisiae is required for the expression of genes encoding enzymes needed in galactose utilization, GAL1 and GAL10. The protein consists of two separable domains. One domain (N-terminus) binds specifically to a DNA sequence, and the other domain (C-terminus) activates transcription by recruiting RNA polymerase II. Native GAL4 usually binds to the DNA strand as a dimer. However, most of the time we will consider the activation of transcription to be caused by the binding of a single molecule.

Figure 2.1: From ref [1]. Schematic representation of the GAL4 dimer bound to DNA. The native GAL4 protein is 881 amino acid residues long, where residues 1-147 (1-98) form the DNA binding domain, and residues 148-196 and 768-881 form the activation domains, which can work independently of each other.

The idea of the two-hybrid experiment, see figure 2.2, is to take the binding domain, BD, of GAL4 and fuse it to a protein, X, and to take the activation domain, AD, of GAL4 and fuse it to a protein, Y (further explanation of the process can be found in section 2.1.1). The hybrid protein consisting of GAL4 BD and X is called the bait protein, and the hybrid protein with GAL4 AD fused to Y is called the prey protein. In one experiment we need to create two hybrid proteins, one bait and one prey, thus the name of the experimental method. Introducing the two hybrid proteins into a yeast strain with no native GAL4 makes it possible to detect whether the two proteins, X and Y, interact or not.

When the bait protein is present in the cell it can bind to the same DNA sequence as the normal GAL4 protein. This is because the bait protein contains the part of the GAL4 protein that actually binds to the DNA sequence

(the BD). The bait protein alone, on the other hand, cannot activate transcription, because it is missing the part of the GAL4 protein that performs this task (the AD). The prey hybrid protein can activate transcription by helping RNA polymerase II to bind to the promoter site of a specific gene, because it contains the AD from native GAL4. The prey protein, however, does not contain the GAL4 BD and is therefore not able to bind to the specific DNA sequence. Without binding to the right DNA sequence it is not possible to direct RNA polymerase to the right promoter, and transcription of the reporter gene will not be initiated.

Figure 2.2: From [7]. A schematic presentation of the two-hybrid experiment.

If the bait and prey proteins interact, the binding and activation domains of GAL4 come close enough to activate transcription as if they were native GAL4. If the bait and prey proteins do not bind there is nothing to activate transcription, and the reporter genes are not expressed. The expression of the reporter genes is necessary for galactose utilization, and the yeast cell will therefore not be able to survive in a medium where galactose is the main nutrient. Alternatively, other reporter genes can replace the genes naturally placed in the ORF. These can, for example, be genes encoding a specific dye, so that the level of expression can be quantified by the intensity of color. The reporter genes could also be other genes necessary for survival in the presence or absence of some nutrient (for example ADE2, HIS3 or URA3, see section

1.3.2). Often several reporters are used in one experiment to ensure positive selection.

2.1.1 Creating the hybrid proteins

The hybrid proteins are made by creating two plasmids, see figure 2.3. One expresses the bait protein of interest. This is the gene sequence for the GAL4 BD fused with that of the X protein. The start codon of this hybrid is in front of the GAL4 BD, and the stop codon is after the X protein. In this way the hybrid protein is one long polypeptide chain that folds up into the final bait hybrid. Likewise, another plasmid expresses the prey hybrid protein, where the GAL4 AD is fused with the Y protein of interest. The hybrid proteins are assumed to fold up into two domains, one with the structure of the GAL4 part and one with the structure of the fused protein (X or Y).

Figure 2.3: From [7]. Schematic representation of the bait and prey plasmids. The bait plasmid carries the Gal4 binding domain sequence fused to X together with the TRP1 marker, and encodes NH2-Gal4 BD-X-COOH. The prey plasmid carries the Gal4 activation domain sequence fused to Y together with the LEU2 marker, and encodes NH2-Gal4 AD-Y-COOH. TRP1 and LEU2 are nutritional markers for selection in yeast to ensure that the plasmids are indeed present in the yeast cell.

The plasmids carry markers for selection in yeast and E. coli. (Usually the plasmids are purified in E. coli before they are transferred to yeast. Markers in E. coli are typically Amp^r or Kan^r, which make the E. coli cell able to live in a medium containing ampicillin or kanamycin, respectively.)

The nutritional markers are introduced to ensure that the plasmid is indeed present in the cell in question. One approach to making these plasmids can be seen in figure 1.5. Here the plasmids are created and then introduced into E. coli cells. The plasmids contain a marker for selection in E. coli, a marker for selection in a strain of yeast, and the gene for the hybrid protein in question. The E. coli cells are plated on a medium containing, for example, ampicillin if the E. coli marker is Amp^r, so only cells containing the plasmid survive for further growth. In this way the E. coli cells carrying the specific plasmid are isolated, and the plasmid is ready for transfection into a yeast host.

2.1.2 Hybrid proteins in yeast

Haploid yeast comes in one of two mating types, a or α. These can be mated to form a diploid a/α yeast cell, see section 1.3.3. A strain of haploid a yeast, S. cerevisiae, is transfected with the bait plasmid, and a strain of haploid α yeast, S. cerevisiae, is transfected with the prey plasmid. To ensure that the plasmid is indeed present in the cells, yeast cells exposed to the bait plasmid can be plated on a medium positively selecting for the yeast marker on the bait plasmid, and vice versa for the yeast cells transfected with the prey plasmid.

The MATa and the MATα strains are mated to form a diploid yeast cell containing both a bait and a prey plasmid. Here the presence of both plasmids can be confirmed by plating the cells on a medium selecting for both the yeast marker in the bait plasmid and the yeast marker in the prey plasmid. With the diploid yeast it is now possible to select for the reporter genes by plating the cells on a medium that requires their function. As mentioned in section 1.3.2 and section 2.1 there are often several reporters. They may be native genes, or the yeast strain may be mutated to include reporters in particular ORFs.
When the cells are plated on a medium selecting for the specific reporters present in the yeast cell, the presence of the proteins encoded by the reporter genes signals that the X protein of the bait hybrid binds to the Y protein of the prey hybrid. When a positive interaction between bait (X) and prey (Y) proteins is found, the DNA encoding the two proteins is extracted, amplified by PCR and sequenced. These sequences are referred to as Interaction Sequence Tags (ISTs). Notice that one experiment tests for interaction between two specific hybrid proteins, one bait and one prey.
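The selection logic of sections 2.1 and 2.1.2 can be summarised in a few lines of Python. This is only a sketch of the idealised rule (reporter expression requires a BD hybrid, an AD hybrid, and binding between X and Y). The class `Hybrid`, the function `reporter_active` and the set `known` are hypothetical names introduced for illustration, and the listed interaction is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hybrid:
    gal4_domain: str  # "BD" for a bait hybrid, "AD" for a prey hybrid
    protein: str      # name of the fused protein (X or Y)


def reporter_active(bait: Optional[Hybrid], prey: Optional[Hybrid], interactions) -> bool:
    """Idealised two-hybrid readout.

    `interactions` is a hypothetical set of frozensets, each naming a pair
    of proteins that bind each other. The reporter is transcribed only if
    a BD hybrid and an AD hybrid are both present and X binds Y.
    """
    if bait is None or prey is None:
        return False  # a single hybrid cannot activate transcription
    if bait.gal4_domain != "BD" or prey.gal4_domain != "AD":
        return False  # both GAL4 domains are needed
    return frozenset((bait.protein, prey.protein)) in interactions


# Made-up example: X binds Y, but X does not bind Z.
known = {frozenset(("X", "Y"))}
bait = Hybrid("BD", "X")
prey_y = Hybrid("AD", "Y")
prey_z = Hybrid("AD", "Z")
```

Here `reporter_active(bait, prey_y, known)` is true, while the bait alone, or the bait together with `prey_z`, leaves the reporter silent. Real experiments deviate from this rule, for instance when a bait activates transcription on its own.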

2.2 The high-throughput approach

2.2.1 The experiment

By performing the experiment with many different combinations of bait and prey pairs, a network of interactions can be found. Such a network of protein-protein interaction data has been created by Takashi Ito and presented in 2001 [1]. Of the approximately 6000 proteins in the yeast S. cerevisiae, 3278 proteins were tested and 4549 interactions were found between them.

Figure 2.4: From [1]. Outline of the comprehensive two-hybrid analysis. T. Ito et al. cloned almost all yeast ORFs individually as a DNA-binding domain fusion (bait) in a MATa strain and as an activation domain fusion (prey) in a MATα strain, and subsequently divided them into pools, each containing 96 clones. These bait and prey clone pools were systematically mated with each other, and the diploid cells formed were selected for the simultaneous activation of three reporter genes (ADE2, HIS3 and URA3), followed by sequence tagging to obtain ISTs.

The test for interactions in this huge system requires close to $3.5 \times 10^{7}$ combinations. This was done by making 62 bait pools and 62 prey pools. In each of the bait pools there are 96 yeast clones, each with a different bait plasmid. Likewise, in each of the 62 prey pools there are 96 yeast clones with different prey plasmids (figure 2.4). The yeast cells that activate transcription in the absence of any interaction partners have already been removed from the experiment.
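The size of the screen can be checked with a few lines of arithmetic. The sketch below assumes 62 pools on each side and 96 clones per pool, the pool size given in the caption of figure 2.4. The variable names are chosen here for illustration.

```python
N_POOLS = 62           # number of bait pools and of prey pools (from the text)
CLONES_PER_POOL = 96   # assumption: pool size as in the caption of figure 2.4

clones_per_side = N_POOLS * CLONES_PER_POOL   # distinct baits (or preys)
pool_matings = N_POOLS * N_POOLS              # every bait pool against every prey pool
pairs_per_mating = CLONES_PER_POOL ** 2       # bait-prey pairs inside one mating
pair_combinations = clones_per_side ** 2      # bait-prey pairs covered in total

print(clones_per_side, pool_matings, pair_combinations)
# prints: 5952 3844 35426304
```

Under these assumptions 3844 pool matings cover about 3.5 × 10^7 bait-prey pairs, which is how pooling keeps the number of physical matings manageable.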

Each bait pool was systematically mated with each prey pool by mixing the cells in the two pools. Because mating occurs only between cells of opposite mating type, all diploid yeast cells contain both a bait and a prey plasmid. Once mated, the cells were plated onto a medium lacking adenine, histidine and uracil. The yeast cells were spread out so that the colonies would grow separately from each other. Only yeast cells in which the bait and prey hybrids bind are able to express the genes needed for survival. The survivors from this primary selection were then transferred to a second medium, reconfirming the activation of the three reporter genes (ADE2, HIS3 and URA3) and of a fourth gene, MEL1, also regulated by GAL4. The positive clones from this final test were then identified by sequencing (interaction sequence tags, ISTs).

2.2.2 Summation

• Creation of bait plasmid. Example markers: Kan^r, resistance to kanamycin in E. coli hosts; TRP1, amino acid metabolism and tryptophan biosynthesis in yeast.
• Creation of prey plasmid. Example markers: Amp^r, resistance to ampicillin in E. coli hosts; LEU2, leucine biosynthesis.
• Transfection of a MATa yeast strain with the bait plasmid. Example of yeast strain: PJ69-2A (MATa, GAL2::ADE2, GAL1::HIS3, trp1, leu2, his3, ade2, gal4, gal80). MATa: mating type a.
• Transfection of a MATα yeast strain with the prey plasmid. Example of yeast strain: MaV204K (MATα, SPAL10::URA3, UASGAL1::HIS3, GAL1::lacZ, trp1, leu2, his3, ade2::kanMX, gal4, gal80). MATα: mating type α.
• Mating of MATa and MATα cells to yield a diploid MATa/α.
• Plating of the diploid cells on a medium selecting for diploid cells with active reporter genes ADE2, HIS3 and URA3.
• Positive colonies were plated onto another medium, again testing for function of the three reporter genes and in addition testing for function of a fourth reporter, MEL1, which is another target of the transcription factor GAL4.

• Surviving colonies were analyzed by PCR and sequencing to identify which particular hybrid proteins were present in the cell, giving the interaction sequence tag (IST), a pair of tag sequences for bait and prey.

This gives a large data set with recorded interactions of bait and prey proteins.

2.3 Other experiments on protein interactions

The two-hybrid experiment is not the only method for detecting protein-protein interactions. One example is the use of microarrays, where the entire proteome can be mapped out and the proteins tested for binding to the different sites on the microarray [11]. Other examples are the two high-throughput approaches for detection of protein complexes in the yeast Saccharomyces cerevisiae, [12] and [13] (see also [14]). In both these experiments a protein, A, inside the yeast is tagged (the bait protein). The bait protein, together with any proteins, B, that might bind to it, can be trapped and analyzed in order to find out which proteins bind in complexes. This analysis is done by mass spectrometry. I shall later analyze the data from the experiment done by Y. Ho et al. [12], and I will therefore present the experimental setup in greater detail.

2.3.1 Mass spectrometry

The method is called high-throughput mass spectrometric protein complex identification (HMS-PCI) by the authors [12]. The principle of the experiment is that a bait protein is tagged with a Flag epitope tag (other protein sequences with high binding affinity for a specific compound can also be used) in order to trap complexes containing this particular protein. In the experiment a bait protein is created in much the same way as the bait protein in the two-hybrid experiment: a plasmid containing the Flag tag combined with the gene for the protein, A, is created and introduced into a strain of the yeast Saccharomyces cerevisiae. The yeast strain will now produce this bait hybrid protein along with its normal production of all other proteins, B.
The Flag tag is used to capture the bait protein along with any proteins, B, that are bound to it. The bait protein will usually be overexpressed compared to the normal expression rate for this particular protein, A, because the promoter used in the plasmid is strong (for example the GAL1 promoter). The yeast cells are grown and the protein complexes are formed. To detect and purify complexes in which the bait protein participates, the cells are lysed, meaning that all membranes are destroyed and the cell fluid and proteins are free to flow around. The tag now enables capturing of the bait and

associated proteins by immunoprecipitation. This is done by introducing the cell mixture to a trap that has very high affinity for binding the tag. The tagged protein, and hopefully the complex bound to it, is now ready for further examination. After separation, the complexes are resolved by SDS-polyacrylamide gel electrophoresis. This is a way to separate the proteins from each other. First all proteins are denatured with SDS, then the denatured proteins are put in a two-dimensional gel that separates them in one direction by charge and in the other direction by mass. The charge separation is done by applying an electric field over the gel. The two-dimensional gel is then stained to see where the proteins are abundant. The gel is divided into pieces, each of which is dissolved so that the proteins can be identified by mass spectrometry [15], [16].

The principle of mass spectrometry is that it detects the mass to charge ratio, m/z, of molecular ions. To identify a protein the mass spectrometer detects how the protein fragments when it is bombarded with electrons. This fragmentation is like a fingerprint for the protein and can be compared to the mass spectra of known proteins in a database. Sometimes there are several different proteins in the same gel spot, and the proteins cannot be resolved by fingerprinting alone. Therefore a method called tandem mass spectrometry (MS/MS) fragmentation [17] was used. Here the proteins were separated by mass in one mass spectrometer before fragmentation, and then the fingerprint of each of the separated proteins was found by a second mass spectrometer. In this way protein complexes were identified, so that each bait protein was associated with the proteins, B, that were precipitated along with the bait (these can be compared to the prey proteins of the two-hybrid experiment).
This method does not distinguish between different complexes with the same bait, and it does not tell exactly which proteins are in physical contact with each other. It does, however, say something about which proteins are associated with each other in biological functions.
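The ambiguity described above can be made concrete with a small sketch. It assumes, as a simplification, that a pull-down returns every protein that shares at least one complex with the bait. The function `pull_down` and the protein names are hypothetical.

```python
def pull_down(bait, complexes):
    """Proteins co-precipitated with `bait`: the union of all complexes
    that contain the bait, with the bait itself removed."""
    found = set()
    for members in complexes:
        if bait in members:
            found |= set(members)
    found.discard(bait)
    return found


# Two made-up situations with different complexes around the same bait "A".
one_large_complex = [{"A", "B1", "B2", "B3"}]
two_small_complexes = [{"A", "B1"}, {"A", "B2", "B3"}]

result_1 = pull_down("A", one_large_complex)
result_2 = pull_down("A", two_small_complexes)
```

Both situations give the same set of co-precipitated proteins, so the readout alone cannot tell them apart, nor can it say whether B2 touches A directly or only via B3.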

Figure 2.5: Schematic presentation of the mass spectrometry experiment for detection of protein complexes. A plasmid in the yeast cell produces a protein with a FLAG tag attached (bait). The cell is lysed, and the tag and associated proteins are captured by an affinity column coated with anti-FLAG, while proteins that do not bind to the bait are washed out. The captured proteins are separated by SDS electrophoresis (by charge and by size). The proteins are extracted from the gel and transferred to a mass spectrometer that can separate the proteins that were not separated by the SDS electrophoresis. Last, the mass spectrometry fingerprints (m/z) are found for each protein, and the specific proteins can be identified.

Chapter 3 Networks

In recent years the study of networks has become increasingly popular. Networks are all around us. Social interactions, airport traffic, internet connections and protein interactions in a cell are just some examples. So what is a network, how do we study it and what can this study tell us?

Figure 3.1: An example of a network. This small network has 6 nodes (h, i, j, k, l and m) and 8 links between them. The node i has total connectivity $C_i = 4$, and as a special case the node m has connectivity $C_m = 3$, since the link it has to itself is only counted once.

In short, a network is a number of nodes (points) connected by a number of links (lines), see figure 3.1. The number of nodes is the size of the network (here 6), whereas the number of links determines how connected the network is. The number of links connected to a particular node, i, is the connectivity of that node, e.g. $C_i = 4$, and the distribution of connectivities can tell a lot about a particular network.
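Counting connectivities is straightforward once the network is written as a list of links. The exact links of figure 3.1 cannot be read from the text, so the link list below is a hypothetical one, chosen only to reproduce the stated properties: 6 nodes, 8 links, $C_i = 4$ and $C_m = 3$ with a self-link on m counted once.

```python
from collections import defaultdict

# Hypothetical link list: 6 nodes and 8 links, including a self-link on m.
links = [
    ("i", "j"), ("i", "k"), ("i", "h"), ("i", "l"),
    ("m", "m"), ("m", "k"), ("m", "l"),
    ("j", "k"),
]


def connectivities(links):
    """Number of links attached to each node; a self-link counts once."""
    c = defaultdict(int)
    for a, b in links:
        c[a] += 1
        if b != a:       # do not count a self-link twice
            c[b] += 1
    return dict(c)


C = connectivities(links)
```

With this link list `C["i"]` is 4 and `C["m"]` is 3. A node without any links would not appear in `C`, so isolated nodes would have to be added separately.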

3.1 Random networks

Random networks were first studied in the 1950s by the two mathematicians Paul Erdős and Alfréd Rényi [18], [19]. They studied networks created by randomly distributing a given number of links between a given number of nodes. Later, random networks have been studied where the network is defined by a certain number of nodes, N, and a certain probability, p, of connecting two nodes. This means that for each of the $\frac{1}{2}N(N-1)$ pairs of nodes a link is made with the given probability, p, which can for example be defined as the desired average number of links, s, divided by the maximal number of links:

$$ p = \frac{s}{\frac{1}{2}N(N-1)} \qquad (3.1) $$

For this type of network the connectivities are binomially distributed around the average, meaning that all nodes have similar connectivities. This can be seen by considering the (N-1) possible links a given node can have. For each possibility the probability of a link is p. The probability that links are found in c of the (N-1) places is then:

$$ P(c) = \binom{N-1}{c} p^{c} (1-p)^{N-1-c} \qquad (3.2) $$

$$ \langle c \rangle = p\,(N-1) \qquad (3.3) $$

$$ \sigma(c) = \sqrt{(N-1)\,p\,(1-p)} \qquad (3.4) $$

where $\binom{N-1}{c}$ is the binomial coefficient, $\langle c \rangle$ is the average connectivity and $\sigma(c)$ is the standard deviation of the connectivities. We also see that the total number of links is binomially distributed with an average, s, given by (3.1). In random networks we can talk about the network having a scale, for example the average connectivity, which depends on the number of nodes, N, and links, s. For a large number of nodes, N, the binomial connectivity distribution of these random networks will be close to a normal distribution, see figure 3.2, or, when p is small, to a Poisson distribution.

3.1.1 Naturally occurring networks

From the study of networks appearing in nature, such as the internet, food webs, phone-call networks and protein interaction networks, it becomes clear from the connectivity distribution that these are not random networks [19].
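As a reference point for the comparison with natural networks, the binomial statistics of the random networks in section 3.1 (equations 3.3 and 3.4) can be checked numerically. The sketch below links every pair of nodes with probability p and compares the sampled connectivities with the predictions. The values of N and p and the seed are illustrative choices, not those of figure 3.2.

```python
import math
import random
from itertools import combinations


def random_network(n_nodes, p, rng):
    """Connectivity of each node when every pair of nodes is linked
    independently with probability p."""
    connectivity = [0] * n_nodes
    for a, b in combinations(range(n_nodes), 2):
        if rng.random() < p:
            connectivity[a] += 1
            connectivity[b] += 1
    return connectivity


N, p = 400, 0.05            # illustrative values
rng = random.Random(1)      # fixed seed so the run is reproducible
c = random_network(N, p, rng)

mean_c = sum(c) / N
std_c = math.sqrt(sum((x - mean_c) ** 2 for x in c) / N)

expected_mean = p * (N - 1)                       # equation 3.3
expected_std = math.sqrt((N - 1) * p * (1 - p))   # equation 3.4
```

For these values the predictions are a mean of 19.95 and a standard deviation of about 4.35, and the sampled `mean_c` and `std_c` should lie close to them. The narrow spread is what is meant by all nodes having similar connectivities.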

Figure 3.2: The connectivity distribution of a random network (red dots), plotted as the fraction of nodes with connectivity c against c. The distribution has the shape of a Gaussian distribution (blue line). The network has 5 nodes, N=5, and a probability of connecting two nodes, p=.2.

The distribution is in most cases very wide, which for example means that there are several nodes with connectivity much higher than expected from a binomial distribution. Also, there are typically many nodes with low connectivity, all giving a large standard deviation of connectivity for the network. These networks often have a connectivity distribution somewhat in the form of a power law, see equation 3.5 and figure 3.3. A common term for this type of network is scale-free network.

$$ p(c) = \frac{1}{\zeta(\gamma)} \frac{1}{c^{\gamma}} \qquad (3.5) $$

$$ \zeta(\gamma) = \sum_{c=1}^{\infty} \frac{1}{c^{\gamma}} \qquad (3.6) $$

where p(c) is the fraction of nodes with connectivity c, and the normalization constant $\zeta(\gamma)$ is Riemann's zeta function. Notice that equation 3.5 is not defined for $c = 0$ or $\gamma \le 1$. Notice also that the mean value and standard deviation of the connectivity, c, from equation 3.5 are not defined for all values of $\gamma$. The mean value of c is found to be:

$$ \langle c \rangle = \frac{1}{\zeta(\gamma)} \sum_{c=1}^{\infty} \frac{c}{c^{\gamma}} = \frac{\zeta(\gamma-1)}{\zeta(\gamma)} \qquad (3.7) $$

i.e. $\langle c \rangle$ is only defined for $\gamma > 2$.
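Equation 3.7 can be evaluated numerically to see how the mean connectivity grows as γ approaches 2. The sketch below approximates Riemann's zeta function by a partial sum plus an integral estimate of the tail. The helper names `zeta` and `mean_connectivity` and the number of terms are choices made here for illustration.

```python
def zeta(s, terms=100_000):
    """Approximate Riemann's zeta function for s > 1 by a partial sum
    plus an integral estimate of the remaining tail."""
    if s <= 1:
        raise ValueError("zeta(s) diverges for s <= 1")
    partial = sum(n ** -s for n in range(1, terms + 1))
    # Tail beyond `terms`: integral of x**-s minus half of the last term.
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s
    return partial + tail


def mean_connectivity(gamma):
    """Mean connectivity of the power law, equation 3.7 (needs gamma > 2)."""
    if gamma <= 2:
        raise ValueError("the mean connectivity diverges for gamma <= 2")
    return zeta(gamma - 1) / zeta(gamma)


mean_3 = mean_connectivity(3.0)     # zeta(2) / zeta(3), about 1.368
mean_2_2 = mean_connectivity(2.2)   # larger, since gamma is closer to 2
```

For γ = 3 the mean is about 1.37, and it increases without bound as γ decreases towards 2, where the sum in the numerator of equation 3.7 no longer converges.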