An Application of Integer Linear Programming to Haplotyping Inference by Parsimony Problem


Università degli Studi Roma Tre
Dottorato di Ricerca in Informatica e Automazione, XVIII Ciclo, 2005
An Application of Integer Linear Programming to Haplotyping Inference by Parsimony Problem
Alessandra Godi


Università degli Studi Roma Tre
Dottorato di Ricerca in Informatica e Automazione, XVIII Ciclo
Alessandra Godi
An Application of Integer Linear Programming to Haplotyping Inference by Parsimony Problem
Advisor: Dr. Paola Bertolazzi
Reviewers: Prof. Martine Labbé

Author's address: Alessandra Godi, Istituto di Analisi dei Sistemi ed Informatica Antonio Ruberti - CNR, Viale Manzoni, 30 - Roma, Italy. godi@iasi.cnr.it www:

Contents

1 INTRODUCTION

2 BIOLOGICAL BACKGROUND
DNA and RNA
Genes, chromosomes, haplotypes, genotypes and SNPs
The importance of SNPs
Haplotypes and genotypes in disease association studies
Technical methods to obtain genotypes

3 AN OVERVIEW ON HAPLOTYPING INFERENCE
The Clark's Rule
The Phylogeny (or Coalescent) Haplotyping Problem
Statistical models and softwares
Maximization Likelihood by Expectation Maximization
Bayesian Inference Methods
Statistical software tools
The Inference Haplotyping in Pedigree
The Haplotyping Inference by Parsimony problem

4 SOLVING HIP USING A NEW HEURISTIC: COLLHAPS
The COLLHAPS Algorithm
The collapse rule
The preprocessing
The heuristic sequence of collapse steps
Haplotype set reduction
Precollapsing
Postprocessing: removing residual variables
Performance measures
Experimental Results

5 EXISTING ILP FORMULATION FOR HIP
An exponential formulation
Complete and reduce model
An inclusion/exclusion strategy to count the variables of Gusfield's models
Experimental results
A polynomial formulation
Branch-and-cut and experimental results of the polynomial formulation
Conclusion about the linear formulations
HIP problem is APX-hard

6 SOLVING HIP USING EXPONENTIAL FORMULATIONS
Polyhedral study of Gusfield's formulation
General polyhedral theory
Facets characterization for the HIP problem
A Branch-and-Price algorithm for HIP
The Branch-and-Price Algorithm
Implementation Issues of B&P
Computational Experience
A new exponential formulation for the HIP problem
Basic properties of the set-covering problem
Characterization of some SC facets and valid inequalities for the HIP problem

7 SOLVING HIP USING A NEW POLYNOMIAL FORMULATION
The basic model as a minimum problem
Turning P min into a maximization problem
Strengthening of formulation
Computational Experience

8 CONCLUSIONS AND FUTURE WORKS

Acknowledgements

I wish to thank all those people who taught me, listened to me, accompanied me, inspired me and distracted me during these three years of PhD. This time has passed very quickly. I have learned so many interesting things from so many interesting people, and I have had many opportunities to travel to interesting places to attend conferences and workshops. There are many who deserve to be thanked for their part in making my period of study a pleasant time. Here I can only mention a few of them. In particular I owe my gratitude to Dr. Paola Bertolazzi, my advisor, who introduced me to the field of computational biology and has guided me through this work, being the ultimate guide one can possibly hope for. I really wish to thank Prof. Martine Labbé for hosting me at Université Libre de Bruxelles for four months and for a lot of inspiring discussions while I was there. I had the opportunity to learn many things from her and to improve my thesis. It has been real fun working with her. Then I would like to thank Dr. Giovanni Rinaldi, the director of Istituto di Analisi dei Sistemi ed Informatica - A. Ruberti, for his advice during this experience and for hosting me in his institute, which represents my second family. Thanks to Prof. Fernando Nicolò for being a constant guide in the computer science research group of Università di Roma Tre. I wish to thank Prof. Giuseppe Lancia, who introduced me to the problem addressed in this thesis: he is the Dr. Dolittle of the sciences, speaking computer science, mathematics, biology and statistics fluently. Working with him has been really inspiring. Thanks to Dr. Leonardo Tininini for being my office-mate for the last two years and co-author of the heuristic included in this thesis. He has been creative, encouraging and trusting. I am also grateful to Dr. Claudio Gentile and Dr. Paolo Ventura for their help and patience in answering several questions. Special thanks to my good friends and colleagues Mara and Marta, who accompanied me through my PhD, for their friendship and help in good and bad times.

I also thank all the friends who distracted me from my PhD (lists may overlap): Luca ("come siamo fortunati!"), Mara, Marta; and all the "IASI Fellowship", particularly Anna, Barbara, Cristina, Gabriella, Guido, Leonardo, Mariagrazia, and then Mauro (our Frodo Baggins, or definitely better, our Mandrake). Guys, it was pure fun! The period I spent in Bruxelles was a special time for me. I met a lot of interesting people, from my nice flat-mate Jennifer to all the friends at ULB. They helped me in every situation, especially when I was homesick. Thank you! Thanks to mum and dad for all the help and every kind of support. They have the passion for life of 10 people, the humility of 100 people, and the generosity of 1000 people, and a goodness and kindness that only few people have; I am proud to be their daughter. I thank the rest of my family, especially my brothers Gianmatteo and Francesco, who never took my work seriously :-) but gave me constant encouragement. And last, but not least, I wish to thank my beloved Davide, for making everything better and whose love never ceases to fill me with amazement. In my optimism I dedicate this thesis to him as a story of life and progress, as a small tribute to the many who will follow in the never-ending chain of science.

Chapter 1 INTRODUCTION

The work presented in this thesis concerns mathematical programming techniques for a particular problem with biological relevance, the Haplotyping Inference by Parsimony (HIP) problem. This work is part of an interdisciplinary area called computational biology, a field concerned with the development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological systems. Computational biology spans several classical areas such as biology, chemistry, physics, statistics and computer science, and the activities in the area are numerous. From a computational point of view, the activities range from algorithmic theory focusing on problems with biological relevance, via the construction of computational tools and mathematical models for specific biological problems, to experimental work where a laboratory with test tubes and microscopes is replaced by a fast computer and a hard disk full of computational tools written to analyze huge amounts of biological data in order to prove or disprove a certain hypothesis. The area of computational biology is also, erroneously, referred to as bioinformatics. In fact, computational biology refers to activities which mainly focus on constructing models and algorithms that address problems with biological relevance, while bioinformatics refers to activities which mainly focus on constructing and using computational tools to analyze available biological data. This distinction serves here only to clarify the main focus of the work: this thesis lies in the computational biology field, because its aim is to analyze some existing integer programming formulations and to propose new models, with the associated polyhedral studies, and new (exact and approximate) algorithms that address the HIP problem.

This problem is motivated by genetic differences among individuals of a species. Most plant and animal cells are diploid, i.e., they have two similar, but not identical, versions (or copies)

of each chromosome (homologous chromosomes). In general, individuals of the same species are genetically very similar; in humans, for instance, the DNA of two random people is about 99.9% identical. Individual uniqueness lies in the small number of positions where single-base DNA differences occur. A SNP (Single Nucleotide Polymorphism) is thus a single base pair position in genomic DNA at which different nucleotide variants (alleles) exist. In humans, SNPs are almost always biallelic, that is, only two of the four possible nucleotides occur at each site. The knowledge of these two variants is referred to as the phase of the SNP. The sequence of alleles along a chromosome copy is called a haplotype. The genotype, instead, gives the pair of bases found at each SNP site across the two chromosome copies, but it does not specify which base (i.e., which allele) occurs on which chromosome. For a given set of SNPs, an individual possesses two haplotypes, one inherited from the paternal genome and the other from the maternal genome, and exactly one genotype associated with the chromosome pair. The inheritance process is complicated by a phenomenon known as recombination, in which portions of the paternal and maternal chromosomes are exchanged. A SNP site where both haplotypes have the same variant (nucleotide) is called a homozygous site; a SNP site where the haplotypes have different variants is called a heterozygous site. Thus, in genotype data the nucleotide variants at homozygous sites are known, but which variants at heterozygous sites came from the same chromosome copy is unknown; in haplotype data the alleles are completely known. The determination of the haplotypes within a population is essential.
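The relation between haplotypes and genotypes described above can be illustrated with a small sketch. It assumes a string encoding that is not taken from the text: the two alleles of a biallelic SNP are written '0' and '1', and a genotype site is '0' or '1' when homozygous and '2' when heterozygous. The function names are hypothetical.

```python
from itertools import product


def genotype(h1, h2):
    """Combine two haplotypes (strings over '0'/'1') into one genotype.

    Assumed encoding: '0' or '1' marks a homozygous site carrying that
    allele on both copies, '2' marks a heterozygous site.
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must cover the same SNP sites")
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def resolutions(g):
    """Enumerate every unordered haplotype pair that explains genotype g."""
    het = [i for i, s in enumerate(g) if s == "2"]
    pairs = set()
    # Each heterozygous site can place allele '0' on either chromosome copy.
    for bits in product("01", repeat=len(het)):
        h1, h2 = list(g), list(g)
        for i, b in zip(het, bits):
            h1[i] = b
            h2[i] = "1" if b == "0" else "0"
        # frozenset: the two copies are not distinguished from each other.
        pairs.add(frozenset(("".join(h1), "".join(h2))))
    return pairs
```

Under this encoding a genotype with k heterozygous sites (k at least 1) is explained by 2^(k-1) unordered haplotype pairs, which is why genotype data alone does not determine the haplotypes.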
Haplotypes are necessary, for instance, in evolutionary studies, to extract the information needed to detect diseases and to reduce the number of tests to be carried out, in the discovery of a functional gene, or in the study of an altered response of an organism to a particular therapy. In human pharmacogenetics, haplotype SNPs seem to explain why people react differently to different types or amounts of drugs: since SNPs can affect the structure and function of proteins and enzymes, they can influence how efficiently a drug is absorbed and metabolized. Unfortunately, experimental techniques to obtain the haplotypes of an individual are very expensive, time consuming and labor intensive, whereas the genotype of an individual can be determined quickly and easily. Computational techniques, combined with specific biological models, offer a way of deriving the haplotypes from the genotype data; this task is called Haplotyping or Haplotype Inference (HI). The HI problem has been studied since the 1990s, when a wide variety of techniques (statistical and combinatorial methods) were proposed. Statistical approaches try to iteratively determine the haplotype frequencies, and then infer the haplotype pairs. In the methods based on expectation-maximization (EM) the haplotype frequency estimates are iteratively updated, starting from

an initial guess and trying to maximize a likelihood function. Other statistical methods are based on Bayesian inference and on the adoption of a more or less biologically based prior, so as to get more accurate estimates of the haplotype frequencies and consequently of the genotype reconstructions. Combinatorial methods are mostly inspired by Clark's inference rule, based on the principle that, given a genotype and a haplotype compatible with this genotype, the other haplotype can be inferred simply by taking the difference between the genotype and the given haplotype. Clark's rule was applied directly, giving rise to the first algorithm for haplotyping. The algorithm has good accuracy but two major drawbacks: it may be unable even to start, or it may resolve only a subset of the genotypes, leaving the others unsolved. A second step is due to Gusfield, who first used integer programming for the haplotyping problem and formulated two different optimization problems. The first looks for the best sequence of applications of Clark's rule, so as to solve the maximum number of genotypes. The second is the formulation of the inference problem with the requirement that the number of inferred haplotypes be minimum (the HIP problem, i.e., the problem addressed in this thesis). The parsimony principle does not state, as is sometimes erroneously assumed, that haplotypes with high frequency in a population should be preferred in a haplotype reconstruction (in fact, parsimony is affected by haplotype frequencies only in the weakest sense); rather, it means that the haplotypes of a population cannot be very different from each other, as supported by real data and by the phylogenetic history of haplotype trees.

This thesis is organized as follows. Chapter 2 introduces the basic concepts and definitions from biology that are fundamental for understanding the biological setting of the problem. Chapter 3 describes the general Haplotyping Inference problem, considering several variants and complexity aspects, and gives an overview of the existing solution methods for the HI problem in general and for the Haplotyping Inference by Parsimony problem in particular. The first original result for the HIP is contained in Chapter 4: COLLHAPS, a rule-based heuristic which uses a collapsing rule aiming at the minimum number of haplotypes. Then, Chapter 5 introduces the existing integer linear programming formulations for the HIP (exponential and polynomial models). Chapter 6 and Chapter 7 describe two different approaches for solving the HIP problem exactly: the first is based on the use of two exponential formulations (one is introduced in Chapter 5, the other is newly obtained from the previous one by a Fourier-Motzkin procedure), a Branch-and-Price algorithm and some polyhedral results (new facets and new valid inequalities); the second approach is based on a new polynomial formulation

which is developed from a basic model up to a strengthened one through the use of clique inequalities, symmetry-breaking inequalities and a dominance study. Finally, Chapter 8 draws the conclusions and presents some ideas for future work on the HIP problem.
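Clark's inference rule, mentioned in the introduction, can be sketched in a few lines. This is a minimal illustration and not the implementation studied in the thesis. It assumes the alleles of each biallelic SNP are encoded as '0' and '1' and that a genotype site is '0', '1' (homozygous) or '2' (heterozygous); the function names and the processing order are hypothetical choices.

```python
def complement(g, h):
    """Clark's rule: return the haplotype that, together with h, explains g.

    Returns None when h is not compatible with genotype g.
    Assumes g and h cover the same SNP sites.
    """
    out = []
    for s, a in zip(g, h):
        if s == "2":                      # heterozygous: take the other allele
            out.append("1" if a == "0" else "0")
        elif s == a:                      # homozygous and consistent with h
            out.append(a)
        else:                             # homozygous but h disagrees
            return None
    return "".join(out)


def clark(genotypes):
    """Greedy application of Clark's rule in the order given.

    Returns (resolved, orphans): a dict genotype -> haplotype pair, and the
    list of genotypes that could not be resolved.
    """
    known, resolved, pending = [], {}, []
    for g in genotypes:
        if g.count("2") <= 1:             # unambiguous: at most one het site
            pair = (g.replace("2", "0"), g.replace("2", "1"))
            resolved[g] = pair
            known.extend(h for h in pair if h not in known)
        else:
            pending.append(g)
    progress = True
    while pending and progress:
        progress = False
        for g in list(pending):
            for h in known:
                h2 = complement(g, h)
                if h2 is not None:
                    resolved[g] = (h, h2)
                    if h2 not in known:
                        known.append(h2)
                    pending.remove(g)
                    progress = True
                    break
    return resolved, pending
```

On the input `["0010", "0212", "2222", "2200"]` the sketch resolves the first three genotypes and leaves `"2200"` as an orphan; on `["22"]` it cannot start at all, since no genotype is unambiguous. These are the two drawbacks noted in the text, and the outcome generally depends on the order in which the rule is applied.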

Chapter 2 BIOLOGICAL BACKGROUND

The genetic material of each living organism (plant or animal, bacterium or virus) possesses sequences of basic building blocks (usually DNA, sometimes RNA) that are uniquely and specifically present only in its own species. Indeed, complex organisms, such as humans, possess DNA sequences that are uniquely and specifically present only in particular individuals. These unique variations make it possible to trace genetic material back to its origin, identifying with precision at least what species of organism it came from and often which particular member of that species, or to isolate DNA regions which carry particular pathologies. This chapter describes some fundamental concepts concerning the structure of the genome, chromosomes, genes, haplotypes, genotypes and polymorphisms, which are relevant to motivate the topic of this thesis.

2.1 DNA and RNA

Most questions in computational biology are related to molecular or evolutionary biology and focus on analyzing and comparing the composition of the key biomolecules DNA, RNA and proteins, which together constitute the fundamental building blocks of organisms. The genetic material of an organism is the system which guides and determines the functions and characteristics that the organism needs for the complex task of living. Questions about how the genetic material is stored and used by an organism have been studied intensively. This has revealed that the biomolecules DNA, RNA and proteins are the main players of the game, and thus important components to model in any method for comparing and analyzing the genetic material of organisms.

Figure 2.1: An abstract illustration of a segment of a DNA or RNA molecule. It shows that the molecule consists of a backbone of sugars linked together by phosphates, with a base side chain attached to each sugar. The two ends of the backbone are conventionally called the 5' end and the 3' end.

The DNA (deoxyribonucleic acid) molecule was discovered in 1869 during a study of the chemistry of white blood cells. The very similar RNA (ribonucleic acid) molecule was discovered a few years later. DNA and RNA are chainlike molecules, called polymers, that consist of nucleotides linked together by phosphate bonds. A nucleotide consists of a phosphoric acid, a pentose sugar and an amine base, or just base. In DNA the pentose sugar is 2-deoxyribose and the base is either adenine (A), guanine (G), cytosine (C), or thymine (T). In RNA the pentose sugar is ribose instead of 2-deoxyribose, and the base thymine is replaced by the very similar base uracil (U). Two of these bases (A and G) belong to the group of purines, and the others (C and T, or U for RNA) to the pyrimidines. A DNA or RNA molecule is a uniform backbone of sugars linked together by the phosphates, with side chains of bases attached to each sugar. This implies that a DNA or RNA molecule can be specified uniquely by listing the sequence of base side chains starting from one end of the sequence of nucleotides. The two ends of a nucleotide sequence are conventionally denoted as the 5' end and the 3' end. These names refer to the orientation of the sugars along the backbone. It is common to start the listing of the base side chains from the 5' end of the sequence (see Fig. 2.1). Since there are only four possible base side chains, the listing can be described as a string over a four-letter alphabet.

Proteins are polymers that consist of amino acids linked together by peptide bonds. An amino acid consists of a central carbon atom, an amino group, a carboxyl group and a side chain.
The side chain determines the type of the amino acid. As illustrated in Figure 2.2, chains of amino acids are formed by peptide bonds between the nitrogen atom in the amino group of one amino acid and the carbon atom in the carboxyl group of another amino acid. A protein thus consists of a backbone of the common structure shared by all amino acids, with the different side chains attached to the central carbon atoms. Even though the number of possible amino acid types is in principle unlimited, only twenty of these types are encountered in proteins. As for DNA and RNA molecules, it is thus possible to uniquely specify a protein by listing the sequence of side chains. Since there are only twenty possible side chains, the listing can be described as a string over a twenty-letter alphabet.

Figure 2.2: An abstract illustration of a segment of a protein. It shows that the molecule consists of a backbone of elements shared between the amino acids, with a variable side chain attached to the central carbon atom in each amino acid. The peptide bonds linking the amino acids are indicated by gray lines.

The chemical structure of DNA, RNA and protein molecules, which makes it possible to specify them uniquely by listing the sequence of side chains, also called the sequence of residues, is the reason why these biomolecules are often referred to as biological sequences. The correspondence between biological sequences and strings over finite alphabets has many modeling advantages, most prominently its simplicity. For example, a DNA sequence corresponds to a string over the alphabet {A, G, C, T}, where each character represents one of the four possible nucleotides. Similarly, an RNA sequence corresponds to a string over the alphabet {A, G, C, U}. The relevance of modeling biomolecules as strings over finite alphabets follows from the way the genetic material of an organism is stored and used. Probably one of the most amazing discoveries of the last century is that the entire genetic material of an organism, called its genome, is (with few exceptions) stored in two complementary DNA sequences that wind around each other in a helix. The genome is the entire set of hereditary instructions for building, running and maintaining an organism and for passing life on to the next generation. Two DNA sequences are complementary if one is the other read backwards with the complementary bases adenine/thymine and guanine/cytosine interchanged. For example, ATTCGC and GCGAAT are complementary because ATTCGC with A and T interchanged and G and C interchanged becomes TAAGCG, which is GCGAAT read backwards.
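The definition of complementary sequences given above translates directly into a string operation. The following is a minimal sketch; the names `PAIRS`, `reverse_complement` and `are_complementary` are illustrative and not taken from the thesis.

```python
# Complementary base pairs: adenine/thymine and guanine/cytosine.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Interchange complementary bases and read the result backwards."""
    return "".join(PAIRS[base] for base in reversed(seq))


def are_complementary(s, t):
    """True if t is s read backwards with complementary bases interchanged."""
    return reverse_complement(s) == t


print(reverse_complement("ATTCGC"))  # prints GCGAAT
```

The printed result matches the ATTCGC / GCGAAT example from the text.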
Two complementary bases can form strong interactions, called base pairings, by hydrogen bonds. Hence, two complementary DNA sequences, placed against each other such that the head (the 5' end) of one sequence is opposite the tail (the 3' end) of the other, are glued together by base pairings between opposing complementary bases. The result is a double-stranded DNA molecule with the famous double helix structure described by Watson and Crick in [84] (see Fig. 2.3). Genome size is usually stated as the total number of base pairs (bp); the human genome contains roughly 3 billion bp [75] (see Fig. 2.4). Despite the complex three-dimensional structure of this molecule, the genetic material it stores depends only on the sequence of nucleotides and can thus be described without loss of information as a string over the alphabet {A, G, C, T}.

Figure 2.3: The base pairing is thus restricted. This restriction is essential when the DNA is being copied: the DNA helix is first unzipped into two long stretches of sugar-phosphate backbone with a line of free bases sticking up from it, like the teeth of a comb. Each half will then be the template for a new, complementary strand. Biological machines inside the cell put the corresponding free bases onto the split molecule and also proof-read the result to find and correct any mistakes. After the doubling, this gives rise to two exact copies of the original DNA molecule.

Figure 2.4: Comparison of largest known DNA sequences.

The genome of an organism contains the templates (see footnote 1) of all the molecules necessary for the organism to live. A region of the genome that encodes a single molecule is called a gene (see next section). When a particular molecule is needed by the organism, the corresponding gene is transcribed to an RNA sequence. The transcribed RNA sequence is complementary to the complementary DNA sequence of the gene, and thus (except for thymine being replaced by uracil) identical to the gene. Sometimes this RNA sequence is the molecule needed by the organism, but most often it is only intended as an intermediate template for a protein. In eukaryotes (higher-order organisms such as humans) a gene usually consists of coding parts, called exons, and non-coding parts, called introns. By removing the introns and concatenating the exons, the intermediate template is turned into a sequence of messenger RNA that encodes the protein (see Fig. 2.5). The messenger RNA is translated to a protein by reading it three nucleotides at a time. Each triplet of nucleotides, called a codon, uniquely describes an amino acid, which is added to the sequence of amino acids being generated. The correspondence between codons and amino acids is given by the almost universal genetic code shown in Figure 2.6.

2.2 Genes, chromosomes, haplotypes, genotypes and SNPs

A small piece of the genome that codes for a protein is called a gene. Different genes determine the different characteristics, or traits, of an organism. In the simplest terms, one gene might determine the color of a bird's feathers, while another gene would determine the shape of its beak. The number of genes in the genome varies from species to species. More complex organisms tend to have more genes.
Bacteria have several hundred to several thousand genes. Estimates of the number of human genes, by contrast, range from 25,000 to 30,000. Most gene products in the human genome are identical in all individuals. Genes are found on chromosomes and are made of DNA. A chromosome is a package containing a chunk of a genome, that is, it contains some of an organism's genes. The important word here is "package": chromosomes help a cell to keep a large amount of genetic information neat, organized, and compact. Chromosomes are made of DNA and protein. Most living things have chromosomes that are linear and are kept in the nucleus, a

Footnote 1: A template is a single DNA strand that serves as a pattern for building a new second strand.

Figure 2.5: RNA synthesis and processing.

Figure 2.6: The genetic code that describes how the 64 possible triplets of nucleotides are translated to amino acids. The table is read such that, for example, the triplet AUG encodes the amino acid Met. The three triplets UAA, UAG and UGA are termination codons that signal the end of a translation of triplets.
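The transcription and translation steps described in Section 2.1, together with the genetic code of Figure 2.6, can be sketched as follows. The codon table below is deliberately partial: it contains only the entries named in the caption (AUG and the three termination codons) plus a handful of standard codons chosen for illustration. The function names are hypothetical, and splicing of introns is not modeled.

```python
# Partial codon table (illustrative subset of the 64 codons).
# None marks a termination codon.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly",
    "GCU": "Ala", "AAA": "Lys", "UGG": "Trp",
    "UAA": None, "UAG": None, "UGA": None,
}


def transcribe(gene):
    """RNA copy of a gene: identical except thymine is replaced by uracil."""
    return gene.replace("T", "U")


def translate(mrna):
    """Read the messenger RNA three nucleotides at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # termination codon: stop translating
            break
        protein.append(amino_acid)
    return protein


print(translate(transcribe("ATGTTTGGCTAAAAA")))  # prints ['Met', 'Phe', 'Gly']
```

A complete implementation would use the full 64-entry table; the partial table keeps the sketch short while still showing the role of the termination codons.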

sphere-shaped sac within the cell. In a few very simple forms of life, such as bacteria, the entire genome is packaged into a single chromosome. But other organisms, with genomes a thousand or even a million times larger than those of bacteria, divide their hereditary material among a number of different chromosomes. Exactly how many chromosomes depends on the species. A mosquito has 6 chromosomes, a pea plant has 14, a sunflower 34, a human being 46, and a dog 78. In the case of humans, the 46 chromosomes are divided into 23 pairs of corresponding (homologous) chromosomes. So, the structure is clear: the genome contains genes, which are packaged in chromosomes and affect specific characteristics of the organism.

A location on a chromosome is called a locus (pl. loci). The locus can be either a single nucleotide or a string of nucleotides. The different variants that are present in the population at a specific locus (or loci) are called alleles; if there are only two variants, the locus is biallelic. Without loss of generality, let us consider only biallelic loci. When an individual inherits DNA from his or her parents, one copy of each chromosome is inherited from each parent. This means that for every individual there are two alleles at each locus. The combined outcome of the two alleles at a locus is called a genotype. If a genotype consists of two copies of the same allele it is homozygous, and otherwise heterozygous. Traits resulting from genotypes are called phenotypes; they can be qualitative phenotypes (e.g., healthy/diseased) or quantitative phenotypes (e.g., length, colour). When an individual inherits a chromosome from a parent, it is not one of the two parental copies: each parental chromosome recombines on average about 1.5 times [75], so that the inherited chromosome is a patchwork of the parental chromosomes (recombination). The location at which parental chromosomes recombine differs from generation to generation, so that after several generations only small fragments of the original chromosomes remain and only bases that are located close together are inherited together. Recombination does not occur uniformly over chromosomes, and recombination frequency, rather than physical distance, is used to describe distances between loci. Dependence between loci is called Linkage Disequilibrium (LD): if there is a low probability of a recombination between two or more loci, then they have a high probability of being inherited together by successive generations, and the loci are said to be in linkage disequilibrium. These positions are selected at genomic sites which have experimentally been confirmed to be polymorphic, that is, where there exists variation between individuals. Areas that are segregating (that is, close to each other), but not necessarily coding for the gene of interest, are called genetic markers. When searching for a gene, the hope is that markers are either in a coding part of the gene or in linkage disequilibrium with the gene. Commonly used genetic markers are Single Nucleotide Polymorphisms (SNPs, pronounced "snips"). A SNP is a single-base mutation in a DNA sequence that occurs when a single nucleotide

(A, T, C, or G) in the genome sequence is altered. For example, a SNP might change the DNA sequence AAGGCTAA to ACGGCTAA. For a variation to be considered a SNP, it must occur in at least 1% of the population. SNPs, which make up about 90% of all human genetic variation, occur every 100 to 300 bases along the 3-billion-base human genome. Two of every three SNPs involve the replacement of cytosine (C) with thymine (T). SNPs can occur in both coding (gene) and noncoding regions of the genome. Many SNPs have no effect on cell function, but scientists believe others could predispose people to disease or influence their response to a drug (see next section for more information). The mapping of SNPs has been in rapid progress, and currently approximately 2.7 million SNPs have been mapped [10], most having been discovered in recent years. These kinds of polymorphisms tend to be rare events (in some cases, unique events in the history of the human race), with mutation rates estimated at around 175 total SNP mutations per individual per generation, or per base per generation [60].

Combinations of alleles from different loci which reside on the same copy of a chromosome are called haplotypes. Typically genetic markers are measured one at a time, so that it is not always possible to infer haplotype phase with certainty, that is, which alleles belong together on the same chromosome. When the phase is unknown, the estimation of haplotypes can be viewed as a missing data problem. It appears that the variation between individuals is limited to an extremely small percentage of the overall genome. In fact, approximately 99.9% of our DNA sequence is conserved, leaving only the remaining 0.1% of the human genome to account for the entire diversity of the human race. These variations consist of insertions, deletions and SNPs within the genome; there is intense interest in identifying the SNPs, estimated to number in the millions, and in determining their role in phenotypic variation. When analyzing multi-locus genotypes, if it is impossible to determine which chromosome of a pair a specific allele came from, the data are said to be unphased. The problem of determining which alleles at each locus of a set of linked diploid loci are physically located on the same chromosome is known as haplotyping or determining phase (see Fig. 2.7). For example, for a set of three linked loci, we have 3 × 2 = 6 alleles in the unphased genotype, yielding a maximum of 2^3 = 8 possible assignments of alleles to specific chromosomes, or 2^3 / 2 = 4 possible phases when not distinguishing between the chromosomes. Depending on the allele values, some of these phases may be identical to each other, due to homozygosity (where the two alleles at a locus are identical). As opposed to haplotypes, the genotype gives the bases at each SNP for both copies of the chromosome, but loses the information as to the chromosome on which each base appears. Unfortunately, many sequencing techniques provide the genotypes and not the haplotypes (see last section of the chapter). Haplotype analysis has become increasingly common in genetic studies

22 18 CHAPTER 2. BIOLOGICAL BACKGROUND Figure 2.7: (a) The problem of haplotyping, or determining which alleles in a diploid genotype come from the same chromosome; (b) Determining which chromosome came from which parent. of human disease. However, many of these methods rely on phase information, that is, the haplotype information vs. the genotype information. Phase can be inferred by genotyping family members of each subject, but this has its downsides because of logistic and budget issues. Alternatively, laboratory techniques (such as PCR 2 ) have been also used but these are often costly and are not suitable for large scale polymorphism screening. As an alternative to those technologies, many computational methods have been developed for phasing the genotypes (see Chapter 3). 2.3 The importance of SNPs Although more than 99% of human DNA sequences are the same across the population, variations in DNA sequence can have a considerable impact on how humans respond to disease, environmental insults (such as bacteria, viruses, toxins, and chemicals), drugs and other therapies. This makes SNPs of great value for biomedical research and for developing pharmaceutical products or medical diagnostics. In fact, scientists believe SNP maps will help them in identifying the multiple genes associated with such complex diseases, in partic- 2 The Polymerase Chain Reaction is a method for the rapid copying of DNA. The principle itself is very simple: the first step involves the copying of a long, but very specific, DNA fragment - this forms the basis for all subsequent steps. Smaller fragments of a standard length are then synthesized from the DNA copies and then replicated millions of times over.

23 2.3. THE IMPORTANCE OF SNPS 19 ular because their evolutionary stability (not changing much from generation to generation) makes them easier to follow in population studies. Associations between genes and SNPs are difficult to establish with conventional gene-hunting methods because a single altered gene may make only a small contribution to the disease. Genes are the basic physical and functional units of heredity. They basically are specific sequences of bases that encode instructions on how to make proteins. When genes are altered so that the encoded proteins are unable to carry out their normal functions, genetic diseases can result. In the previous section we have seen that genes are carried on chromosomes: the maternal and paternal chromosomes pair up and exchange segments of DNA in a process called recombination. After recombination (which can be interested also exchange of parts of a given gene), the chromosomes contain a mixture of alleles from each parent. Recombination will occur frequently between DNA sequences that are a long way apart but only rarely between sequences that are close together. Therefore, by measuring the frequency of recombination between the disease gene and other DNA sequences whose location is already known, the position of the disease gene can be established. A consequence of recombination is that blocks of sequences on the same chromosome tend to be inherited together (linkage disequilibrium). Several groups worked to find SNPs and ultimately create SNP maps of the human genome. Among these groups are the U.S. Human Genome Project (HGP) and a large group of pharmaceutical companies called the SNP Consortium or TSC project. Their aims were to develop technologies for rapid identification of SNPs, identify common variants in the coding regions of most identified genes and create public resources of DNA samples and cell lines. In the end, many more SNPs (1.8 million total) were discovered than planned originally. 
Now that the SNP discovery phase of the TSC project is essentially complete, the emphasis has shifted to studying SNPs in populations. Various TSC member laboratories are genotyping (the term used for the process of identifying SNPs in population data) a subset of SNPs as part of the Allele Frequency Project. The goal of the TSC allele frequency/genotype project is to determine the frequency of certain SNPs in three major world populations. See the TSC Web site for more information [89]. Besides the TSC Web site, SNP data is also available from the dbSNP database (from the National Center for Biotechnology Information) [90] and HGVbase (Human Genome Variation Database) [91].

2.4 Haplotypes and genotypes in disease association studies

The aim of disease genetic association studies is to find or characterize relationships between genes and phenotypes in order to investigate the identities and functions of genes and their roles in the presence of diseases, responsiveness to drug therapies, and susceptibility to toxic side-effects. To interpret the results of an association study it is important to understand which mechanisms can lead to an association between a marker (SNP) and a phenotype: either the marker is causally related to the phenotype, or the marker is in linkage disequilibrium with a causal gene (see previous section). Typically the exact genomic location of the gene responsible for a disease is not known, and genetic markers, such as SNPs, are measured instead of the gene of interest. Hence, the first step in identifying a gene which causes a disease is to locate the chromosome which carries that gene. The position of the gene must then be pinpointed more precisely. This analysis is carried out by comparing phenotype distributions between persons with different genotypes: the keys in this hunt are the set of SNPs situated on the chromosome and the set of persons (actually, their genotypes) who are considered in relation to a qualitative phenotype (e.g., affected/not affected). By observing the patterns of alleles (the SNP values) it is possible to understand which SNP is closest to the gene that produces the disease, and it becomes easier to determine the position of that gene on the genome. Researchers who work to identify SNPs are discovering that, as the number of known SNPs increases, identifying the genotype and correlating it to the phenotype is becoming a huge task. Fortunately, nature may have made this process simpler than would be expected from the number of SNPs.
Recent research has shown that groups of SNPs are inherited together in a stretch of DNA, rather than being randomly segregated through genetic recombination. These groups of SNPs are called haplotype blocks or also just haplotypes: in the most general sense, as we have already explained, the haplotype is simply the genotype of a single chromosome or haploid set of chromosomes. One advantage of studying haplotypes is that they are more polymorphic than single marker loci; if the SNPs from which haplotypes are constructed are closely linked, then it may be easier to demonstrate an association between a particular region of the genome and a disease than by using single marker loci. Several recent studies showed that haplotypes, if used as genetic markers, have higher statistical power than individual markers [1]. In fact, haplotypes capture the local linkage disequilibrium information and may reflect the presence of additional, undetected mutation sites that are the underlying cause of the disease. Also, haplotypes may reflect two or more mutation sites which must act together to cause a disease, yet are harmless when present on separate chromosomes. In other words, haplotypes are expected to predict the genetic contributions to phenotypes more accurately than single SNP genotypes alone. Moreover, whether the SNPs contained in the haplotype are found in only one gene or in multiple genes in the sequence, it is believed that the haplotype provides the context in which those genes act.

A major difficulty in using haplotypes as genetic markers lies in determining the haplotype phase for individuals who are heterozygous for more than one marker. There are several approaches to overcome this difficulty. However, in order for an approach to be practical, it needs to meet the low-cost and high-throughput requirements. Only such approaches can potentially be used in studies using large samples and involving a large number of genetic markers.

All in all, both genotype and haplotype data are used in genetic studies. Haplotypes are often more informative. Unfortunately, current experimental methods for haplotype determination are technically complicated and cost-prohibitive [24]. In contrast, the genotype SNPs can be detected by using a variety of cheap technologies (see next section). After generating the genotypes of a statistically relevant number of individuals, it is possible to use computer algorithms to infer haplotypes in a process called resolving, phasing or haplotyping [14], [22], [32], [2], [21]. These inferred haplotypes typically have a greater than 90% accuracy. Let us note that a single genotype may be resolved by different, equally plausible haplotype pairs (see Fig. 2.8), but the joint inference of a set of genotypes may favor one haplotype pair over the others for each individual. Such inference is usually based on a model for the data.
Informally, most models rely on the observed phenomenon that over relatively short genomic regions, different human genotypes tend to share the same small set of haplotypes [65], [16]. We want to conclude the section with a remark on the International HapMap Project [92] which is conducting an ambitious study to generate haplotype maps based on the genotypes of hundreds of individuals, with the expectation that the resulting data will parse into a few general, common haplotypes. The results of this effort will become public domain, with the HapMap freely available to all researchers, both academic and commercial.
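The remark above that a single genotype may be resolved by several haplotype pairs can be made quantitative. The following short count is only a sketch in notation introduced here for illustration (k denotes the number of heterozygous SNPs of the genotype; the symbol is not used elsewhere in this chapter). Each heterozygous site can be assigned to the two chromosomes in two ways, while homozygous sites leave no choice.

```latex
% k = number of heterozygous SNPs in the genotype (notation assumed for this sketch)
\[
  \#\{\text{ordered assignments}\} = 2^{k},
  \qquad
  \#\{\text{unordered haplotype pairs}\} =
  \begin{cases}
    1       & \text{if } k = 0,\\
    2^{k-1} & \text{if } k \ge 1.
  \end{cases}
\]
% Three heterozygous loci: 2^3 = 8 assignments and 2^{3-1} = 4 phases.
% Genotype of Fig. 2.8 (heterozygous SNPs 2 and 5): 2^{2-1} = 2 haplotype pairs.
```

The count grows exponentially with k, which is why the joint inference over many genotypes, rather than the separate resolution of each one, is needed to single out a plausible pair.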

Figure 2.8: An example of 6 SNPs along two homologous chromosomes of an individual. (a) Individual's haplotypes. (b) Individual's genotype. Here the set of heterozygous SNPs would be {2, 5}. (c) Another potential haplotype pair giving rise to the same genotype. Note that only SNPs are presented here. Every two SNPs can be separated by several hundred monomorphic base pairs.

2.5 Technical methods to obtain genotypes

One of the aims of the Human Genome Project (see footnote 3) is the discovery of millions of DNA sequence variants in the human genome. The procedure of detecting SNPs is called genotyping. Since genotypes are the data of our problem, we want to give a general idea of the methodology for SNP genotyping in terms of the mechanisms of allelic discrimination and the detection modalities; we also describe a genotyping method currently in use. Genotyping methods are preferred to haplotyping ones because, in general, they possess the following attributes: (a) the assay is easily and quickly developed from sequence information; (b) the cost of assay development is low in terms of marker-specific reagents and time spent by expert personnel on optimization; (c) the assay is easily automated and requires minimal hands-on operation; (d) the data analysis is simple, with automated, accurate genotype definition; (e) the reaction format is flexible and scalable, capable of performing from a few hundred to a million assays per day; and (f) once optimized, the total assay cost per genotype (including equipment, reagents, and personnel) is low. Allelic discrimination detects different forms of the same gene that differ by a nucleotide substitution, insertion, or deletion. At the DNA level, we can say that allelic discrimination detects SNPs in a specific sequence.

Footnote 3: Begun formally in 1990, the Human Genome Project was a 13-year effort coordinated by the U.S. Department of Energy and the National Institutes of Health. The project originally was planned to last 15 years, but rapid technological advances accelerated the completion date to 2003. Project goals were to identify all the approximately 20,000-25,000 genes in human DNA, determine the sequences of the 3 billion chemical base pairs that make up human DNA, store this information in databases, improve tools for data analysis, transfer related technologies to the private sector, and address the ethical, legal, and social issues that may arise from the project.

Sequence-specific detection relies on four general mechanisms for allelic discrimination: allele-specific hybridization, allele-specific nucleotide incorporation, allele-specific oligonucleotide ligation, and allele-specific invasive cleavage. All four mechanisms are reliable, but each has its pros and cons.

For instance, with the hybridization approach, two allele-specific probes (see footnote 4) are designed to hybridize to the target sequence only when they match perfectly (see Fig. 2.9). Under optimized assay conditions, the one-base mismatch sufficiently destabilizes the hybridization to prevent the allelic probe from annealing (see footnote 5) to the target sequence. Because no enzymes are involved in allelic discrimination, hybridization is the simplest mechanism for genotyping. The challenge in ensuring robust allelic discrimination lies in the design of the probe. With ever more sophisticated probe design algorithms, allele-specific probes can be designed with a high success rate. When the allele-specific probes are immobilized on a solid support, labeled target DNA samples are captured, and the hybridization event is visualized with a fluorescence filter by detecting the label after the unbound targets are washed away. Knowing the location of the probe sequences on the solid support allows one to infer the genotype of the target DNA sample.

Footnote 4: A sequence of DNA or RNA, labeled or marked with a radioactive isotope, used to detect the presence of complementary nucleotide sequences by hybridization.

Footnote 5: Annealing, in biology, means for DNA or RNA to pair by hydrogen bonds to a complementary sequence, forming a double-stranded polynucleotide. The term is often used to describe the binding of a DNA probe, or the binding of a primer to a DNA strand during polymerase chain reaction.

A positive allelic discrimination reaction is detected by monitoring the light emitted by the products, measuring the mass of the products, or detecting a change in the electrical property when the products are formed. Numerous labels with various light-emitting properties have been synthesized and utilized in detection methods based on light detection or electrical detection. In general, only one label with ordinary properties is needed in genotyping methods where the products are separated or purified from the excess starting reagents. Monitoring light emission is the most widely used detection modality in genotyping, and there are many ways to do so. Luminescence, fluorescence, time-resolved fluorescence, fluorescence resonance energy transfer (FRET), and fluorescence polarization (FP) are useful properties of light utilized in a host of genotyping methods [48].
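The logic of allele-specific hybridization described in this section can be illustrated with a deliberately idealized toy model. The sketch below is not a description of any real assay: the sequences, the helper names (`reverse_complement`, `anneals`, `call_genotype`) and the all-or-nothing binding rule are assumptions made for illustration, and real hybridization depends on temperature, probe length and chemistry. It only shows how knowing which probe captured the sample lets one read off the genotype.

```python
# Toy model of allele-specific hybridization (idealized; all sequences are made up).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def anneals(probe, target):
    """Idealized rule: the probe binds only if it is perfectly
    complementary to some stretch of the target strand."""
    return reverse_complement(probe) in target


def call_genotype(probes, sample_copies):
    """Return the sorted alleles whose probe captured at least one of the
    two chromosome copies present in the sample.

    probes: dict mapping an allele label to its allele-specific probe.
    sample_copies: the two target strands (one per homologous chromosome).
    """
    detected = [
        allele
        for allele, probe in probes.items()
        if any(anneals(probe, copy) for copy in sample_copies)
    ]
    return sorted(detected)


# Hypothetical target region with one SNP (C or T) at index 5.
copy_c = "AAGCTCGGA"
copy_t = "AAGCTTGGA"
probes = {
    "C": reverse_complement("GCTCGG"),  # probe matching the C allele
    "T": reverse_complement("GCTTGG"),  # probe matching the T allele
}

print(call_genotype(probes, [copy_c, copy_t]))  # heterozygous sample: both probes bind
print(call_genotype(probes, [copy_c, copy_c]))  # homozygous sample: only one probe binds
```

Note that the result tells us which alleles are present at the site, but not on which chromosome copy each one lies; this is exactly the loss of phase information that motivates haplotype inference.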

Figure 2.9: Allele-specific hybridization (the probe is the sequence segment with the C).

Chapter 3

AN OVERVIEW ON HAPLOTYPING INFERENCE

Any of the four nucleotides {A, T, C, G} could be present at any position in the genome, so it might be imagined that each SNP should have four alleles. Theoretically this is possible, but in practice most SNPs exist as just two variants. This is because of the way in which SNPs arise and spread in a population. A SNP originates when a point mutation occurs in a genome, converting one nucleotide into another. If the mutation is in the reproductive cells of an individual, then one or more of the children might inherit the mutation and, after many generations, the SNP may eventually become established in the population. But there are just two alleles: the original sequence and the mutated version. For a third allele to arise, a new mutation must occur at the same position in the genome in another individual, and this individual and his or her offspring must reproduce in such a way that the new allele becomes established. This scenario is not impossible but it is unlikely; consequently, the vast majority of SNPs are biallelic.

That allows us to represent a haplotype $h$ with $n$ SNPs as a row vector of length $n$ with binary entries. Each component $h_j$ of the vector indicates the state (i.e., the allele) at a particular polymorphic position in this haplotype: $h_j \in \{0,1\}$. Similarly, a genotype $g$, which is the conflated data of two haplotypes, is represented by an $n$-dimensional vector, where each component $g_j \in \{0,1,2\}$: 0 and 1 are related to homozygous sites, while heterozygous sites are denoted by 2. We introduce the conflate operator $\oplus : \{0,1\} \times \{0,1\} \to \{0,1,2\}$, defined as follows:

$$0 \oplus 0 = 0, \qquad 1 \oplus 1 = 1, \qquad 0 \oplus 1 = 1 \oplus 0 = 2,$$

which generalizes to vectors in the obvious way: given an $n$-dimensional genotype $g$ and a pair of $n$-dimensional haplotypes $h_1$ and $h_2$,

$$g = h_1 \oplus h_2 \iff g_j = h_{1,j} \oplus h_{2,j} \quad (j = 1,\dots,n).$$

Thus a pair $h_1, h_2$ of haplotypes is compatible with a genotype $g$ if $h_1$ and $h_2$ both contain 0 in a position where $g$ contains 0, 1 in a position where $g$ contains 1, and opposite binary values where $g$ contains 2 (see Table 3.1 for the haplotype coding and Table 3.2 for the genotype coding); they are said to generate or explain $g$, and both $h_1$ and $h_2$ are said to be consistent with $g$.

Alleles   C/A  G/A  C/G  T/C  T/C  G/A  C/G   Encoding
h1        C    G    C    T    T    A    C     0 0 0 0 0 1 0
h2        C    G    G    C    C    G    G     0 0 1 1 1 0 1
h3        A    A    C    T    T    A    C     1 1 0 0 0 1 0
h4        C    G    C    T    T    G    C     0 0 0 0 0 0 0
h5        A    A    C    T    T    G    C     1 1 0 0 0 0 0

Table 3.1: Example of haplotype coding. There are 7 SNPs and each of them is biallelic. The most frequent allele is encoded by 0 and the least frequent one by 1. The encoded haplotypes (the binary vectors) are in the last column of the table.

Genotype  SNP 1  SNP 2  SNP 3  SNP 4  SNP 5  SNP 6  SNP 7   Encoding
g1        C/A    G/G    C/C    T/C    T/C    A/G    C/C     2 0 0 2 2 2 0
g2        C/C    G/G    G/C    C/T    C/C    G/A    G/C     0 0 2 2 1 2 2
g3        A/A    A/A    C/C    T/C    T/C    A/A    C/C     1 1 0 2 2 1 0
g4        C/A    G/A    C/C    T/C    T/T    G/G    C/G     2 2 0 2 0 0 2

Table 3.2: Example of genotype coding. For each SNP of each genotype we have only mixed information, that is, we know whether the site is homozygous or heterozygous. The encoded genotypes (vectors in $\{0,1,2\}^7$) are in the last column of the table.

The Haplotyping Inference (HI) problem consists in determining the allele values for a set of SNPs given genotypes as input. In other words, given a set $G$ of genotypes, we have to find a set $H$ of haplotypes such that for each $g \in G$ there exist $h_1, h_2 \in H$ with $h_1 \oplus h_2 = g$. In the literature, different versions of haplotyping problems are known and have been extensively studied in recent years, under many objective functions, scenarios and applications.
This chapter is a comprehensive presentation of some approaches proposed for this biological problem and it mainly focuses on the formulations, algorithmic approaches, complexity results and existing software tools.
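The conflate operator and the compatibility condition defined above translate directly into code. The following Python fragment is a minimal sketch under stated assumptions: haplotypes and genotypes are stored as tuples of integers, and the function names (`conflate`, `resolutions`, `explains_all`) are chosen here for illustration and do not come from any existing software tool. It conflates two haplotypes, enumerates all unordered haplotype pairs that explain a genotype, and checks whether a candidate set H is a feasible answer to the HI problem for a set G.

```python
from itertools import product


def conflate(h1, h2):
    """Component-wise conflation: equal alleles keep their value (0 or 1),
    different alleles yield 2 (heterozygous site)."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return tuple(a if a == b else 2 for a, b in zip(h1, h2))


def resolutions(g):
    """Enumerate all unordered haplotype pairs (h1, h2) that generate g."""
    het = [j for j, value in enumerate(g) if value == 2]
    if not het:
        h = tuple(g)
        return [(h, h)]  # a fully homozygous genotype has a single explanation
    pairs = []
    # Fix h1 to 0 at the first heterozygous site so that (h1, h2) and
    # (h2, h1) are not both listed; the remaining sites are free.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1, h2 = list(g), list(g)
        for j, b in zip(het, (0,) + bits):
            h1[j], h2[j] = b, 1 - b
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


def explains_all(H, G):
    """HI feasibility check: every genotype in G is the conflation of two
    haplotypes of H (the same haplotype may be used twice)."""
    H = [tuple(h) for h in H]
    return all(
        any(conflate(a, b) == tuple(g) for a in H for b in H)
        for g in G
    )


g = (2, 0, 2, 1)
for h1, h2 in resolutions(g):
    print(h1, h2, conflate(h1, h2) == g)
```

A genotype with k heterozygous sites yields 2^(k-1) pairs for k >= 1, so this exhaustive enumeration is only practical for short genotypes; it is meant to make the definitions concrete, not to serve as an inference algorithm.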


More information

Lecture Notes: BIOL2007 Molecular Evolution

Lecture Notes: BIOL2007 Molecular Evolution Lecture Notes: BIOL2007 Molecular Evolution Kanchon Dasmahapatra (k.dasmahapatra@ucl.ac.uk) Introduction By now we all are familiar and understand, or think we understand, how evolution works on traits

More information

Sexual Reproduction and Genetics

Sexual Reproduction and Genetics Chapter Test A CHAPTER 10 Sexual Reproduction and Genetics Part A: Multiple Choice In the space at the left, write the letter of the term, number, or phrase that best answers each question. 1. How many

More information

Name Block Date Final Exam Study Guide

Name Block Date Final Exam Study Guide Name Block Date Final Exam Study Guide Unit 7: DNA & Protein Synthesis List the 3 building blocks of DNA (sugar, phosphate, base) Use base-pairing rules to replicate a strand of DNA (A-T, C-G). Transcribe

More information

Students: Model the processes involved in cell replication, including but not limited to: Mitosis and meiosis

Students: Model the processes involved in cell replication, including but not limited to: Mitosis and meiosis 1. Cell Division Students: Model the processes involved in cell replication, including but not limited to: Mitosis and meiosis Mitosis Cell division is the process that cells undergo in order to form new

More information

Linear Regression (1/1/17)

Linear Regression (1/1/17) STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression

More information

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation page 2 Page 2 2 Introduction Family Trees for all grades Goals Discover Darwin all over Pittsburgh in 2009 with Darwin 2009: Exploration is Never Extinct. Lesson plans, including this one, are available

More information

Cell Growth and Division

Cell Growth and Division Cell Growth and Division Why do cells divide* Life and reproduction require cell division You require constant cell reproduction to live Mitosis: development (a) mitotic cell division (b) mitotic cell

More information

Curriculum Map. Biology, Quarter 1 Big Ideas: From Molecules to Organisms: Structures and Processes (BIO1.LS1)

Curriculum Map. Biology, Quarter 1 Big Ideas: From Molecules to Organisms: Structures and Processes (BIO1.LS1) 1 Biology, Quarter 1 Big Ideas: From Molecules to Organisms: Structures and Processes (BIO1.LS1) Focus Standards BIO1.LS1.2 Evaluate comparative models of various cell types with a focus on organic molecules

More information

Darwin's theory of natural selection, its rivals, and cells. Week 3 (finish ch 2 and start ch 3)

Darwin's theory of natural selection, its rivals, and cells. Week 3 (finish ch 2 and start ch 3) Darwin's theory of natural selection, its rivals, and cells Week 3 (finish ch 2 and start ch 3) 1 Historical context Discovery of the new world -new observations challenged long-held views -exposure to

More information

Unit 5- Concept 1 THE DNA DISCOVERY

Unit 5- Concept 1 THE DNA DISCOVERY Unit 5- Concept 1 THE DNA DISCOVERY Inheritance has always puzzled people No one really knew how it worked Mendel wasn t known till the late 1800 s He didn t even know what chromosomes were! DNA was discovered

More information

2. What was the Avery-MacLeod-McCarty experiment and why was it significant? 3. What was the Hershey-Chase experiment and why was it significant?

2. What was the Avery-MacLeod-McCarty experiment and why was it significant? 3. What was the Hershey-Chase experiment and why was it significant? Name Date Period AP Exam Review Part 6: Molecular Genetics I. DNA and RNA Basics A. History of finding out what DNA really is 1. What was Griffith s experiment and why was it significant? 1 2. What was

More information

Unit 6 Reading Guide: PART I Biology Part I Due: Monday/Tuesday, February 5 th /6 th

Unit 6 Reading Guide: PART I Biology Part I Due: Monday/Tuesday, February 5 th /6 th Name: Date: Block: Chapter 6 Meiosis and Mendel Section 6.1 Chromosomes and Meiosis 1. How do gametes differ from somatic cells? Unit 6 Reading Guide: PART I Biology Part I Due: Monday/Tuesday, February

More information

BENCHMARK 1 STUDY GUIDE SPRING 2017

BENCHMARK 1 STUDY GUIDE SPRING 2017 BENCHMARK 1 STUDY GUIDE SPRING 2017 Name: There will be semester one content on this benchmark as well. Study your final exam review guide from last semester. New Semester Material: (Chapter 10 Cell Growth

More information

SCI-LS Genetics_khetrick Exam not valid for Paper Pencil Test Sessions

SCI-LS Genetics_khetrick Exam not valid for Paper Pencil Test Sessions SCI-LS Genetics_khetrick Exam not valid for Paper Pencil Test Sessions [Exam ID:78GZGM 1 The diagram above shows a picture of the DNA molecule. The DNA molecule can be described as A being flat like a

More information

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics. Evolutionary Genetics (for Encyclopedia of Biodiversity) Sergey Gavrilets Departments of Ecology and Evolutionary Biology and Mathematics, University of Tennessee, Knoxville, TN 37996-6 USA Evolutionary

More information

2. The following molecules are considered polymers except Mark all that apply a. Starch b. DNA c. Proteins d. Lipids e. Salt

2. The following molecules are considered polymers except Mark all that apply a. Starch b. DNA c. Proteins d. Lipids e. Salt Life s Major Molecules 1. Which is an organic molecule? a. Ne b. O2 c. CH4 d. NaCl e. H2O 2. The following molecules are considered polymers except Mark all that apply a. Starch b. DNA c. Proteins d. Lipids

More information

Molecular and cellular biology is about studying cell structure and function

Molecular and cellular biology is about studying cell structure and function Chapter 1 Exploring the World of the Cell In This Chapter Discovering the microscopic world Getting matter and energy Reading the genetic code Molecular and cellular biology is about studying cell structure

More information

EVOLUTION ALGEBRA. Freshman Seminar University of California, Irvine. Bernard Russo. University of California, Irvine. Winter 2015

EVOLUTION ALGEBRA. Freshman Seminar University of California, Irvine. Bernard Russo. University of California, Irvine. Winter 2015 EVOLUTION ALGEBRA Freshman Seminar University of California, Irvine Bernard Russo University of California, Irvine Winter 2015 Bernard Russo (UCI) EVOLUTION ALGEBRA 1 / 15 Understanding Genetics The study

More information

DNA THE CODE OF LIFE 05 JULY 2014

DNA THE CODE OF LIFE 05 JULY 2014 LIFE SIENES N THE OE OF LIFE 05 JULY 2014 Lesson escription In this lesson we nswer questions on: o N, RN and Protein synthesis o The processes of mitosis and meiosis o omparison of the processes of meiosis

More information

Meiosis and Mendel. Chapter 6

Meiosis and Mendel. Chapter 6 Meiosis and Mendel Chapter 6 6.1 CHROMOSOMES AND MEIOSIS Key Concept Gametes have half the number of chromosomes that body cells have. Body Cells vs. Gametes You have body cells and gametes body cells

More information

From gene to protein. Premedical biology

From gene to protein. Premedical biology From gene to protein Premedical biology Central dogma of Biology, Molecular Biology, Genetics transcription replication reverse transcription translation DNA RNA Protein RNA chemically similar to DNA,

More information

Short Answers Worksheet Grade 6

Short Answers Worksheet Grade 6 Short Answers Worksheet Grade 6 Short Answer 1. What is the role of the nucleolus? 2. What are the two different kinds of endoplasmic reticulum? 3. Name three cell parts that help defend the cell against

More information

Introduction to molecular biology. Mitesh Shrestha

Introduction to molecular biology. Mitesh Shrestha Introduction to molecular biology Mitesh Shrestha Molecular biology: definition Molecular biology is the study of molecular underpinnings of the process of replication, transcription and translation of

More information

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype Lecture Series 7 From DNA to Protein: Genotype to Phenotype Reading Assignments Read Chapter 7 From DNA to Protein A. Genes and the Synthesis of Polypeptides Genes are made up of DNA and are expressed

More information

Texas Biology Standards Review. Houghton Mifflin Harcourt Publishing Company 26 A T

Texas Biology Standards Review. Houghton Mifflin Harcourt Publishing Company 26 A T 2.B.6. 1 Which of the following statements best describes the structure of DN? wo strands of proteins are held together by sugar molecules, nitrogen bases, and phosphate groups. B wo strands composed of

More information

CCHS 2016_2017 Biology Fall Semester Exam Review

CCHS 2016_2017 Biology Fall Semester Exam Review CCHS 2016_2017 Biology Fall Semester Exam Review Biomolecule General Knowledge Macromolecule Monomer (building block) Function Structure 1. What type of biomolecule is hair, skin, and nails? Energy Storage

More information

Cell Division: the process of copying and dividing entire cells The cell grows, prepares for division, and then divides to form new daughter cells.

Cell Division: the process of copying and dividing entire cells The cell grows, prepares for division, and then divides to form new daughter cells. Mitosis & Meiosis SC.912.L.16.17 Compare and contrast mitosis and meiosis and relate to the processes of sexual and asexual reproduction and their consequences for genetic variation. 1. Students will describe

More information

CHAPTER 23 THE EVOLUTIONS OF POPULATIONS. Section C: Genetic Variation, the Substrate for Natural Selection

CHAPTER 23 THE EVOLUTIONS OF POPULATIONS. Section C: Genetic Variation, the Substrate for Natural Selection CHAPTER 23 THE EVOLUTIONS OF POPULATIONS Section C: Genetic Variation, the Substrate for Natural Selection 1. Genetic variation occurs within and between populations 2. Mutation and sexual recombination

More information

Protein Synthesis. Unit 6 Goal: Students will be able to describe the processes of transcription and translation.

Protein Synthesis. Unit 6 Goal: Students will be able to describe the processes of transcription and translation. Protein Synthesis Unit 6 Goal: Students will be able to describe the processes of transcription and translation. Protein Synthesis: Protein synthesis uses the information in genes to make proteins. 2 Steps

More information

Name: Period: EOC Review Part F Outline

Name: Period: EOC Review Part F Outline Name: Period: EOC Review Part F Outline Mitosis and Meiosis SC.912.L.16.17 Compare and contrast mitosis and meiosis and relate to the processes of sexual and asexual reproduction and their consequences

More information

Designer Genes C Test

Designer Genes C Test Northern Regional: January 19 th, 2019 Designer Genes C Test Name(s): Team Name: School Name: Team Number: Rank: Score: Directions: You will have 50 minutes to complete the test. You may not write on the

More information

Introduction to population genetics & evolution

Introduction to population genetics & evolution Introduction to population genetics & evolution Course Organization Exam dates: Feb 19 March 1st Has everybody registered? Did you get the email with the exam schedule Summer seminar: Hot topics in Bioinformatics

More information

Heredity Composite. Multiple Choice Identify the choice that best completes the statement or answers the question.

Heredity Composite. Multiple Choice Identify the choice that best completes the statement or answers the question. Heredity Composite Multiple Choice Identify the choice that best completes the statement or answers the question. 1. When a plant breeder crossed two red roses, 78% of the offspring had red flowers and

More information

1. Draw, label and describe the structure of DNA and RNA including bonding mechanisms.

1. Draw, label and describe the structure of DNA and RNA including bonding mechanisms. Practicing Biology BIG IDEA 3.A 1. Draw, label and describe the structure of DNA and RNA including bonding mechanisms. 2. Using at least 2 well-known experiments, describe which features of DNA and RNA

More information

Guided Notes Unit 6: Classical Genetics

Guided Notes Unit 6: Classical Genetics Name: Date: Block: Chapter 6: Meiosis and Mendel I. Concept 6.1: Chromosomes and Meiosis Guided Notes Unit 6: Classical Genetics a. Meiosis: i. (In animals, meiosis occurs in the sex organs the testes

More information

Peddie Summer Day School

Peddie Summer Day School Peddie Summer Day School Course Syllabus: BIOLOGY Teacher: Mr. Jeff Tuliszewski Text: Biology by Miller and Levine, Prentice Hall, 2010 edition ISBN 9780133669510 Guided Reading Workbook for Biology ISBN

More information

Processes of Evolution

Processes of Evolution 15 Processes of Evolution Forces of Evolution Concept 15.4 Selection Can Be Stabilizing, Directional, or Disruptive Natural selection can act on quantitative traits in three ways: Stabilizing selection

More information

BIOLOGY I: COURSE OVERVIEW

BIOLOGY I: COURSE OVERVIEW BIOLOGY I: COURSE OVERVIEW The academic standards for High School Biology I establish the content knowledge and skills for Tennessee students in order to prepare them for the rigorous levels of higher

More information

Outline for today s lecture (Ch. 14, Part I)

Outline for today s lecture (Ch. 14, Part I) Outline for today s lecture (Ch. 14, Part I) Ploidy vs. DNA content The basis of heredity ca. 1850s Mendel s Experiments and Theory Law of Segregation Law of Independent Assortment Introduction to Probability

More information
