Early History up to Schedule. Proteins DNA & RNA Schwann and Schleiden Cell Theory Charles Darwin publishes Origin of Species

Similar documents
Computational Biology: Basics & Interesting Problems

Hidden Markov Models in Bioinformatics 14.11

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

1. In most cases, genes code for and it is that

Bio 1B Lecture Outline (please print and bring along) Fall, 2007

Related Courses He who asks is a fool for five minutes, but he who does not ask remains a fool forever.

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME

BME 5742 Biosystems Modeling and Control

Organization of Genes Differs in Prokaryotic and Eukaryotic DNA Chapter 10 p

Principles of Genetics

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Chapter 17. From Gene to Protein. Biology Kevin Dees

SEQUENCE ALIGNMENT BACKGROUND: BIOINFORMATICS. Prokaryotes and Eukaryotes. DNA and RNA

HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM

Algorithms in Computational Biology (236522) spring 2008 Lecture #1

Algorithms in Bioinformatics

(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.

Evolutionary Analysis of Viral Genomes

2. What was the Avery-MacLeod-McCarty experiment and why was it significant? 3. What was the Hershey-Chase experiment and why was it significant?

Lecture 18 June 2 nd, Gene Expression Regulation Mutations

3/1/17. Content. TWINSCAN model. Example. TWINSCAN algorithm. HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM

O 3 O 4 O 5. q 3. q 4. Transition

From Gene to Protein

Full file at CHAPTER 2 Genetics

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/8/16

Honors Biology Reading Guide Chapter 11

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Introduction to Molecular and Cell Biology

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/1/18

Data Mining in Bioinformatics HMM

2012 Univ Aguilera Lecture. Introduction to Molecular and Cell Biology

GCD3033:Cell Biology. Transcription

Biology. Biology. Slide 1 of 26. End Show. Copyright Pearson Prentice Hall

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

full file at

BINF6201/8201. Molecular phylogenetic methods

Translation Part 2 of Protein Synthesis

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:

How Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

Bioinformatics Chapter 1. Introduction

UNIT 5. Protein Synthesis 11/22/16

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11

Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço

EVOLUTIONARY DISTANCES

GSBHSRSBRSRRk IZTI/^Q. LlML. I Iv^O IV I I I FROM GENES TO GENOMES ^^^H*" ^^^^J*^ ill! BQPIP. illt. goidbkc. itip31. li4»twlil FIFTH EDITION

Introduction to Bioinformatics. Shifra Ben-Dor Irit Orr

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.

Phylogenetic Tree Reconstruction

CGS 5991 (2 Credits) Bioinformatics Tools

Molecular evolution - Part 1. Pawan Dhar BII

Multiple Choice Review- Eukaryotic Gene Expression

Warm-Up- Review Natural Selection and Reproduction for quiz today!!!! Notes on Evidence of Evolution Work on Vocabulary and Lab

Translation. A ribosome, mrna, and trna.

Designer Genes C Test

Flow of Genetic Information

FUNDAMENTALS OF MOLECULAR EVOLUTION

Controlling Gene Expression

Proteomics. 2 nd semester, Department of Biotechnology and Bioinformatics Laboratory of Nano-Biotechnology and Artificial Bioengineering

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

Algorithmics and Bioinformatics

Understanding relationship between homologous sequences

Regulation of Gene Expression

Comparative Bioinformatics Midterm II Fall 2004

Honors Biology Final Exam Highlights First Semester Final Review, 200 Multiple Choice Questions, 200 Points

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016

Bioinformatics Practical for Biochemists

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype

The Gene The gene; Genes Genes Allele;

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Evolutionary Models. Evolutionary Models

Introduction to molecular biology. Mitesh Shrestha

Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/26

Biology 112 Practice Midterm Questions

Sequences, Structures, and Gene Regulatory Networks

A DNA Sequence 2017/12/6 1

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

Practice Test on Cell Biology (the REAL test is on Friday the 17th) KEY LO: Describe and explain the Central Dogma. SLE: Meet NGSS.


Bioinformatics course

Videos. Bozeman, transcription and translation: Crashcourse: Transcription and Translation -

Phylogenetics: Parsimony

Organelle genome evolution

C3020 Molecular Evolution. Exercises #3: Phylogenetics

Bioinformatics 2 - Lecture 4

Genetic Variation: The genetic substrate for natural selection. Horizontal Gene Transfer. General Principles 10/2/17.

Dr. Amira A. AL-Hosary

SCIENTIFIC EVIDENCE TO SUPPORT THE THEORY OF EVOLUTION. Using Anatomy, Embryology, Biochemistry, and Paleontology

GREENWOOD PUBLIC SCHOOL DISTRICT Genetics Pacing Guide FIRST NINE WEEKS Semester 1

Information Theoretic Distance Measures in Phylogenomics

C.DARWIN ( )

Bioinformatics Practical for Biochemists

Protein Synthesis. Unit 6 Goal: Students will be able to describe the processes of transcription and translation.

Homework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:

Interphase & Cell Division

Endowed with an Extra Sense : Mathematics and Evolution

Lecture Notes: BIOL2007 Molecular Evolution

7. Tests for selection

Transcription:

Schedule Bioinformatics and Computational Biology: History and Biological Background (JH) 0.0 he Parsimony criterion GKN.0 Stochastic Models of Sequence Evolution GKN 7.0 he Likelihood criterion GKN 0.0 ut: 9-0 = (Friday) rees in phylogenetics and population genetics GKN.0 Estimating phylogenies and genealogies I GKN 7.0 Estimating phylogenies and genealogies II GKN.0 Estimating phylogenies and genealogies III. Alignment Algorithms I (Optimisation) (JH) 7. Alignment Algorithms II (Statistical Inference) (JH) 0. Finding Signals in Sequences (JH). Stochastic Grammars and their Biological Applications: Hidden Markov Models (JH) 7.0 Stochastic Grammars and their Biological Applications: Context Free Grammars (JH). RNA molecules and their analysis (JH). Open Problems in Bioinformatics and Computational Biology I (JH) 8. Possibly: Evolving Grammars, Pedigrees from Genomes Open Problems in Bioinformatics and Computational Biology II (GKN). Possibly: he phylogeny of language: traits and dates, What can FIV sequences tell us about their host cat population? Bioinformatics and Computational Biology: History & Biological Background Early History up to 9 88 Schwann and Schleiden Cell heory 89 Charles Darwin publishes Origin of Species 86 Mendel discovers basic laws of inheritance (largely ignored) 869 Miescher Discovers DNA 900 Mendels laws rediscovered. 9 Avery shows DNA contains genetic information 9 Corey & Pauling Secondary structure elements of a protein. 9 Watson & Crick proposes DNA structure and states Proteins Proteins: a string of amino acids. Often folds up in a well defined dimensional structure. Has enzymatic, structural and regulatory functions. DNA & RNA DNA: he Information carrier in the genetic material. Usually double helix. RNA: messenger tape from DNA to protein, regulatory, enzymatic and structural roles as well. More labile than DNA

An Example: t-rna History up to 9-66 Promoter Gene 9 Sanger first protein sequence Bovine Insulin 97 Kendrew structure of Whale Myoglobin 98 Crick, Goldschmidt,. Central Dogma 98 First quantitative method for phylogeny reconstruction (UGPMA - Sokal and Michener) 99 Operon Models proposed (Jakob and Monod) 966 Genetic Code Determined 967 First RNA sequencing From Paul Higgs he Central Dogma he Genetic Code Genetic Code: Mapping from - nucleotides (codons) to amino acids (0) + stop codon. his 6--> mapping creates the distinction silent/replacement substitution. Substitutions Number Percent otal in all codons 9 00 Synonymous Nonsynonymous 7 Missense 9 7 Nonsense Ser hr Glu Met Cys Leu Met Gly Gly CA AC GAG AG G A AG GGG GGA * * * * * * * ** CG ACA GGG AA A CA AG GG AA Ser hr Gly Ile yr Leu Met Gly Ile

History 966-80 969-70 emin + Baltimore Reverse ranscriptase 970 Needleman-Wunch algorithm for pairwise alignment 97-7 Hartigan-Fitch-Sankoff algorithm for assigning nucleotides to inner nodes on a tree. Genes, Gene Structure & Alternative Splicing Presently estimated Gene Number:.000, Average Gene Size: 7 kb he largest gene: Dystrophin. Mb - 0.6% coding 6 hours to transcribe. he shortest gene: trna YR 00% coding Largest exon: ApoB exon 6 is 7.6 kb Smallest: <0bp Average exon number: 9 Largest exon number: itin 6 Smallest: Largest intron: WWOX intron 8 is 800 kb Smallest: 0s of bp Largest polypeptide: itin 8.8 smallest: tens small hormones. Intronless Genes: mitochondrial genes, many RNA genes, Interferons, Histones,.. 976/79 First viral genome MS/!X7 977/8 Sharp/Roberts Introns 979 Alternative Splicing 980 Mitochondrial Genome (6.69bp) and the discovery of alternative codes. A challenge to automated annotation.. How widespread is it?. Is it always functional?. How does it evolve? Cartegni,L. et al.(00) Listening to Silence and understanding nonsense: Exonic mutations that affect splicing Nature Reviews Genetics..8-, HMG p9-9 Strings and Comparing Strings 970 Needleman-Wunch algorithm for pairwise alignment for maximizing similarity 97 Sellers-Sankoff algorithm for pairwise alignment for minimizing distance (Parsimony) Initial condition: D 0,0 =0. D i,j := D(s[:i], s[:j]) D i,j =min{d i-,j- + d(s[i],s[j]), D i,j- + g, D i-,j +g} Alignment: CAGG I= v=) g=0 i v Cost 7 -G G 0 9 7 0 0 0 0 0 0 0 0 0 0 0 0 0 C A G G 97- Sankoff algorithm for multiple alignment for minimizing distance (Parsimony) and finding phylogeny simultaneously History 980-9 98 Felsenstein Proposes algorithm to calculate probability of observed nucleotides on leaves on a tree. 98-8 Griffiths, Hudson he Ancestral Recombination Graph. 987/89 First biological use of Hidden Markov Model (HMM) (Lander and Green, Churchill) 99 horne, Kishino and Felsenstein proposes statistical model for pairwise alignment. 99 First biological use of stochastic context free grammar (Haussler)

Homology: Genealogical Structures he existence of a common ancestor (for instance for sequences) Phylogeny Only finding common ancestors. Only one ancestor. Pedigree: cagtct ccagtcg ccggtcg ime slices ime All positions have found a common ancestors on one sequence All positions have found a common ancestors Ancestral Recombination Graph the ARG i. Finding common ancestors. ii. A sequence encounters Recombinations iii. A point ARG is a phylogeny N Population Enumerating rees: Unrooted & valency History 99-00 99 First prokaryotic genome H. influenzae 996 First unicellular eukaryotic genome Yeast 998 he first multi-cellular eukaryotic genome C.elegans 000 Drosophila melanogaster, Arabidopsis thaliana 00 Human Genome n" (n " )! #( j " ) = (n " )! j= Recursion: n = (n-) n- Initialisation: = = = 6 0 7 9 8 n" 7.9 0 0. 0.0 0 6. 00 9 0 0 00 Mouse Genome 00 Chimp Genome

6 7 he Human Genome http://www.sanger.ac.uk/hgp/ & R.Harding & HMG (00) p 8 6 9 0 7 8 9 0 X Y mitochondria Molecular Evolution and Gene Finding: wo HMMs AGGGACCAAAGCG... P coding {AG-->GG} or AGGGACAAGGCG... P non-coding {AG-->GG} 76 6 8 0 8 8 07 00 79 97 98 (chromosome ) "#globin 0 88 86 $ globin 7 66 8 6 Myoglobin.06.*0 9 bp *.000 6*0 bp Simple Prokaryotic Simple Eukaryotic Exon Exon Exon *0 flanking flanking *0 bp *0 DNA: Protein: AGCCAGCGAAAGGACAGGA aa aa aa aa aa aa aa aa aa aa 0 bp H H H hree Questions for Hidden Structures. What is the probability of the data? What is the most probable hidden configuration? What is the probability of specific hidden state? raining: Given a set of instances, find parameters making them probable if they were independent. HMM/Stochastic Regular Grammar O O O O O O 6 O 7 O 8 O 9 O 0 W L i i j j L W W R H P = O = P(O H = ) SCFG - Stochastic Context Free Grammars H P = j " O p j, H = j Bioinformatics and Computational Biology: History and Biological Background (JH) he Parsimony criterion GKN Stochastic Models of Sequence Evolution GKN he Likelihood criterion GKN rees in phylogenetics and population genetics GKN Estimating phylogenies and genealogies I GKN Estimating phylogenies and genealogies II GKN Estimating phylogenies and genealogies III GKN Alignment Algorithms I (Optimisation) (JH) Alignment Algorithms II (Statistical Inference) (JH) Finding Signals in Sequences (JH) Stochastic Grammars and their Biological Applications: Hidden Markov Models (JH) Stochastic Grammars and their Biological Applications: Context Free Grammars (JH) RNA molecules and their analysis (JH) Open Problems in Bioinformatics and Computational Biology I (JH) Possibly: Evolving Grammars, Pedigrees from Genomes Open Problems in Bioinformatics and Computational Biology II (GKN) Possibly: he phylogeny of language: traits and dates, What can FIV sequences tell us about their host cat population?