Biology Tutorial. Aarti Balasubramani Anusha Bharadwaj Massa Shoura Stefan Giovan



Viruses A T4 bacteriophage injecting DNA into a cell. Influenza A virus Electron micrograph of HIV. Cone-shaped cores are sectioned in various orientations. Viral genomic RNA is located in the electron-dense wide end of core. http://stc/istc.nsf/va_webpages/influenzaengprint http://pathmicro.med.sc.edu

Life Begins with Cells

All cells are Prokaryotic or Eukaryotic http://course1.winona.edu/

Eukaryotic Cell Endothelial cells under the microscope. Nuclei are stained blue with DAPI, microtubules are marked green by an antibody bound to FITC and actin filaments are labeled red with phalloidin bound to TRITC. Bovine pulmonary artery endothelial cells

Cell Organelles
Nucleus = contains the genetic material
Mitochondrion = produces energy
Golgi complex = protein distribution
Endoplasmic Reticulum and Ribosomes = protein factory
Lysosome = degradation
http://microbewiki.kenyon.edu/

Plasma Membrane

DNA Replication
Base Pairing: A=T, C≡G
http://www.youtube.com/watch?v=tev62zrm2p0&feature=related

Life Cycle of a Cell Cell division RNA and protein synthesis RNA and protein synthesis Resting cells DNA Replication

The Central Dogma of Biology Replication

Transcription http://www.youtube.com/watch?v=ztpkv7wc3yu

Translation http://www.youtube.com/watch?v=-zb6r1mmtkc

Cellular Biology Outline Organelle Structure/Function Central Dogma Biochemistry Energy Storage/Utilization Macromolecules Bioinformatics Sequences and Databases Alignments, Tree Building, Modeling

Cells are Composed of a Molecular Hierarchy
Small molecules -> Macromolecules -> Supramolecular complexes

BONDS, JUST BONDS
Covalent: nuclei share common electrons. STRONG!!
Non-Covalent: no common electrons. WEAK!! (Ionic, Non-Ionic)
http://publications.nigms.nih.gov/chemhealth/images/ch1_bonds.gif

Macromolecular Structures are Stabilized by Weak Forces

Force                                    Strength, kJ mol^-1   Distance Dependence   Effective Range, nm
Van der Waals interactions               0.4-4                 r^-6                  0.2
Hydrogen bonds                           4-48                  r^-3                  0.3
Electrostatic interactions (unscreened)  20-50                 r^-1                  5-50
Hydrophobic interactions                 <40                   ?                     ?

Hydrophobic Interactions
Structures formed by amphipathic molecules in H₂O
Vibrational frequencies of the O-H bond of H₂O in ice, liquid H₂O and CCl₄
van Holde, Johnson & Ho, Principles of Physical Biochemistry, Prentice Hall, Upper Saddle River, NJ (1998)

What Is DNA Made of? 5′ 3′

DNA The Double Helix

Levels of Chromatin Packing

The Human Genome

DNA to Amino Acids

Amino Acids Proteins Building Blocks

The Making of a Polypeptide Chain

The Four Levels of Protein Structure
Primary: linear arrangement of monomeric units
Secondary: local regular structure
Tertiary: 3-dimensional folding of the molecule
Quaternary: spatial arrangement of multiple subunits

Single Nucleotide Mutations

DNA Mutations

Experimental Techniques

Restriction Digestion

Use of Restriction Digestion to Identify Mutations (a) Wild-type and mutant DNA sequences

Gel Electrophoresis

Gel Electrophoresis-Visualizing DNA

The Polymerase Chain Reaction (PCR)

Cloning a human gene in a bacterial plasmid

Cellular Biology Outline Organelle Structure/Function Central Dogma Biochemistry Energy Storage/Utilization Macromolecules Bioinformatics Sequences and Databases Alignments, Tree Building, Modeling

Phenotype Tree Building How Related are Organisms? What do they eat? Where do they live? How do they divide? Move? Etc. Qualitative http://nai.arc.nasa.gov/seminars/68_rivera/tree.jpg

Genotype Tree Building How Related are Organisms? How similar is their genome? Proteome? MOLECULAR EVOLUTION Quantitative http://nai.arc.nasa.gov/seminars/68_rivera/tree.jpg

Comparison of Genomes
1977: Φ-X174 genome sequenced. Only about 5.4 kbp
1997: E. coli K-12 genome sequenced. About 4.6×10^3 kbp
2007: Watson's genome sequenced! About 3×10^6 kbp!
About 0.1% difference between human genomes and 1% difference between humans and chimps!

Bioinformatics is Highly Interdisciplinary
Proteomics and Genomics
Structural and Computational Biology
Systems Biology
Computer Science, Probabilistic Modeling
Computational Sequence Analysis
What's in a sequence?

Power of Prediction
Can we
predict structural and functional properties of a protein given its sequence?
predict the consequences of a mutation?
design proteins or drugs with specific functions?
Everything we need to know is at our fingertips; we just need a better understanding of the natural world

Protein Structure
$F = U - TS$
$G = H - TS$
http://www.news.cornell.edu/stories/aug06/protein_folding.jpg

Secondary Structure Prediction
2° structures form beneficial H-bonds (lower E): α-helices, β-sheets
Dihedral angles (φ, ψ)
Source: Wikipedia

Tertiary Structure Prediction
Homology/Comparative Modeling: BEST. Structure of a very closely related protein is known
Fold Recognition/Threading: OFTEN IS ENOUGH. Similar folds available but no close relative
Knowledge Based or A Priori Predictions: ONLY POSSIBLE FOR VERY SHORT PROTEINS. Fold prediction but without experimental quality

Sequence Alignments
FASTA Text Format
>header my sequence
THISISMYSEQ
>header my thesis
THESISTHYSTING
Alignment
T H I S I S M Y S E Q
T H E S I S T H Y S T I N G
What can we learn from this?
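The FASTA text format on this slide (a `>` header line followed by sequence lines) can be read with a few lines of code. The following is a minimal sketch, not a full-featured parser; the function name `parse_fasta` is a hypothetical helper chosen for illustration, and the input is the slide's own toy example.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = """>header my sequence
THISISMYSEQ
>header my thesis
THESISTHYSTING
"""
records = parse_fasta(example)
for name, seq in records:
    print(name, len(seq), seq)
```

Sequence lines are concatenated, so a record wrapped over several lines comes back as one string.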

Alignments
Pairwise: Dot Plot; Global (N-W) or Local (S-W)
Simple Database Searches: FASTA/BLAST
Multiple Alignments: CLUSTAL
Advanced Strategies: PSI/PHI-BLAST, HMMs
[Figure: dot plot of two subunits in human hemoglobin; axes: Alpha Chain, Beta Chain]

Databases
Nucleotide Sequence Database Collaboration: DDBJ, EMBL, GenBank at NCBI
Amino Acid Databases: UniProt, SWISS-PROT, TrEMBL
Structural: PDB, MMDB, MSD
Very Many Derivations!
http://www.ncbi.nlm.nih.gov/database/

Scoring Matrices
PAM Matrix: Point Accepted Mutation. PAM1 estimates the substitution rate if 1% of AA had changed. Standards: PAM30 and PAM60
BLOSUM: BLOcks of Amino Acid SUbstitution Matrix. BLOSUM80 blocks together sequences with greater than 80% similarity.
Less divergent: PAM1, BLOSUM80
More divergent: PAM250, BLOSUM45

FASTA and BLAST
FASTA: FAST-All, rapid AA or NT alignments
BLAST: Basic Local Alignment Search Tool
Scoring Alignments
Raw and bit scores: $S' = \frac{\lambda S - \ln K}{\ln 2}$
Significance of local alignment: $E = m n \, 2^{-S'}$
Significance of global alignment: $Z = \frac{x - \mu}{\sigma}$
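The scoring quantities on this slide can be computed directly. The sketch below assumes the usual reading of the formulas: the bit score S' = (λS − ln K) / ln 2, the expected number of chance hits E = m·n·2^(−S') for a query of length m and a database of length n, and a Z-score that compares an alignment score x with the mean and standard deviation of scores from shuffled sequences. The values of λ and K depend on the scoring system and are passed in as parameters; the numbers used here are placeholders for illustration only.

```python
import math
import statistics


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score S into a bit score S'."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, m, n):
    """Expected number of chance alignments with at least this bit score."""
    return m * n * 2.0 ** (-bits)


def z_score(x, shuffled_scores):
    """Standardize score x against scores obtained from shuffled sequences."""
    mu = statistics.mean(shuffled_scores)
    sigma = statistics.pstdev(shuffled_scores)
    return (x - mu) / sigma


# Placeholder parameters (assumed values, not taken from the slides).
bits = bit_score(raw_score=80, lam=0.3, k=0.1)
print(round(bits, 2), e_value(bits, m=250, n=1_000_000))
```

Because E scales with m·n, the same bit score becomes less significant as the database grows.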

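Nucleotide Sequence Distances
Jukes-Cantor, single parameter: $d = -\frac{3}{4} \ln\left(1 - \frac{4}{3} p\right)$
Kimura, 2 parameter: $d = \frac{1}{2} \ln\frac{1}{1 - 2p - q} + \frac{1}{4} \ln\frac{1}{1 - 2q}$
[Figure: substitution diagrams among A, G, C and T for the two models]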
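A short sketch of both distance corrections follows. It assumes the standard conventions: in the Jukes-Cantor formula p is the fraction of sites that differ, while in the Kimura two-parameter formula p is the fraction of sites showing a transition (A<->G, C<->T) and q the fraction showing a transversion. The helper names and the toy sequences are hypothetical.

```python
import math

# Purine <-> purine and pyrimidine <-> pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def count_differences(seq1, seq2):
    """Return (p, q): fractions of transition and transversion sites."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    return transitions / n, transversions / n


def jukes_cantor(p):
    """Jukes-Cantor distance from the fraction p of differing sites."""
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(p, q):
    """Kimura distance from transition (p) and transversion (q) fractions."""
    return 0.5 * math.log(1 / (1 - 2 * p - q)) + 0.25 * math.log(1 / (1 - 2 * q))


p, q = count_differences("AAAAAAAAAA", "GAAAAAAAAC")
print(jukes_cantor(p + q), kimura_2p(p, q))
```

Both corrections return values larger than the raw fraction of differing sites, because they account for multiple substitutions at the same position.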

Distance Based Tree Building
Tree Building => UPGMA
Smallest distance element -> nearest neighbors
$t_1 = t_2 = 0.5\, d_{12}$

     1    2    3    4    5
1    -
2    0.1  -
3    0.8  0.8  -
4    0.8  1    0.3  -
5    0.9  0.9  0.3  0.2  -

[Tree: 1 and 2 joined, branch lengths 0.05 and 0.05]

Distance Based Tree Building
Tree Building => UPGMA
Smallest distance element -> nearest neighbors
$t_4 = t_5 = 0.5\, d_{45}$

          6 (1,2)  3    4    5
6 (1,2)   -
3         0.8      -
4         0.9      0.3  -
5         0.9      0.3  0.2  -

[Tree: 1 and 2 joined at node 6; 4 and 5 joined, branch lengths 0.10 and 0.10]

Distance Based Tree Building
Tree Building => UPGMA
Smallest distance element -> nearest neighbors
$t_3 = 0.5\, d_{37}$

          6 (1,2)  3    7 (4,5)
6 (1,2)   -
3         0.8      -
7 (4,5)   0.9      0.3  -

[Tree: 3 joined to node 7 (4,5) at height 0.15; node 6 (1,2) still separate]

Distance Based Tree Building
Tree Building => UPGMA
Smallest distance element -> nearest neighbors
$t_6 = 0.5\, d_{68}$

            6 (1,2)  8 (3,4,5)
6 (1,2)     -
8 (3,4,5)   0.85     -

[Tree: node 6 (1,2) and node 8 (3,4,5) joined at root node 9, height 0.425]
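The four merge steps of this worked example can be reproduced with a short clustering routine. This is a sketch under stated assumptions: the function name `cluster` and its `size_weighted` flag are hypothetical. With the default `size_weighted=False`, the distance from a new cluster to any other cluster is the plain average of the two merged rows, which matches the 0.85 and 0.425 shown in the last step. UPGMA as usually defined averages over all leaf pairs (`size_weighted=True`), which would give about 0.867 for the final distance instead.

```python
def cluster(distances, labels, size_weighted=False):
    """Agglomerative clustering; returns merges as (a, b, new_node, height)."""
    d = {frozenset(pair): value for pair, value in distances.items()}
    sizes = {label: 1 for label in labels}
    active = list(labels)
    next_label = max(labels) + 1
    merges = []
    while len(active) > 1:
        # Smallest distance element -> nearest neighbors.
        dmin, a, b = min(
            (d[frozenset((x, y))], x, y)
            for i, x in enumerate(active)
            for y in active[i + 1:]
        )
        new = next_label
        next_label += 1
        wa, wb = (sizes[a], sizes[b]) if size_weighted else (1, 1)
        # Distance from the new cluster to every remaining cluster.
        for c in active:
            if c not in (a, b):
                d[frozenset((new, c))] = (
                    wa * d[frozenset((a, c))] + wb * d[frozenset((b, c))]
                ) / (wa + wb)
        sizes[new] = sizes[a] + sizes[b]
        active = [c for c in active if c not in (a, b)] + [new]
        # The new node sits at half the distance between the merged clusters.
        merges.append((a, b, new, dmin / 2))
    return merges


# Distance matrix from the first UPGMA slide.
SLIDE_DISTANCES = {
    (1, 2): 0.1,
    (1, 3): 0.8, (2, 3): 0.8,
    (1, 4): 0.8, (2, 4): 1.0, (3, 4): 0.3,
    (1, 5): 0.9, (2, 5): 0.9, (3, 5): 0.3, (4, 5): 0.2,
}
merges = cluster(SLIDE_DISTANCES, [1, 2, 3, 4, 5])
for a, b, new, height in merges:
    print(f"join {a} and {b} into node {new} at height {height:.3f}")
```

New nodes are numbered 6, 7, 8 and 9 in merge order, following the node labels used on the slides.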

Distance Based Tree Building
UPGMA is efficient but makes the non-biological assumption that the rate of substitution is constant for all branches. It is useful in a variety of applications, such as microarray data processing.
Neighbor-Joining does not make this assumption and is still efficient. It is more accurate for use in phylogenetic analyses.
Also -> Maximum Parsimony, Maximum Likelihood, Minimum Evolution, and Bayesian methods

Energy Calculations

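Molecular Mechanics
$E = K + V$
$V = \sum_i \left( V_{i,\mathrm{bonding}} + V_{i,\mathrm{nonbond}} \right)$
E: total energy; K: kinetic energy; V: potential energy (sum of covalent and noncovalent interactions)
$K = \sum_i \frac{1}{2} m_i v_i^2 = \frac{1}{2} \sum_i \frac{p_i^2}{m_i}$
$v_i$: velocity of particle i; $p_i$: momentum of particle i
$F_i = -\nabla_i V = -\left( \frac{\partial V}{\partial x_i}, \frac{\partial V}{\partial y_i}, \frac{\partial V}{\partial z_i} \right)$
$F_i$: force acting on particle i (negative gradient of the potential energy)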
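To make the definitions on this slide concrete, the sketch below evaluates the kinetic energy K as the sum of ½·m·v² over particles, and the force on one particle as the negative gradient of the potential, estimated by central differences. The harmonic potential is an assumption chosen only because its gradient is easy to check by hand; it is not a molecular force field, and all names are hypothetical.

```python
def kinetic_energy(masses, velocities):
    """K = sum over particles of 1/2 * m * |v|^2, with v as (vx, vy, vz)."""
    return sum(
        0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities)
    )


def force(potential, position, h=1e-6):
    """Force on one particle: minus the numerical gradient of the potential."""
    components = []
    for axis in range(3):
        plus = list(position)
        minus = list(position)
        plus[axis] += h
        minus[axis] -= h
        # Central difference approximation of dV/d(axis).
        components.append(-(potential(plus) - potential(minus)) / (2 * h))
    return tuple(components)


def harmonic(position, k=2.0):
    """Toy potential V = 1/2 * k * |r|^2 (assumed for illustration)."""
    return 0.5 * k * sum(c * c for c in position)


print(kinetic_energy([2.0], [(1.0, 2.0, 2.0)]))
print(force(harmonic, (1.0, 0.0, -0.5)))
```

For the harmonic potential the analytic force is −k·r, so the second line should print values close to (−2, 0, 1).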

Fold It!!
http://fold.it/portal/info/science

Pairwise Alignment
Dot Plot: visual and qualitative
Needleman-Wunsch Global Alignment: alignment over entire sequence
Smith-Waterman Local Alignment: alignment over subsequences
[Figure: dot plot of two subunits in human hemoglobin; axes: Alpha Chain, Beta Chain]
http://lectures.molgen.mpg.de/pairwise/dotplots/
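A dot plot is simple enough to build by hand: put one sequence on each axis and mark every cell where the two residues are identical. The sketch below is a minimal text version with no window filtering; the function name and the toy sequences are made up for illustration and are not the hemoglobin chains from the figure.

```python
def dot_plot(seq1, seq2):
    """Return one text row per residue of seq1; '*' marks identical residues."""
    return [
        "".join("*" if a == b else "." for b in seq2)
        for a in seq1
    ]


# Toy sequences (assumed for illustration).
rows = dot_plot("HEAGAWGHEE", "PAWHEAE")
print("  " + "PAWHEAE")
for residue, row in zip("HEAGAWGHEE", rows):
    print(residue, row)
```

Runs of `*` along a diagonal indicate stretches that match without gaps, which is what makes the plot useful as a quick visual check before a rigorous alignment.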

N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Scoring Matrix, S

     F  M  D  T  P  L  N  E
F    1
K
H
M       1
E                          1
D          1
P                1
L                   1
E                          1

Simple scoring matrix for these sequences
Matches get a score of +1
Mismatches (blank) get a score of -2
One could also use a BLOSUM or PAM scoring matrix, for example

N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Alignment Matrix, F

          F    M    D    T    P    L    N    E
     0   -2   -4   -6   -8  -10  -12  -14  -16
F   -2   +1
K   -4
H   -6
M   -8
E  -10
D  -12
P  -14
L  -16
E  -18

$F_{ij} = \max \{ F_{i-1,j-1} + S_{kl},\; F_{i-1,j} + \mathrm{gap},\; F_{i,j-1} + \mathrm{gap} \}$

N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Build Scoring Matrix, F

          F    M    D    T    P    L    N    E
     0   -2   -4   -6   -8  -10  -12  -14  -16
F   -2   +1   -1   -3   -5   -7   -9  -11  -13
K   -4   -1   -1
H   -6
M   -8
E  -10
D  -12
P  -14
L  -16
E  -18

$F_{ij} = \max \{ F_{i-1,j-1} + S_{kl},\; F_{i-1,j} + \mathrm{gap},\; F_{i,j-1} + \mathrm{gap} \}$

N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Build Scoring Matrix, F

          F    M    D    T    P    L    N    E
     0   -2   -4   -6   -8  -10  -12  -14  -16
F   -2   +1   -1   -3   -5   -7   -9  -11  -13
K   -4   -1   -1   -3   -5   -7   -9  -11  -13
H   -6   -3   -3   -3   -5   -7   -9  -11  -13
M   -8   -5   -2   -4   -5   -7   -9  -11  -13
E  -10   -7   -4   -4   -6   -7   -9  -11  -10
D  -12   -9   -6   -3   -5   -7   -9  -11  -12
P  -14  -11   -8   -5   -5   -4   -6   -8  -10
L  -16  -13  -10   -7   -7   -6   -3   -5   -7
E  -18  -15  -12   -9   -9   -8   -5   -5   -4

$F_{ij} = \max \{ F_{i-1,j-1} + S_{kl},\; F_{i-1,j} + \mathrm{gap},\; F_{i,j-1} + \mathrm{gap} \}$
Overall alignment score: -4 (bottom-right cell)

N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Trace Back to Determine Optimum Alignment

          F    M    D    T    P    L    N    E
     0   -2   -4   -6   -8  -10  -12  -14  -16
F   -2   +1   -1   -3   -5   -7   -9  -11  -13
K   -4   -1   -1   -3   -5   -7   -9  -11  -13
H   -6   -3   -3   -3   -5   -7   -9  -11  -13
M   -8   -5   -2   -4   -5   -7   -9  -11  -13
E  -10   -7   -4   -4   -6   -7   -9  -11  -10
D  -12   -9   -6   -3   -5   -7   -9  -11  -12
P  -14  -11   -8   -5   -5   -4   -6   -8  -10
L  -16  -13  -10   -7   -7   -6   -3   -5   -7
E  -18  -15  -12   -9   -9   -8   -5   -5   -4

Legend for trace-back arrows: Match or Mismatch; Gap in Sequence 1; Gap in Sequence 2
Seq1: F K H M E D - P L - E
Seq2: F - - M - D T P L N E
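The matrix fill and trace-back walked through on these slides can be sketched in a few lines. The code below assumes the slides' simple scoring scheme (match +1, mismatch -2, gap -2) and a trace-back that prefers the diagonal move, then a gap in the column sequence, then a gap in the row sequence. Other tie-breaking orders can return a different alignment with the same optimal score; the function name is hypothetical.

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=-2, gap=-2):
    """Global alignment; returns (F matrix, aligned seq1, aligned seq2)."""
    n, m = len(seq1), len(seq2)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,  # match or mismatch
                          F[i - 1][j] + gap,    # gap in seq2
                          F[i][j - 1] + gap)    # gap in seq1
    # Trace back from the bottom-right corner to the top-left corner.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + s:
                a1.append(seq1[i - 1])
                a2.append(seq2[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:
            a1.append(seq1[i - 1])
            a2.append("-")
            i -= 1
        else:
            a1.append("-")
            a2.append(seq2[j - 1])
            j -= 1
    return F, "".join(reversed(a1)), "".join(reversed(a2))


F, aligned1, aligned2 = needleman_wunsch("FKHMEDPLE", "FMDTPLNE")
print("score:", F[-1][-1])
print("Seq1:", aligned1)
print("Seq2:", aligned2)
```

The overall alignment score is read from the bottom-right cell of F, and the rows of F correspond to the rows of the alignment matrix on the slides.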

Smith-Waterman Alignment
Local alignment, similar in nature to N-W
The alignment matrix takes only non-negative values (a cell that would go negative is set to 0)
Highest value in matrix corresponds to end of alignment, need not be in corner
No penalty for gaps at ends
Most rigorous method of aligning nucleotide or protein sequence domains
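The differences from N-W listed on this slide translate into three small code changes: cell values are floored at zero, the trace-back starts at the highest cell rather than the corner, and it stops when a zero is reached. The following is a minimal sketch assuming the same simple scoring scheme used for the N-W example (match +1, mismatch -2, gap -2); the function name is hypothetical.

```python
def smith_waterman(seq1, seq2, match=1, mismatch=-2, gap=-2):
    """Local alignment; returns (best score, aligned seq1, aligned seq2)."""
    n, m = len(seq1), len(seq2)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            # Scores never drop below zero: a local alignment can restart.
            F[i][j] = max(0,
                          F[i - 1][j - 1] + s,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
            if F[i][j] > best:
                best, best_pos = F[i][j], (i, j)
    # Trace back from the highest cell until a zero is reached.
    a1, a2 = [], []
    i, j = best_pos
    while i > 0 and j > 0 and F[i][j] > 0:
        s = match if seq1[i - 1] == seq2[j - 1] else mismatch
        if F[i][j] == F[i - 1][j - 1] + s:
            a1.append(seq1[i - 1])
            a2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif F[i][j] == F[i - 1][j] + gap:
            a1.append(seq1[i - 1])
            a2.append("-")
            i -= 1
        else:
            a1.append("-")
            a2.append(seq2[j - 1])
            j -= 1
    return best, "".join(reversed(a1)), "".join(reversed(a2))


print(smith_waterman("FKHMEDPLE", "FMDTPLNE"))
```

With this harsh gap penalty the best local alignment of the two example sequences is just the shared stretch PL; a substitution matrix such as BLOSUM with milder gap costs would generally find longer local alignments.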

Database Searches
Optimal pairwise alignment is produced by S-W, but it is too slow for scanning databases
Scan for likely matches before performing more rigorous alignments: FASTA, BLAST
Scan for words scoring higher than some threshold, extend alignment until score drops
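The scan-then-extend idea can be illustrated with a deliberately simplified sketch. It assumes exact word matches as seeds (real tools also accept similar words scoring above a threshold), extends each seed to the right only, without gaps, and stops once the running score falls a fixed amount below the best seen. All names, the word size and the scoring values are hypothetical choices for illustration, not the actual FASTA or BLAST procedure.

```python
from collections import defaultdict


def build_word_index(subject, k=3):
    """Map every length-k word in the subject to its start positions."""
    index = defaultdict(list)
    for pos in range(len(subject) - k + 1):
        index[subject[pos:pos + k]].append(pos)
    return index


def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where a k-letter word matches."""
    index = build_word_index(subject, k)
    seeds = []
    for qpos in range(len(query) - k + 1):
        for spos in index.get(query[qpos:qpos + k], []):
            seeds.append((qpos, spos))
    return seeds


def extend_right(query, subject, qpos, spos, k=3,
                 match=1, mismatch=-2, x_drop=3):
    """Ungapped extension to the right until the score drops by x_drop."""
    score = best = k * match
    length = best_len = k
    q_end, s_end = qpos + k, spos + k
    while q_end < len(query) and s_end < len(subject):
        score += match if query[q_end] == subject[s_end] else mismatch
        q_end += 1
        s_end += 1
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x_drop:
            break
    return best, query[qpos:qpos + best_len]


seeds = find_seeds("ACGTAC", "TTTACGTAGGTTT")
print(seeds)
print(extend_right("ACGTAC", "TTTACGTAGGTTT", *seeds[0]))
```

Only the regions around promising seeds would then be passed to a rigorous aligner such as S-W, which is what makes the search fast.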

Advanced Database Searches
When BLAST falls short:
Detecting homology between distantly related proteins
Very long (>20 kbp) genome sequences with highly conserved regions and highly variable regions
PSI-BLAST (Position-Specific Iterated):
BLAST generates a Position Specific Scoring Matrix (PSSM)
PSSM used as query to re-search database
Also, PHI-BLAST, HMMs

Multiple Sequence Alignments
Exact Approaches, e.g. N-W alignments: prohibitive for many or long sequences
Progressive Approaches, e.g. CLUSTAL
Iterative Approaches
Consistency-Based Approaches
Structure-Based Methods

Distance Between Sequences
Based on theory of molecular evolution: differences -> distances
Simplest method, Hamming distance: $d = 100 \times p$
Multiple substitutions at single site? Poisson correction: $d = -\ln(1 - p)$
Assume:
Probability of observing a change is small, but constant across all sites
Rate of mutation is constant over time
Mutations at different sites occur independently
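Both distances on this slide follow from one number, p, taken here to be the fraction of aligned sites at which two sequences differ. The sketch below assumes the sequences are already aligned to equal length and contain no gaps; the function names and toy sequences are hypothetical.

```python
import math


def p_distance(seq1, seq2):
    """Fraction of aligned sites at which the two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


def hamming_percent(p):
    """Hamming distance expressed as percent difference, d = 100 * p."""
    return 100 * p


def poisson_distance(p):
    """Poisson-corrected distance, d = -ln(1 - p)."""
    return -math.log(1 - p)


p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
print(p, hamming_percent(p), poisson_distance(p))
```

The Poisson correction is always at least as large as p, and the gap widens as sequences diverge, reflecting sites that have changed more than once.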

James Watson, Francis Crick and Rosalind Franklin