Biology Tutorial Aarti Balasubramani Anusha Bharadwaj Massa Shoura Stefan Giovan
Viruses A T4 bacteriophage injecting DNA into a cell. Influenza A virus Electron micrograph of HIV. Cone-shaped cores are sectioned in various orientations. Viral genomic RNA is located in the electron-dense wide end of core. http://stc/istc.nsf/va_webpages/influenzaengprint http://pathmicro.med.sc.edu
Life Begins with Cells
All cells are Prokaryotic or Eukaryotic http://course1.winona.edu/
Eukaryotic Cell Endothelial cells under the microscope. Nuclei are stained blue with DAPI, microtubules are marked green by an antibody bound to FITC and actin filaments are labeled red with phalloidin bound to TRITC. Bovine pulmonary artery endothelial cells
Cell Organelles Nucleus= contains the genetic material Mitochondrion= produces energy
http://microbewiki.kenyon.edu/ Golgi complex=protein distribution Endoplasmic Reticulum and Ribosomes=protein factory Lysosome=degradation
Plasma Membrane
DNA Replication Base Pairing A=T C≡G http://www.youtube.com/watch?v=tev62zrm2p0&feature=related
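The pairing rule on this slide (A with T, C with G) is enough to derive one strand from the other. A minimal sketch, assuming an uppercase DNA string with no ambiguity codes; the function names and the example sequence are illustrative, not from the slides:

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand read 5'->3' (the two strands are antiparallel)."""
    return complement(strand)[::-1]

template = "ATGCGT"  # hypothetical example sequence
print(complement(template))          # TACGCA
print(reverse_complement(template))  # ACGCAT
```

Applying `reverse_complement` twice returns the original strand, which is a quick sanity check on the pairing table.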
Life Cycle of a Cell Cell division RNA and protein synthesis RNA and protein synthesis Resting cells DNA Replication
The Central Dogma of Biology Replication
Transcription http://www.youtube.com/watch?v=ztpkv7wc3yu
Translation http://www.youtube.com/watch?v=-zb6r1mmtkc
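The transcription and translation steps of the central dogma can be sketched in a few lines. This assumes the standard genetic code, a coding-strand DNA input, and translation from the first AUG; `transcribe`, `translate`, and the example sequence are illustrative names, not part of the tutorial:

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def transcribe(coding_dna: str) -> str:
    """mRNA carries the coding-strand sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon ('*') or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

dna = "ATGGCCAAATGA"  # hypothetical coding sequence
mrna = transcribe(dna)
print(mrna)             # AUGGCCAAAUGA
print(translate(mrna))  # MAK
```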
Cellular Biology Outline Organelle Structure/Function Central Dogma Biochemistry Energy Storage/Utilization Macromolecules Bioinformatics Sequences and Databases Alignments, Tree Building, Modeling
Cells are Composed of a Molecular Hierarchy Small molecules Macromolecules Supramolecular complexes
BONDS, JUST BONDS Covalent nuclei share common electrons STRONG!! Non-Covalent No common electrons WEAK!! Ionic Non-Ionic http://publications.nigms.nih.gov/chemhealth/images/ch1_bonds.gif
Macromolecular Structures are Stabilized by Weak Forces
Force                                     Strength, kJ mol^-1   Distance Dependence   Effective Range, nm
Van der Waals interactions                0.4-4                 r^-6                  0.2
Hydrogen bonds                            4-48                  r^-3                  0.3
Electrostatic interactions (unscreened)   20-50                 r^-1                  5-50
Hydrophobic interactions                  <40                   ?                     ?
Hydrophobic Interactions Structures formed by amphipathic molecules in H2O Vibrational frequencies of O-H bond of H2O in ice, liquid H2O and CCl4 van Holde, Johnson & Ho Principles of Physical Biochemistry Prentice Hall, Upper Saddle River, NJ (1998)
What Is DNA Made of? 5′ 3′
DNA The Double Helix
Levels of Chromatin Packing
The Human Genome
DNA to Amino Acids
Amino Acids Proteins Building Blocks
The Making of a Polypeptide Chain
The Four Levels of Protein Structure
Primary: Linear arrangement of monomeric units
Secondary: Local regular structure
Tertiary: 3-dimensional folding of molecule
Quaternary: Spatial arrangement of multiple subunits
Single Nucleotide Mutations
DNA Mutations
Experimental Techniques
Restriction Digestion
Use of Restriction Digestion to Identify Mutations (a) Wild-type and mutant DNA sequences
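A point mutation inside a recognition site removes a cut, so the fragment pattern changes. The sketch below assumes a linear DNA molecule and the EcoRI site GAATTC (cut after the first G); the sequences and the `digest` helper are hypothetical and stand in for the wild-type and mutant sequences shown in the figure:

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a linear DNA at every site.

    cut_offset is the position within the site where the strand is cut
    (1 for EcoRI, which cuts between G and AATTC).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]

wild_type = "AAAAGAATTCTTTTTT"  # hypothetical: contains one EcoRI site
mutant    = "AAAAGAGTTCTTTTTT"  # hypothetical: A->G change destroys the site

print(digest(wild_type))  # [5, 11]
print(digest(mutant))     # [16]
```

On a gel, the wild type would then show two bands and the mutant a single, longer one.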
Gel Electrophoresis
Gel Electrophoresis-Visualizing DNA
The Polymerase Chain Reaction (PCR)
Cloning a human gene in a bacterial plasmid
Phenotype Tree Building How Related are Organisms? What do they eat? Where do they live? How do they divide? Move? Etc. Qualitative http://nai.arc.nasa.gov/seminars/68_rivera/tree.jpg
Genotype Tree Building How Related are Organisms? How similar is their genome? Proteome? MOLECULAR EVOLUTION Quantitative http://nai.arc.nasa.gov/seminars/68_rivera/tree.jpg
Comparison of Genomes
1977: Φ-X174 genome sequenced. Only about 5.4 kbp
1997: E. coli K-12 genome sequenced. About 4.6×10^3 kbp
2007: Watson's genome sequenced! About 3×10^6 kbp!
About 0.1% difference between human genomes and 1% difference between humans and chimps!
Bioinformatics is Highly Interdisciplinary Proteomics and Genomics Structural and Computational Biology Systems Biology Computer Science, Probabilistic Modeling Computational Sequence Analysis What's in a sequence?
Power of Prediction
Can we predict structural and functional properties of a protein given its sequence?
predict the consequences of a mutation?
design proteins or drugs with specific functions?
Everything we need to know is at our fingertips; we just need a better understanding of the natural world
Protein Structure Figure labels: F, U, TS; $\Delta G = \Delta H - T\Delta S$ http://www.news.cornell.edu/stories/aug06/protein_folding.jpg
Secondary Structure Prediction 2° structures form beneficial H-bonds (lower E) α-helices, β-sheets Dihedral angles (φ, ψ) Source: Wikipedia
Tertiary Structure Prediction Homology/Comparative Modeling BEST Structure of very related protein is known Fold Recognition/Threading OFTEN IS ENOUGH Similar folds available but no close relative Knowledge Based or A Priori Predictions ONLY POSSIBLE FOR VERY SHORT PROTEINS Fold prediction but without experimental quality
Sequence Alignments
FASTA Text Format
>header my sequence
THISISMYSEQ
>header my thesis
THESISTHYSTING
Alignment
T H I S I S M Y S E Q
T H E S I S T H Y S T I N G
What can we learn from this?
Alignments
Pairwise: Dot Plot, Global (N-W) or Local (S-W)
Simple Database Searches: FASTA/BLAST
Multiple Alignments: CLUSTAL
Advanced Strategies: PSI/PHI-BLAST, HMMs
[Figure: dot plot of two subunits in human hemoglobin, alpha chain vs. beta chain]
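Reading FASTA text takes only a few lines of code. The parser below is a minimal sketch, assuming each record is a `>` header line followed by one or more sequence lines; `parse_fasta` is an illustrative name, and the input reuses the two toy records from this slide:

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each header (without '>') to its concatenated sequence."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequences may span several lines
    return records

example = """>header my sequence
THISISMYSEQ
>header my thesis
THESISTHYSTING
"""

for name, seq in parse_fasta(example).items():
    print(name, len(seq), seq)
```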
Databases Nucleotide Sequence Database Collaboration DDBJ, EMBL, GenBank at NCBI Amino Acid Databases UniProt, SWISS-PROT, TrEMBL Structural PDB, MMDB, MSD Very Many Derivations! http://www.ncbi.nlm.nih.gov/database/
Scoring Matrices
PAM Matrix: Point Accepted Mutation
PAM1 estimates the substitution rate if 1% of amino acids had changed. Standards: PAM30 and PAM60
BLOSUM: BLOcks of Amino Acid SUbstitution Matrix
BLOSUM80 blocks together sequences with greater than 80% similarity.
Less Divergent: PAM1, BLOSUM80
More Divergent: PAM250, BLOSUM45
FASTA and BLAST
FASTA - FAST All, Rapid AA or NT Alignments
BLAST - Basic Local Alignment Search Tool
Scoring Alignments
Raw and Bit Scores: $S' = \dfrac{\lambda S - \ln K}{\ln 2}$
Significance of Local Alignment: $E = mn\,2^{-S'}$
Significance of Global Alignment: $Z = \dfrac{x - \mu}{\sigma}$
Nucleotide Sequence Distances
Jukes-Cantor, single parameter: $d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$
Kimura, 2 parameter: $d = \frac{1}{2}\ln\frac{1}{1 - 2p - q} + \frac{1}{4}\ln\frac{1}{1 - 2q}$
[Figure: substitution diagrams among A, G, C, T for each model]
Distance Based Tree Building
Tree Building => UPGMA
Smallest distance element -> nearest neighbors: $t_1 = t_2 = 0.5\,d_{12}$
      1     2     3     4     5
1     -
2     0.1   -
3     0.8   0.8   -
4     0.8   1     0.3   -
5     0.9   0.9   0.3   0.2   -
Tree so far: 1 and 2 joined, branch lengths 0.05 and 0.05
Distance Based Tree Building
Tree Building => UPGMA
Smallest distance element -> nearest neighbors: $t_4 = t_5 = 0.5\,d_{45}$
          6     3     4     5
6 (1,2)   -
3         0.8   -
4         0.9   0.3   -
5         0.9   0.3   0.2   -
Tree so far: node 6 joins 1 and 2; 4 and 5 joined, branch lengths 0.10 and 0.10
Distance Based Tree Building
Tree Building => UPGMA
Smallest distance element -> nearest neighbors: $t_3 = 0.5\,d_{37}$
          6     3     7
6 (1,2)   -
3         0.8   -
7 (4,5)   0.9   0.3   -
Tree so far: node 6 joins 1 and 2; node 7 joins 4 and 5; 3 joins node 7 at height 0.15
Distance Based Tree Building
Tree Building => UPGMA
Smallest distance element -> nearest neighbors: $t_6 = 0.5\,d_{68}$
            6      8
6 (1,2)     -
8 (3,4,5)   0.85   -
Final tree: root node 9 at height 0.425 joins node 6 (1, 2) and node 8 (3, 4, 5); node 7 joins 4 and 5
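The score conversions on this slide can be tried numerically. The sketch assumes the usual Karlin-Altschul forms, S' = (λS − ln K)/ln 2 and E = m·n·2^(−S'), plus a plain Z-score (x − μ)/σ. The λ and K values, query length, and database size are hypothetical placeholders, since real values depend on the scoring matrix and gap penalties:

```python
import math

def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalize a raw score S into bits: S' = (lam*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def e_value(bits: float, m: int, n: int) -> float:
    """Expected number of chance hits: E = m * n * 2**(-S')."""
    return m * n * 2.0 ** (-bits)

def z_score(x: float, mean: float, sd: float) -> float:
    """Standard deviations between a score and the mean of shuffled-sequence scores."""
    return (x - mean) / sd

# Hypothetical parameters, for illustration only.
lam, k = 0.267, 0.041
m, n = 250, 1_000_000_000  # query length, database length

bits = bit_score(100, lam, k)
print(f"bit score = {bits:.1f}, E = {e_value(bits, m, n):.2e}")
```

Because the bit score absorbs λ and K, E-values computed from bit scores are comparable across scoring systems.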
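Both corrections are easy to evaluate once the differences between two aligned sequences are counted. The sketch assumes the standard forms: Jukes-Cantor takes the total fraction of differing sites, and Kimura takes p as the fraction of transitions (A↔G, C↔T) and q as the fraction of transversions. The sequences and function names are illustrative:

```python
import math

PURINES = {"A", "G"}

def substitution_fractions(a: str, b: str) -> tuple[float, float]:
    """Return (transition fraction, transversion fraction) for aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (x in PURINES) == (y in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions / len(a), transversions / len(a)

def jukes_cantor(p: float) -> float:
    """d = -(3/4) ln(1 - 4p/3), p = fraction of differing sites."""
    return -0.75 * math.log(1 - 4 * p / 3)

def kimura_2p(p: float, q: float) -> float:
    """d = (1/2) ln(1/(1-2p-q)) + (1/4) ln(1/(1-2q))."""
    return 0.5 * math.log(1 / (1 - 2 * p - q)) + 0.25 * math.log(1 / (1 - 2 * q))

seq_a = "AGCTAGCTAG"  # hypothetical aligned sequences
seq_b = "AGTTAGCAAG"
p, q = substitution_fractions(seq_a, seq_b)
print(p, q)                   # 0.1 0.1
print(jukes_cantor(p + q))    # larger than the observed 0.2
print(kimura_2p(p, q))
```

Both corrected distances exceed the observed fraction of differences, because they account for multiple substitutions at the same site.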
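The four clustering steps above can be reproduced with a short script. This is a sketch rather than a reference implementation: the function and the `size_weighted` flag are my own names, and clusters are represented as nested tuples. One point worth checking by hand: the slides report 0.85 for the last distance, which is the plain mean of 0.8 and 0.9, whereas averaging weighted by cluster size (as UPGMA is usually defined) gives about 0.867. The flag switches between the two updates.

```python
from itertools import combinations

def average_linkage(dist, labels, size_weighted=True):
    """Repeatedly merge the closest pair; return [(a, b, height), ...].

    dist maps frozenset({x, y}) -> distance; height = 0.5 * d(a, b).
    """
    d = dict(dist)
    sizes = {x: 1 for x in labels}
    active = list(labels)
    merges = []
    while len(active) > 1:
        a, b = min(combinations(active, 2), key=lambda pr: d[frozenset(pr)])
        new = (a, b)  # merged cluster, stored as a nested tuple
        merges.append((a, b, 0.5 * d[frozenset((a, b))]))
        wa, wb = (sizes[a], sizes[b]) if size_weighted else (1, 1)
        for c in active:
            if c == a or c == b:
                continue
            d[frozenset((new, c))] = (
                wa * d[frozenset((a, c))] + wb * d[frozenset((b, c))]
            ) / (wa + wb)
        sizes[new] = sizes[a] + sizes[b]
        active = [c for c in active if c != a and c != b] + [new]
    return merges

# Lower-triangular distance matrix from the first UPGMA slide.
rows = {2: [0.1], 3: [0.8, 0.8], 4: [0.8, 1.0, 0.3], 5: [0.9, 0.9, 0.3, 0.2]}
dist = {frozenset((i, j + 1)): v for i, vals in rows.items() for j, v in enumerate(vals)}

for weighted in (False, True):
    heights = [round(h, 3) for _, _, h in average_linkage(dist, [1, 2, 3, 4, 5], weighted)]
    print("size-weighted" if weighted else "simple mean", heights)
```

With the simple mean the heights are 0.05, 0.10, 0.15, 0.425, matching the slides; with size weighting only the root height changes.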
Distance Based Tree Building UPGMA is efficient but makes non-biological assumption that rate of substitution is constant for all branches Useful in a variety of applications such as microarray data processing Neighbor-Joining does not make this assumption and is still efficient More accurate for use in phylogenetic analyses Also -> Maximum Parsimony, Maximum Likelihood, Minimum Evolution, and Bayesian methods
Energy Calculations
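Molecular Mechanics
$E = K + V$, with $V = \sum_i \left( V_{i,\text{bonding}} + V_{i,\text{nonbond}} \right)$
E : Total energy
K : Kinetic energy
V : Potential energy, sum of covalent and noncovalent interactions
$K = \frac{1}{2}\sum_i m_i v_i^2 = \frac{1}{2}\sum_i \frac{p_i^2}{m_i}$
v_i : Velocity of particle i
p_i : Momentum of particle i
$F_i = -\nabla_i V = -\left( \frac{\partial V}{\partial x_i}, \frac{\partial V}{\partial y_i}, \frac{\partial V}{\partial z_i} \right)$
F_i : Force acting on particle i (negative gradient of potential energy)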
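The relations on this slide can be checked numerically on a toy system. The sketch assumes the usual definitions K = ½ Σ mᵢvᵢ² and Fᵢ = −∇ᵢV. The single harmonic bond, its constants, and all function names are hypothetical stand-ins for a real force field:

```python
def kinetic_energy(masses, velocities):
    """K = 1/2 * sum_i m_i * |v_i|^2."""
    return 0.5 * sum(m * sum(c * c for c in v) for m, v in zip(masses, velocities))

def harmonic_bond(positions, k=100.0, r0=1.0):
    """Hypothetical potential: one harmonic bond between particles 0 and 1."""
    r = sum((a - b) ** 2 for a, b in zip(positions[0], positions[1])) ** 0.5
    return 0.5 * k * (r - r0) ** 2

def force(potential, positions, i, h=1e-6):
    """F_i = -grad_i V, estimated with central finite differences."""
    f = []
    for axis in range(3):
        plus = [list(p) for p in positions]
        minus = [list(p) for p in positions]
        plus[i][axis] += h
        minus[i][axis] -= h
        f.append(-(potential(plus) - potential(minus)) / (2 * h))
    return f

positions = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0]]  # bond stretched by 0.2
velocities = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
masses = [1.0, 2.0]

total = kinetic_energy(masses, velocities) + harmonic_bond(positions)
print("E = K + V =", total)
print("force on particle 1:", force(harmonic_bond, positions, 1))
```

The force on particle 1 points back toward particle 0 (negative x), as expected for a stretched bond.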
Fold It!! FOLD IT http://fold.it/portal/info/science
Pairwise Alignment
Dot Plot: Visual and Qualitative
Needleman-Wunsch: Global Alignment, Alignment over entire sequence
Smith-Waterman: Local Alignment, Alignment over subsequences
[Figure: dot plot of two subunits in human hemoglobin, alpha chain vs. beta chain] http://lectures.molgen.mpg.de/pairwise/dotplots/
N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Scoring Matrix, S
     F    M    D    T    P    L    N    E
F    1
K
H
M         1
E                                       1
D              1
P                        1
L                             1
E                                       1
Simple scoring matrix for these sequences
Matches get a score of +1
Mismatches (blank) get a score of -2
One could also use BLOSUM or PAM scoring matrix for example
N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Alignment Matrix, F
          F    M    D    T    P    L    N    E
     0   -2   -4   -6   -8  -10  -12  -14  -16
F   -2   +1
K   -4
H   -6
M   -8
E  -10
D  -12
P  -14
L  -16
E  -18
$F_{ij} = \max\{ F_{i-1,j-1} + S_{kl},\; F_{i-1,j} + \text{gap},\; F_{i,j-1} + \text{gap} \}$
N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Build Scoring Matrix, F
          F    M    D    T    P    L    N    E
     0   -2   -4   -6   -8  -10  -12  -14  -16
F   -2   +1   -1   -3   -5   -7   -9  -11  -13
K   -4   -1   -1
H   -6
M   -8
E  -10
D  -12
P  -14
L  -16
E  -18
$F_{ij} = \max\{ F_{i-1,j-1} + S_{kl},\; F_{i-1,j} + \text{gap},\; F_{i,j-1} + \text{gap} \}$
N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Build Scoring Matrix, F
          F    M    D    T    P    L    N    E
     0   -2   -4   -6   -8  -10  -12  -14  -16
F   -2   +1   -1   -3   -5   -7   -9  -11  -13
K   -4   -1   -1   -3   -5   -7   -9  -11  -13
H   -6   -3   -3   -3   -5   -7   -9  -11  -13
M   -8   -5   -2   -4   -5   -7   -9  -11  -13
E  -10   -7   -4   -4   -6   -7   -9  -11  -10
D  -12   -9   -6   -3   -5   -7   -9  -11  -12
P  -14  -11   -8   -5   -5   -4   -6   -8  -10
L  -16  -13  -10   -7   -7   -6   -3   -5   -7
E  -18  -15  -12   -9   -9   -8   -5   -5   -4
$F_{ij} = \max\{ F_{i-1,j-1} + S_{kl},\; F_{i-1,j} + \text{gap},\; F_{i,j-1} + \text{gap} \}$
Overall alignment score: -4 (bottom-right entry)
N-W Alignment
Produces Optimal Global Alignment Without exhaustive pairwise comparison
Trace Back to Determine Optimum Alignment
          F    M    D    T    P    L    N    E
     0   -2   -4   -6   -8  -10  -12  -14  -16
F   -2   +1   -1   -3   -5   -7   -9  -11  -13
K   -4   -1   -1   -3   -5   -7   -9  -11  -13
H   -6   -3   -3   -3   -5   -7   -9  -11  -13
M   -8   -5   -2   -4   -5   -7   -9  -11  -13
E  -10   -7   -4   -4   -6   -7   -9  -11  -10
D  -12   -9   -6   -3   -5   -7   -9  -11  -12
P  -14  -11   -8   -5   -5   -4   -6   -8  -10
L  -16  -13  -10   -7   -7   -6   -3   -5   -7
E  -18  -15  -12   -9   -9   -8   -5   -5   -4
Traceback arrows: Match or Mismatch; Gap in Sequence 1; Gap in Sequence 2
Seq1: F K H M E D - P L - E
Seq2: F - - M - D T P L N E
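A dot plot is simple enough to build by hand: mark every cell where the residue of one sequence equals the residue of the other. The sketch below prints a text version; the two short sequences are hypothetical, and `dot_plot` is an illustrative name:

```python
def dot_plot(seq_a: str, seq_b: str) -> list[str]:
    """One row per residue of seq_a; '*' where it matches the seq_b residue."""
    return ["".join("*" if a == b else "." for b in seq_b) for a in seq_a]

seq_a = "GATTACA"  # hypothetical sequences
seq_b = "GCATTAC"

print("  " + seq_b)
for residue, row in zip(seq_a, dot_plot(seq_a, seq_b)):
    print(residue, row)
```

Runs of `*` along a diagonal indicate shared subsequences; real tools usually filter the plot with a sliding window to suppress isolated matches.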
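The fill and traceback steps can be written compactly. This sketch assumes the scores used on the slides (match +1, mismatch −2, gap −2) and a traceback that prefers the diagonal move when several moves are optimal; the function name and tie-breaking rule are my own choices:

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=-2, gap=-2):
    """Global alignment; returns (F matrix, aligned seq1, aligned seq2)."""
    n, m = len(seq1), len(seq2)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap          # leading gaps in seq2
    for j in range(1, m + 1):
        F[0][j] = j * gap          # leading gaps in seq1
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)

    # Trace back from the bottom-right corner to the origin.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            a1.append(seq1[i - 1]); a2.append(seq2[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            a1.append(seq1[i - 1]); a2.append("-"); i -= 1
        else:
            a1.append("-"); a2.append(seq2[j - 1]); j -= 1
    return F, "".join(reversed(a1)), "".join(reversed(a2))

F, aligned1, aligned2 = needleman_wunsch("FKHMEDPLE", "FMDTPLNE")
print("score:", F[-1][-1])  # -4
print(aligned1)             # FKHMED-PL-E
print(aligned2)             # F--M-DTPLNE
```

Other optimal alignments with the same score may exist; a different tie-breaking rule in the traceback could return one of those instead.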
Smith-Waterman Alignment Local alignment, Similar in Nature to N-W Alignment matrix takes only non-negative values (negative entries are set to 0) Highest value in matrix corresponds to end of alignment, need not be in corner No penalty for gaps at ends Most rigorous method of aligning nucleotide or protein sequence domains
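The two changes relative to N-W (flooring every cell at zero, and starting the traceback from the highest cell) are visible in code. The sketch reuses the same illustrative scores (match +1, mismatch −2, gap −2) and the same pair of sequences as the N-W example; names and tie-breaking are my own:

```python
def smith_waterman(seq1, seq2, match=1, mismatch=-2, gap=-2):
    """Local alignment; returns (best score, aligned seq1 part, aligned seq2 part)."""
    n, m = len(seq1), len(seq2)
    H = [[0] * (m + 1) for _ in range(n + 1)]  # first row and column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest cell until a zero is reached.
    a1, a2 = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if seq1[i - 1] == seq2[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            a1.append(seq1[i - 1]); a2.append(seq2[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            a1.append(seq1[i - 1]); a2.append("-"); i -= 1
        else:
            a1.append("-"); a2.append(seq2[j - 1]); j -= 1
    return best, "".join(reversed(a1)), "".join(reversed(a2))

print(smith_waterman("FKHMEDPLE", "FMDTPLNE"))  # (2, 'PL', 'PL')
```

With these harsh mismatch and gap penalties the best local alignment is just the shared `PL`; a substitution matrix such as BLOSUM would typically extend it.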
Database Searches Optimal pairwise alignment produced by S-W, but insufficient in scanning databases Scan for likely matches before performing more rigorous alignments FASTA, BLAST Scan for words scoring higher than some threshold, extend alignment until score drops
Advanced Database Searches When BLAST falls short Detecting homology between distantly related proteins Very long (>20kbp) genome sequences with highly conserved regions and highly variable regions PSI-BLAST (Position-Specific Iterated) BLAST generates Position Specific Scoring Matrix PSSM used as query to re-search database Also, PHI-BLAST, HMMs
Multiple Sequence Alignments Exact Approaches e.g. N-W alignments Prohibitive for many or long sequences Progressive Approaches e.g. CLUSTAL Iterative Approaches Consistency-Based Approaches Structure-Based Methods
Distance Between Sequences
Based on theory of molecular evolution: differences → distances
Simplest method, Hamming distance: $d = 100 \times p$
Multiple substitutions at single site? Poisson correction: $d = -\ln(1 - p)$
Assume:
Probability of observing a change is small, but constant across all sites
Rate of mutation is constant over time
Mutations at different sites occur independently
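A short calculation shows how the Poisson correction inflates the observed difference. The sketch assumes p is the fraction of differing sites between two aligned, equal-length sequences, with d = 100·p for the percent distance and d = −ln(1 − p) for the correction; the protein fragments and function names are hypothetical:

```python
import math

def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which the two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def percent_distance(p: float) -> float:
    """Hamming distance expressed as a percentage: d = 100 * p."""
    return 100 * p

def poisson_distance(p: float) -> float:
    """Poisson-corrected distance: d = -ln(1 - p)."""
    return -math.log(1 - p)

seq_a = "MKTAYIAKQR"  # hypothetical aligned protein fragments
seq_b = "MKTAHIAKQL"

p = p_distance(seq_a, seq_b)
print(p)                    # 0.2
print(percent_distance(p))  # 20.0
print(poisson_distance(p))  # slightly above 0.2
```

For small p the correction is negligible, and it grows as p increases, reflecting sites that have changed more than once.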
James Watson, Francis Crick and Rosalind Franklin