Computational Molecular Biology
1 Computational Molecular Biology (cmgm.stanford.edu/biochem218/)
Biochemistry 218 / Medical Information Sciences 231
Douglas L. Brutlag, Lee Kozar, Jimmy Huang, Josh Silverman
2 Lecture Syllabus (cmgm.stanford.edu/biochem218/) Doug Brutlag, 1999
Date | Topic | Lecturer
Sept. 22 | Representations of Sequences and Structures | Doug Brutlag
Sept. 27 | PubMed & Full Text Journal Access | Janet Morrison
Sept. 29 | Molecular Biology Databases on the Web | Doug Brutlag
Oct. 4 | Molecular Databases II | Doug Brutlag
Oct. 6 | Pattern Matching with Consensus Sequences | Doug Brutlag
Oct. 11 | Quantitative and Probabilistic Pattern Matching | Doug Brutlag
Oct. 13 | Sequence Alignment | Doug Brutlag
Oct. 18 | Rapid Sequence Similarity Search I | Doug Brutlag
Oct. 20 | Rapid Sequence Similarity Search II | Doug Brutlag
Oct. 25 | Near-Optimal Sequence Alignments | Doug Brutlag
Oct. 27 | Multiple Sequence Alignment | Doug Brutlag
Nov. 1 | Sequence Based Phylogenies | Doug Brutlag
Nov. 3 | Sequence Blocks and Profiles | Doug Brutlag
Nov. 8 | Discrete Protein Sequence Motifs | Doug Brutlag
Nov. 10 | Protein Microenvironments | Russ Altman
Nov. 15 | Probabilistic Protein Motifs | Tom Wu
Nov. 17 | Motif Discovery Using Gibbs Sampling | Scott Schmidler
Nov. 22 | Issues in Predicting Protein Secondary Structure | Scott Schmidler
Nov. 24 | Protein Folds and Protein Structure Superposition | Amit P. Singh
Nov. 30 | Protein Ligand Docking | Amit P. Singh
3 Course Availability
- Gates B03, Monday and Wednesday 2:15-3:45 PM
- Stanford Center for Professional Development (scpd.stanford.edu/)
- Live on SITN Channel E1
- Stanford Online
- Course available 24 hours/day, 7 days/week
- Students may register in any quarter
4 Course Requirements
- Lectures
  - Theoretical background of current methods
  - Strengths and weaknesses of current approaches
  - Future directions for improvements
- Demonstrations
  - Implementations (Mac, PC, Unix, Web)
  - Illustrate homework
- Six to eight homework assignments
  - All homework submitted electronically as attachments
  - Due one week after assigned
- Final project (due November 30th)
  - Critically review an area
  - Critically analyze your own data sets
  - Propose new approach
  - Implement a new approach
5 Bioinformatics Text
6 Reviews
7 Durbin et al.
8 Gusfield
9 Baldi Bioinformatics
11 Genomics, Bioinformatics & Medicine Genomics Molecular Diagnostics Molecular Epidemiology Bioinformatics Identify Drug Targets Rational Drug Design Genetic Therapy Machine Learning Artificial Intelligence Algorithms Robotics Statistics & Probability Graph Theory Databases Information Theory
12 National Center for Biotechnology Information
13 Human Genome Resources (ncbi.nlm.nih.gov/genome/guide/)
14 Genes on Chromosome 7 ( genemap/map. /map.cgi?chr=7)
15 Central Paradigm of Molecular Biology
DNA -> RNA -> Protein -> Phenotype
Molecules: Structure, Function
Processes: Mechanism, Specificity, Regulation
20 Central Paradigm of Bioinformatics
Genetic Information -> Molecular Structure -> Biochemical Function -> Symptoms (Phenotype)
SRAAINKHIVA VSYQTVSRVVN VSTATVSRALA GVTTTVSHVIN SGVSAVSAILN GVSEMTRRDLN TAYATIHVRVE
GSQPTVSRELA MSIATITRGSN ISRETVGRILK FDISRLSHLFR LRPSRLAHLFR MTVETISRLLG TLEFHLHRLFK
21 Central Paradigm of Bioinformatics
Genetic Information -> Molecular Structure -> Biochemical Function -> Phenotype
22 Challenges Understanding Genetic Information
Genetic Information -> Molecular Structure -> Biochemical Function -> Phenotype
- Genetic information is redundant
- Structural information is redundant
- Single genes have multiple functions
- Genes are one-dimensional, but function depends on three-dimensional structure
23 Redundancy in Genomic Sequences
- DNA is double-stranded
- Genetic code
- Acceptable amino-acid replacements
- Intron-exon variation
- Strain variation
- Sequencing errors
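The first two sources of redundancy on this slide can be illustrated with a short Python sketch. It is not part of the original lecture. The function names and the deliberately partial codon table are assumptions made for illustration, and only a handful of standard codons are included.

```python
# Sketch of two kinds of redundancy: the two strands of DNA carry the same
# information, and several codons encode the same amino acid.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

# Partial codon table (illustrative subset of the standard genetic code).
PARTIAL_CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
}

def translate(dna):
    """Translate complete codons using the partial table above."""
    usable = len(dna) - len(dna) % 3
    return "".join(PARTIAL_CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

gene_a = "ATGGCTAAA"
gene_b = "ATGGCGAAG"  # differs at two third-codon positions

print(reverse_complement(gene_a))
print(translate(gene_a), translate(gene_b))
```

Both genes translate to the same peptide even though their DNA differs, which is one reason a search tool has to look at both strands and at the protein level.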
24 Multiple Representations of Sequences
28 Multiple Representations of Sequences
Sequences of Common Structure or Function
Sequence Alignments:
VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF------DLSHGS
HLTPEEKSAVTALWGKV--NVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGN
Initial Score = 63, Optimized Score = 98, Significance = 5.51
Residue Identity = 14%, Matches = 21, Mismatches = 22, Gaps = 2, Conservative Substitutions = 11
Consensus Sequences: Zinc Finger (C2H2 type) CX{2,4}CX{12}HX{3,5}H
Blocks, Profiles or Templates: Position | A R N D C Q E G H I L K M F P S T W Y V
Hidden Markov Model: states AA1-AA6, D2-D5, I1-I5
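29 Multiple Representations of Protein Structure
$$
E = \sum_{\text{all bonds}} K_b (b_i - b_o)^2
  + \sum_{\text{all angles}} K_\theta (\theta_i - \theta_o)^2
  + \sum_{\text{all torsion angles}} K_\phi \left[ 1 - \cos(n\phi_i + \delta) \right]
  + \sum_{\text{all non-bonded pairs}} \varepsilon \left\{ (r_o/r_{ij})^{12} - 2 (r_o/r_{ij})^{6} \right\}
  + \sum_{\text{all partial charge pairs}} q_i q_j / r_{ij}
$$
[Figure panels: states AA1-AA6; 82% Hydrophobic, 18% Hydrophilic; HTH?; alignment fragment FPTTKTYFPHF-DLS-----HGS / YPWTQRFFESFGDLSTPDAVMGN]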
30 Sequence Alignment ( alion/)
F--SGGNTHIYMNHVEQCKEILRREPKELCELVISGLPYKFRYLSTKE-QLK-Y
GDFIHTLGDAHIYLNHIEPLKIQLQREPRPFPKLRILRKVEKIDDFKAEDFQIEGYN
$$
\text{Score} = \sum_{\text{Region Start}}^{\text{Region End}} \text{Similarity-weights} \; - \sum_{\text{Region Start}}^{\text{Region End}} \text{Penalties}
$$
where Penalty = Gap-penalty + Size-of-gap x Gap-size-penalty
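The scoring rule on this slide (summed similarity weights minus one penalty per gap, where each penalty grows with gap size) can be sketched in Python as follows. This is a minimal illustration, not the course software. The function names, the toy identity-based similarity function and the penalty values are assumptions; a real search would use a substitution matrix such as PAM.

```python
from itertools import groupby

def column_kind(x, y):
    """Classify one alignment column."""
    if x == "-" and y == "-":
        raise ValueError("a column cannot be a gap in both sequences")
    if x == "-":
        return "gap_in_first"
    if y == "-":
        return "gap_in_second"
    return "aligned"

def alignment_score(first, second, similarity, gap_penalty, gap_size_penalty):
    """Score = sum of similarity weights - sum of gap penalties,
    where each gap costs gap_penalty + size_of_gap * gap_size_penalty."""
    if len(first) != len(second):
        raise ValueError("aligned sequences must have equal length")
    score = 0.0
    columns = zip(first, second)
    # Consecutive columns of the same kind form one run; a run of gap
    # columns on the same side is one gap.
    for kind, run in groupby(columns, key=lambda col: column_kind(*col)):
        run = list(run)
        if kind == "aligned":
            score += sum(similarity(x, y) for x, y in run)
        else:
            score -= gap_penalty + len(run) * gap_size_penalty
    return score

def identity(x, y):
    """Toy similarity weight: +1 for identical residues, -1 otherwise."""
    return 1.0 if x == y else -1.0

print(alignment_score("ACG--T", "ACGTTT", identity, gap_penalty=5, gap_size_penalty=0.5))
```

Charging a fixed cost per gap plus a smaller cost per gapped position favors a few long gaps over many short ones.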
31 Smith-Waterman Similarity Search
Query: HU-NS1; Maximal Score: 452; PAM Matrix: 200; Gap Penalty: 5; Gap Extension: 0.5
Columns: No., Score, Match, Length, DB ID, Description, Pred. No.
DB ID | Description | Pred. No.
DBHB_ECOLI | DNA-BINDING PROTEIN H | 8.74e
DBHB_SALTY | DNA-BINDING PROTEIN H | 1.54e
DBHA_ECOLI | DNA-BINDING PROTEIN H | 1.64e
DBHA_SALTY | DNA-BINDING PROTEIN H | 1.64e
DBH_BACST | DNA-BINDING PROTEIN I | 1.35e
DBH_BACSU | DNA-BINDING PROTEIN I | 1.35e
DBH_VIBPR | DNA-BINDING PROTEIN H | 2.35e
DBH_PSEAE | DNA-BINDING PROTEIN H | 2.14e
DBH1_RHILE | DNA-BINDING PROTEIN H | 1.47e
DBH_CLOPA | DNA-BINDING PROTEIN H | 2.52e
DBH_RHIME | DNA-BINDING PROTEIN H | 3.18e
DBH5_RHILE | DNA-BINDING PROTEIN H | 9.29e
DBH_ANASP | DNA-BINDING PROTEIN H | 3.32e
DBH_CRYPH | DNA-BINDING PROTEIN H | 2.70e
DBH_THETH | DNA-BINDING PROTEIN I | 1.07e
IHFA_SERMA | INTEGRATION HOST FACT | 4.46e
IHFA_RHOCA | INTEGRATION HOST FACT | 3.52e
IHFA_SALTY | INTEGRATION HOST FACT | 5.90e
IHFA_ECOLI | INTEGRATION HOST FACT | 9.87e
IHFB_ECOLI | INTEGRATION HOST FACT | 7.71e
IHFB_SERMA | INTEGRATION HOST FACT | 7.71e
TF1_BPSP1 | TRANSCRIPTION FACTOR | 3.42e
DBH_THEAC | DNA-BINDING PROTEIN H | 2.12e
GLGA_ECOLI | GLYCOGEN SYNTHASE (EC | 3.80e-01
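A ranked hit list like the one on this slide comes from scoring the query against every database entry with local alignment. The Python sketch below shows the basic Smith-Waterman recurrence under simplifying assumptions that differ from the slide: it uses a flat match/mismatch score instead of a PAM matrix and a linear gap cost instead of separate gap and gap-extension penalties. The function names and the toy database are hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=1):
    """Best local alignment score of a and b (linear gap cost, simplified)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                0,                 # start a new local alignment here
                prev[j - 1] + s,   # align a[i-1] with b[j-1]
                prev[j] - gap,     # gap in b
                cur[j - 1] - gap,  # gap in a
            )
            best = max(best, cur[j])
        prev = cur
    return best

def rank_database(query, database):
    """Return (score, entry_id) pairs sorted from best to worst score."""
    scored = [(smith_waterman_score(query, seq), name) for name, seq in database.items()]
    return sorted(scored, reverse=True)

# Hypothetical toy database, for illustration only.
toy_db = {"entry_1": "GGG", "entry_2": "CCCC", "entry_3": "AAAGGGTTT"}
for score, name in rank_database("AAAGGGTTT", toy_db):
    print(name, score)
```

Resetting negative cells to zero is what makes the alignment local: a poorly matching region never drags down a well-matching region elsewhere in the sequence.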
32 Decypher Similarity Search ( decypher.stanford.edu/)
33 Prosite Consensus Patterns (expasy.ch/prosite/)
Active site of trypsin-like serine proteases: G D S G G
Zinc Finger (C2H2 type): C.{2,4}C.{12}H.{3,5}H
N-Glycosylation Site: N[^P][ST][^P]
Homeobox Domain Signature: [LIVMF].{5}[LIVM].{4}[IV][RKQ].W.{8}[RK]
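The patterns on this slide are written in regular-expression form, so they can be tried directly with Python's standard `re` module. The sketch below is an illustration only: the dictionary keys, the `scan` function and the test sequence are made up for this example, and the lookahead wrapper is one possible way to report overlapping matches.

```python
import re

# Patterns copied from the slide; the dictionary keys are illustrative names.
CONSENSUS_PATTERNS = {
    "serine_protease_active_site": r"GDSGG",
    "zinc_finger_c2h2": r"C.{2,4}C.{12}H.{3,5}H",
    "n_glycosylation_site": r"N[^P][ST][^P]",
    "homeobox_signature": r"[LIVMF].{5}[LIVM].{4}[IV][RKQ].W.{8}[RK]",
}

def scan(sequence, patterns=CONSENSUS_PATTERNS):
    """Return (pattern_name, start_index, matched_text) for every match.

    Each pattern is wrapped in a lookahead so that overlapping
    occurrences are also reported.
    """
    hits = []
    for name, pattern in patterns.items():
        for m in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, m.start(), m.group(1)))
    return hits

# Made-up sequence containing one glycosylation site and one GDSGG site.
for hit in scan("MANVSAGDSGGP"):
    print(hit)
```

A consensus pattern either matches or it does not, which is what makes it the deterministic end of the representations discussed in this lecture.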
34 The Optimal Way to Develop Patterns ( ch/images/cartoon/ /images/cartoon/prosite.gif)
35 EMOTIF Pattern Discovery
36 Identifying Protein Functions ( emotif-search)
37 Identifying Protein Function
38 Mapping Sequence Motifs to Structural Motifs
39 Motifs as Potential Drug Targets
HIV Reverse Transcriptase: o..y[vlim]dd[vli]oo.ii
40 ematrix: Position-Specific Scoring Matrices
Structural or functional motif
Examples of motif:
HSGEQLAETLGMSRAAINKHIQ
VTLYDVAEYAGVSYQTVSRVVN
AMIKDVALKAKVSTATVSRALM
ATIKDVAKRAGVSTTTVSHVIN
ITIYDLAELSGVSASAVSAILN
LHLKDAAALLGVSEMTIRRDLN
TAYAELAKQFGVSPGTIHVRVE
GSLTEAAHLLGTSQPTVSRELA
MSQRELKNELGAGIATITRGSN
ITRQEIGQIVGCSRETVGRILK
FDIASVAQHVCLSPSRLSHLFR
LRIDEVARHVCLSPSRLAHLFR
MTRGDIGNYLGLTVETISRLLG
VTLEALADQVGMSPFHLHRLFK
Position | A R N D C Q E G H I L K M F P S T W Y V
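A position-specific scoring matrix can be derived from aligned motif examples like those on this slide. The Python sketch below is one common construction and not necessarily how ematrix computes its matrices: it assumes a uniform background frequency, add-one pseudocounts and log2 odds scores. All function names are hypothetical.

```python
import math

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # column order shown on the slide

# Motif examples copied from the slide (14 aligned 22-residue segments).
MOTIF_EXAMPLES = [
    "HSGEQLAETLGMSRAAINKHIQ", "VTLYDVAEYAGVSYQTVSRVVN",
    "AMIKDVALKAKVSTATVSRALM", "ATIKDVAKRAGVSTTTVSHVIN",
    "ITIYDLAELSGVSASAVSAILN", "LHLKDAAALLGVSEMTIRRDLN",
    "TAYAELAKQFGVSPGTIHVRVE", "GSLTEAAHLLGTSQPTVSRELA",
    "MSQRELKNELGAGIATITRGSN", "ITRQEIGQIVGCSRETVGRILK",
    "FDIASVAQHVCLSPSRLSHLFR", "LRIDEVARHVCLSPSRLAHLFR",
    "MTRGDIGNYLGLTVETISRLLG", "VTLEALADQVGMSPFHLHRLFK",
]

def build_pssm(examples, pseudocount=1.0):
    """One dict per motif position, mapping amino acid -> log2 odds score."""
    length = len(examples[0])
    if any(len(s) != length for s in examples):
        raise ValueError("all motif examples must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    total = len(examples) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        column = [s[pos] for s in examples]
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm

def score_window(pssm, window):
    """Sum the per-position scores of a window as long as the motif."""
    return sum(column[aa] for column, aa in zip(pssm, window))

def best_hit(pssm, sequence):
    """Return (score, start) of the best window; sequence must be >= motif length."""
    width = len(pssm)
    return max(
        (score_window(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )

pssm = build_pssm(MOTIF_EXAMPLES)
print(best_hit(pssm, "WWWWW" + MOTIF_EXAMPLES[1] + "WWW"))
```

Unlike a consensus pattern, the matrix gives every candidate window a graded score, so near-misses can still be ranked.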
41 ematrix-search exon/ematrix/ematrix-search.html
42 ematrix Search Results
43 Block Signatures for a Protein Family (fhcrc.org/)
INKHIQ VSRVVN ASRALM VSHVIN VSAILN IRRDLN THVRVE GSSELA MTRGSN VGRILK LSHLFR LAHLFR ISRLLG LHRLFK
HSGEQLAETLGMSRAAINKHIQ VTLYDVAEYAGVSYQTVSRVVN AMIKDVALKAKVSTATVSRALM ATIKDVAKRAGVSTTTVSHVIN ITIYDLAELSGVSASAVSAILN LHLKDAAALLGVSEMTIRRDLN TAYAELAKQFGVSPGTIHVRVE GSLTEAAHLLGTSQPTVSRELA MSQRELKNELGAGIATITRGSN ITRQEIGQIVGCSRETVGRILK FDIASVAQHVCLSPSRLSHLFR LRIDEVARHVCLSPSRLAHLFR MTRGDIGNYLGLTVETISRLLG VTLEALADQVGMSPFHLHRLFK
SRAAINKHIVA VSYQTVSRVVN VSTATVSRALA GVTTTVSHVIN SGVSAVSAILN GVSEMTRRDLN TAYATIHVRVE GSQPTVSRELA MSIATITRGSN ISRETVGRILK FDISRLSHLFR LRPSRLAHLFR MTVETISRLLG TLEFHLHRLFK
44 Hidden Markov Models (after Haussler)
Match states: AA1-AA6
Delete states: D2-D5
Insert states: I1-I5
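46 Protein Identification
BLAST: Yes -> Homolog; No -> Smith & Waterman
Smith & Waterman: Yes -> Homolog; No -> MOTIFS
MOTIFS: Yes -> Motif; No -> HMMs
HMMs: Yes -> Superfamily; No -> PROFILES
PROFILES: Yes -> Domain; No -> SCOP
SCOP: Yes -> Domain; No -> Molecular or Structural Biology
Molecular or Structural Biology: Maybe -> Function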
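The identification flowchart on this slide is a cascade: try the fastest method first and fall through to more sensitive ones only when it finds nothing. A minimal Python sketch of that control flow is shown below. The search callables are hypothetical stand-ins (no real BLAST, HMM or SCOP client is invoked), and the method order is read from the slide.

```python
# Method order and the conclusion drawn from a hit, as read from the slide.
STEP_ORDER = [
    ("BLAST", "Homolog"),
    ("Smith & Waterman", "Homolog"),
    ("MOTIFS", "Motif"),
    ("HMMs", "Superfamily"),
    ("PROFILES", "Domain"),
    ("SCOP", "Domain"),
]

def identify_protein(sequence, searches):
    """Run the cascade and return (method, conclusion) for the first hit.

    `searches` maps a method name to a callable(sequence) -> bool.
    The callables are hypothetical placeholders for real search tools;
    methods without a callable are skipped.
    """
    for method, conclusion in STEP_ORDER:
        search = searches.get(method)
        if search is not None and search(sequence):
            return method, conclusion
    # Nothing found computationally: fall back to the laboratory.
    return "Molecular or Structural Biology", "Function (maybe)"

# Toy stand-ins: BLAST finds nothing, the motif step looks for GDSGG.
toy_searches = {
    "BLAST": lambda seq: False,
    "MOTIFS": lambda seq: "GDSGG" in seq,
}
print(identify_protein("AAGDSGGAA", toy_searches))
print(identify_protein("AAAA", toy_searches))
```

Ordering the methods from fastest to most sensitive means the expensive searches run only for sequences the cheap ones cannot place.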
47 Sequence Representations
From deterministic to probabilistic:
- Consensus
- Alignment
- Blocks or Weight Matrices
- Templates or Profiles
- Bayesian Networks
- Hidden Markov Models
More informationEBI web resources II: Ensembl and InterPro
EBI web resources II: Ensembl and InterPro Yanbin Yin http://www.ebi.ac.uk/training/online/course/ 1 Homework 3 Go to http://www.ebi.ac.uk/interpro/training.htmland finish the second online training course
More informationBioinformatics and BLAST
Bioinformatics and BLAST Overview Recap of last time Similarity discussion Algorithms: Needleman-Wunsch Smith-Waterman BLAST Implementation issues and current research Recap from Last Time Genome consists
More informationProtein Structure & Motifs
& Motifs Biochemistry 201 Molecular Biology January 12, 2000 Doug Brutlag Introduction Proteins are more flexible than nucleic acids in structure because of both the larger number of types of residues
More informationChemogenomic: Approaches to Rational Drug Design. Jonas Skjødt Møller
Chemogenomic: Approaches to Rational Drug Design Jonas Skjødt Møller Chemogenomic Chemistry Biology Chemical biology Medical chemistry Chemical genetics Chemoinformatics Bioinformatics Chemoproteomics
More informationSequence analysis and Genomics
Sequence analysis and Genomics October 12 th November 23 rd 2 PM 5 PM Prof. Peter Stadler Dr. Katja Nowick Katja: group leader TFome and Transcriptome Evolution Bioinformatics group Paul-Flechsig-Institute
More information1. In most cases, genes code for and it is that
Name Chapter 10 Reading Guide From DNA to Protein: Gene Expression Concept 10.1 Genetics Shows That Genes Code for Proteins 1. In most cases, genes code for and it is that determine. 2. Describe what Garrod
More informationTranslation Part 2 of Protein Synthesis
Translation Part 2 of Protein Synthesis IN: How is transcription like making a jello mold? (be specific) What process does this diagram represent? A. Mutation B. Replication C.Transcription D.Translation
More informationData Mining in Bioinformatics HMM
Data Mining in Bioinformatics HMM Microarray Problem: Major Objective n Major Objective: Discover a comprehensive theory of life s organization at the molecular level 2 1 Data Mining in Bioinformatics
More informationComparative Bioinformatics Midterm II Fall 2004
Comparative Bioinformatics Midterm II Fall 2004 Objective Answer, part I: For each of the following, select the single best answer or completion of the phrase. (3 points each) 1. Deinococcus radiodurans
More information3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT
3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT.03.239 25.09.2012 SEQUENCE ANALYSIS IS IMPORTANT FOR... Prediction of function Gene finding the process of identifying the regions of genomic DNA that encode
More informationHMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM
I529: Machine Learning in Bioinformatics (Spring 2017) HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington
More informationMOLECULAR MODELING IN BIOLOGY (BIO 3356) SYLLABUS
New York City College of Technology School of Arts and Sciences Department of Biological Sciences MOLECULAR MODELING IN BIOLOGY (BIO 3356) SYLLABUS Course Information Course Title: Molecular Modeling in
More informationMETABOLIC PATHWAY PREDICTION/ALIGNMENT
COMPUTATIONAL SYSTEMIC BIOLOGY METABOLIC PATHWAY PREDICTION/ALIGNMENT Hofestaedt R*, Chen M Bioinformatics / Medical Informatics, Technische Fakultaet, Universitaet Bielefeld Postfach 10 01 31, D-33501
More informationAmino Acid Structures from Klug & Cummings. Bioinformatics (Lec 12)
Amino Acid Structures from Klug & Cummings 2/17/05 1 Amino Acid Structures from Klug & Cummings 2/17/05 2 Amino Acid Structures from Klug & Cummings 2/17/05 3 Amino Acid Structures from Klug & Cummings
More informationAdvanced Certificate in Principles in Protein Structure. You will be given a start time with your exam instructions
BIRKBECK COLLEGE (University of London) Advanced Certificate in Principles in Protein Structure MSc Structural Molecular Biology Date: Thursday, 1st September 2011 Time: 3 hours You will be given a start
More informationUpdated: 10/11/2018 Page 1 of 5
A. Academic Division: Health Sciences B. Discipline: Biology C. Course Number and Title: BIOL1230 Biology I MASTER SYLLABUS 2018-2019 D. Course Coordinator: Justin Tickhill Assistant Dean: Melinda Roepke,
More informationStudy and Implementation of Various Techniques Involved in DNA and Protein Sequence Analysis
Study and Implementation of Various Techniques Involved in DNA and Protein Sequence Analysis Kumud Joseph Kujur, Sumit Pal Singh, O.P. Vyas, Ruchir Bhatia, Varun Singh* Indian Institute of Information
More informationGenome Annotation. Qi Sun Bioinformatics Facility Cornell University
Genome Annotation Qi Sun Bioinformatics Facility Cornell University Some basic bioinformatics tools BLAST PSI-BLAST - Position-Specific Scoring Matrix HMM - Hidden Markov Model NCBI BLAST How does BLAST
More informationIntroduction to sequence alignment. Local alignment the Smith-Waterman algorithm
Lecture 2, 12/3/2003: Introduction to sequence alignment The Needleman-Wunsch algorithm for global sequence alignment: description and properties Local alignment the Smith-Waterman algorithm 1 Computational
More informationComputational Biology From The Perspective Of A Physical Scientist
Computational Biology From The Perspective Of A Physical Scientist Dr. Arthur Dong PP1@TUM 26 November 2013 Bioinformatics Education Curriculum Math, Physics, Computer Science (Statistics and Programming)
More informationSimilarity searching summary (2)
Similarity searching / sequence alignment summary Biol4230 Thurs, February 22, 2016 Bill Pearson wrp@virginia.edu 4-2818 Pinn 6-057 What have we covered? Homology excess similiarity but no excess similarity
More informationInferring Transcriptional Regulatory Networks from Gene Expression Data II
Inferring Transcriptional Regulatory Networks from Gene Expression Data II Lectures 9 Oct 26, 2011 CSE 527 Computational Biology, Fall 2011 Instructor: Su-In Lee TA: Christopher Miles Monday & Wednesday
More informationBME 5742 Biosystems Modeling and Control
BME 5742 Biosystems Modeling and Control Lecture 24 Unregulated Gene Expression Model Dr. Zvi Roth (FAU) 1 The genetic material inside a cell, encoded in its DNA, governs the response of a cell to various
More information