Sequence Analysis and Databases 2: Sequences and Multiple Alignments
|
|
- Osborn Garrison
- 5 years ago
- Views:
Transcription
1 1 Sequence Analysis and Databases 2: Sequences and Multiple Alignments Jose María González-Izarzugaza Martínez CNIO Spanish National Cancer Research Centre
2 2 Sequence Comparisons: How? - Pairwise Alignment of 2 Sequences: - Aligning a couple of sequences - Searching for homologues (BLAST) - Multiple Sequence Alignments (n>2): - Advanced Sequence Alignments: - Patterns, Profiles, HMMs - Distant Homologues with PSI-BLAST
3 3 Multiple Sequence Alignments Rationale: - They try to align more than 2 homologous sequences. - Conserved regions must be important due to selective pressure. - Better Alignments as we can focus on general rules. Algorithmic Complexity: - 2 Sequences: NxM (As seen for Pairwise Alignments) - 3 Sequences: NxMxL - 4 Sequences: NxMxLxJ - Not feasible in general for a set of homologous proteins. Examples of Heuristics to skip this algorithmic problem: - T_coffee : [ - Clustalw : [
4 4 Clustalw Algorithm The Algorithm in bare-words Sort the sequences by similarity Align the two most similar ones. Label the sequences as aligned. Repeat until there are no unlabeled seqs Computational Complexity: Now it is similar to performing N pairwise alignments, thus it is feasible for the computer to calculate them for a family of homologues.
5 5 View of a MSA using Belvu
6 6 PairWise vs MSA - If a couple of homologues have diverged more than 20% the signal between them is so low that BLAST is not able to catch it. In other words, we can NOT use BLAST to find remote homologues. - A Multiple Sequence Alignment (MSA) is able to spot important regions. Since important regions have higher selective pressure they change (evolve) less than other regions Conservation is related to Importance. -If the matches between a set of sequences occur in the conserved regions, the chances of these 2 sequences being homologues increase. So How can we use all this information to improve our knowledge? Can a MSA spot remote homologues?
7 7 Sequence Comparisons: How? - Pairwise Alignment of 2 Sequences: - Aligning a couple of sequences - Searching for homologues (BLAST) - Multiple Sequence Alignments (n>2): - Advanced Sequence Alignments: - Patterns, Profiles, HMMs - Distant Homologues with PSI-BLAST
8 8 Advanced Searches: Consensus Algorithm: For each position in the sequence, the consensus position will represent the most repeated monomer. Pros: - The most basic way of summarizing a MSA into a single line. - Easy to implement, easy to understand Cons: - Not taking into account the frequencies, but the most represented ACTGACTACGTACA ATGCGTACCATACA ATCAGTATCGTAGA ATCAGTATCGAACA ATCAGTATCGTACA Consensus Sequence
9 9 Advanced Searches: Patterns -Patterns are also called Regular Expressions by bioinformaticians - Useful when dealing with motifs - It is a complex (but powerful) language, not always easy at first glance - Ambiguity can be depicted: [A,B] : Can be both A or B {A, B} : Anything but A or B X: Any monomer -Repetitions are easily represented: A{2,4}: Can be either AA, AAA or AAAA A+: Any number of A s (but at least one) A*: Any number of A s (or even none) - MSA s are reduced to a single line - Not taking into account the frequencies AGLV AGLV AG[IL]V AGIV [AC]-x-G-x{4}-{L,I} [Ala or Cys]-any-Gly-any-any-any-any-[any but Leu or Ile]
10 10 Advanced Searches: Profiles - Profiles are Position Specific Scoring Matrices (PSSM). - Same concept as the Scoring Matrices (i.e. BLOSUM), but these ones are calculated on scratch using each position in the MSA instead of being pre-calculated. -Thus, PSSM s are 20xN matrices, being N the length of the sequence - PSSM s take into account information specific to the family of proteins, so is Inferred from the alignment Few Assumptions - We can align a sequence and a profile using Smith & Waterman Algorithm Search for homologues MSA PSSM PSSM borrowed from F.Abascal
11 11 Advanced Searches: HMM-profiles - HMM stands for Hidden Markov Model. - Originally implemented for Speech Recognition - They are much robust than the simple profiles, specially when dealing with gaps. However, they are harder to implement. -HMMer is a package similar to BLAST but using HMMs x hidden states y observable outputs a transition probabilities b output probabilities
12 12 Sequence Comparisons: How? - Pairwise Alignment of 2 Sequences: - Aligning a couple of sequences - Searching for homologues (BLAST) - Multiple Sequence Alignments (n>2): - Advanced Sequence Alignments: - Patterns, Profiles, HMMs - Distant Homologues with PSI-BLAST
13 13 Remote Homologues: PSI-Blast - PSI-BLAST Position Specific Iterated Blast - PSI-Blast is useful to retrieve remote homologues (id<20%) - Algorithm: 1) Run BLAST [Iteration #0] 2) Generate PSSM with the results better than a given threshold (e-value) 3) Run BLAST again using the PSSM as Input, [Iterations #1 to #N] 4) Update PSSM with new results 5) Repeat from 3 until convergence* *Convergence: When we can not find new results
14 14 Remote Homologues: PSI-Blast Target Sequence BLAS T DataBase PSI-BLAST PSSM Closely Related Homologues Remotely Related Homologues
15 15 Acknowledgements Federico Abascal (Original Text) Juan Carlos Sanchez Alfonso Valencia
16 16 XXX
Week 10: Homology Modelling (II) - HHpred
Week 10: Homology Modelling (II) - HHpred Course: Tools for Structural Biology Fabian Glaser BKU - Technion 1 2 Identify and align related structures by sequence methods is not an easy task All comparative
More informationTHEORY. Based on sequence Length According to the length of sequence being compared it is of following two types
Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between
More informationSequence Alignments. Dynamic programming approaches, scoring, and significance. Lucy Skrabanek ICB, WMC January 31, 2013
Sequence Alignments Dynamic programming approaches, scoring, and significance Lucy Skrabanek ICB, WMC January 31, 213 Sequence alignment Compare two (or more) sequences to: Find regions of conservation
More informationSimilarity searching summary (2)
Similarity searching / sequence alignment summary Biol4230 Thurs, February 22, 2016 Bill Pearson wrp@virginia.edu 4-2818 Pinn 6-057 What have we covered? Homology excess similiarity but no excess similarity
More informationExercise 5. Sequence Profiles & BLAST
Exercise 5 Sequence Profiles & BLAST 1 Substitution Matrix (BLOSUM62) Likelihood to substitute one amino acid with another Figure taken from https://en.wikipedia.org/wiki/blosum 2 Substitution Matrix (BLOSUM62)
More informationProtein Bioinformatics. Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet sandberg.cmb.ki.
Protein Bioinformatics Rickard Sandberg Dept. of Cell and Molecular Biology Karolinska Institutet rickard.sandberg@ki.se sandberg.cmb.ki.se Outline Protein features motifs patterns profiles signals 2 Protein
More informationChapter 5. Proteomics and the analysis of protein sequence Ⅱ
Proteomics Chapter 5. Proteomics and the analysis of protein sequence Ⅱ 1 Pairwise similarity searching (1) Figure 5.5: manual alignment One of the amino acids in the top sequence has no equivalent and
More informationMultiple sequence alignment
Multiple sequence alignment Multiple sequence alignment: today s goals to define what a multiple sequence alignment is and how it is generated; to describe profile HMMs to introduce databases of multiple
More informationSimilarity or Identity? When are molecules similar?
Similarity or Identity? When are molecules similar? Mapping Identity A -> A T -> T G -> G C -> C or Leu -> Leu Pro -> Pro Arg -> Arg Phe -> Phe etc If we map similarity using identity, how similar are
More informationEECS730: Introduction to Bioinformatics
EECS730: Introduction to Bioinformatics Lecture 07: profile Hidden Markov Model http://bibiserv.techfak.uni-bielefeld.de/sadr2/databasesearch/hmmer/profilehmm.gif Slides adapted from Dr. Shaojie Zhang
More informationCISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I)
CISC 889 Bioinformatics (Spring 2004) Sequence pairwise alignment (I) Contents Alignment algorithms Needleman-Wunsch (global alignment) Smith-Waterman (local alignment) Heuristic algorithms FASTA BLAST
More informationTemplate-Based 3D Structure Prediction
Template-Based 3D Structure Prediction Sequence and Structure-based Template Detection and Alignment Issues The rate of new sequences is growing exponentially relative to the rate of protein structures
More informationStatistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences
Statistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD Department of Computer Science University of Missouri 2008 Free for Academic
More informationSequence and Structure Alignment Z. Luthey-Schulten, UIUC Pittsburgh, 2006 VMD 1.8.5
Sequence and Structure Alignment Z. Luthey-Schulten, UIUC Pittsburgh, 2006 VMD 1.8.5 Why Look at More Than One Sequence? 1. Multiple Sequence Alignment shows patterns of conservation 2. What and how many
More informationAlignment principles and homology searching using (PSI-)BLAST. Jaap Heringa Centre for Integrative Bioinformatics VU (IBIVU)
Alignment principles and homology searching using (PSI-)BLAST Jaap Heringa Centre for Integrative Bioinformatics VU (IBIVU) http://ibivu.cs.vu.nl Bioinformatics Nothing in Biology makes sense except in
More informationSingle alignment: Substitution Matrix. 16 march 2017
Single alignment: Substitution Matrix 16 march 2017 BLOSUM Matrix BLOSUM Matrix [2] (Blocks Amino Acid Substitution Matrices ) It is based on the amino acids substitutions observed in ~2000 conserved block
More informationMotifs, Profiles and Domains. Michael Tress Protein Design Group Centro Nacional de Biotecnología, CSIC
Motifs, Profiles and Domains Michael Tress Protein Design Group Centro Nacional de Biotecnología, CSIC Comparing Two Proteins Sequence Alignment Determining the pattern of evolution and identifying conserved
More informationStatistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences
Statistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD William and Nancy Thompson Missouri Distinguished Professor Department
More informationLarge-Scale Genomic Surveys
Bioinformatics Subtopics Fold Recognition Secondary Structure Prediction Docking & Drug Design Protein Geometry Protein Flexibility Homology Modeling Sequence Alignment Structure Classification Gene Prediction
More informationBioinformatics (GLOBEX, Summer 2015) Pairwise sequence alignment
Bioinformatics (GLOBEX, Summer 2015) Pairwise sequence alignment Substitution score matrices, PAM, BLOSUM Needleman-Wunsch algorithm (Global) Smith-Waterman algorithm (Local) BLAST (local, heuristic) E-value
More information5. MULTIPLE SEQUENCE ALIGNMENT BIOINFORMATICS COURSE MTAT
5. MULTIPLE SEQUENCE ALIGNMENT BIOINFORMATICS COURSE MTAT.03.239 03.10.2012 ALIGNMENT Alignment is the task of locating equivalent regions of two or more sequences to maximize their similarity. Homology:
More informationSequence Analysis, '18 -- lecture 9. Families and superfamilies. Sequence weights. Profiles. Logos. Building a representative model for a gene.
Sequence Analysis, '18 -- lecture 9 Families and superfamilies. Sequence weights. Profiles. Logos. Building a representative model for a gene. How can I represent thousands of homolog sequences in a compact
More informationCONCEPT OF SEQUENCE COMPARISON. Natapol Pornputtapong 18 January 2018
CONCEPT OF SEQUENCE COMPARISON Natapol Pornputtapong 18 January 2018 SEQUENCE ANALYSIS - A ROSETTA STONE OF LIFE Sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of
More informationAlgorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment
Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot
More informationSequence analysis and comparison
The aim with sequence identification: Sequence analysis and comparison Marjolein Thunnissen Lund September 2012 Is there any known protein sequence that is homologous to mine? Are there any other species
More informationTools and Algorithms in Bioinformatics
Tools and Algorithms in Bioinformatics GCBA815, Fall 2015 Week-4 BLAST Algorithm Continued Multiple Sequence Alignment Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and
More informationSyllabus of BIOINF 528 (2017 Fall, Bioinformatics Program)
Syllabus of BIOINF 528 (2017 Fall, Bioinformatics Program) Course Name: Structural Bioinformatics Course Description: Instructor: This course introduces fundamental concepts and methods for structural
More informationPairwise sequence alignments. Vassilios Ioannidis (From Volker Flegel )
Pairwise sequence alignments Vassilios Ioannidis (From Volker Flegel ) Outline Introduction Definitions Biological context of pairwise alignments Computing of pairwise alignments Some programs Importance
More informationIntroduction to Bioinformatics
Introduction to Bioinformatics Jianlin Cheng, PhD Department of Computer Science Informatics Institute 2011 Topics Introduction Biological Sequence Alignment and Database Search Analysis of gene expression
More informationBioinformatics. Proteins II. - Pattern, Profile, & Structure Database Searching. Robert Latek, Ph.D. Bioinformatics, Biocomputing
Bioinformatics Proteins II. - Pattern, Profile, & Structure Database Searching Robert Latek, Ph.D. Bioinformatics, Biocomputing WIBR Bioinformatics Course, Whitehead Institute, 2002 1 Proteins I.-III.
More informationMultiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:
Multiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:50 5001 5 Multiple Sequence Alignment The first part of this exposition is based on the following sources, which are recommended reading:
More informationAlignment & BLAST. By: Hadi Mozafari KUMS
Alignment & BLAST By: Hadi Mozafari KUMS SIMILARITY - ALIGNMENT Comparison of primary DNA or protein sequences to other primary or secondary sequences Expecting that the function of the similar sequence
More informationProgramme Last week s quiz results + Summary Fold recognition Break Exercise: Modelling remote homologues
Programme 8.00-8.20 Last week s quiz results + Summary 8.20-9.00 Fold recognition 9.00-9.15 Break 9.15-11.20 Exercise: Modelling remote homologues 11.20-11.40 Summary & discussion 11.40-12.00 Quiz 1 Feedback
More information3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT
3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT.03.239 25.09.2012 SEQUENCE ANALYSIS IS IMPORTANT FOR... Prediction of function Gene finding the process of identifying the regions of genomic DNA that encode
More informationTutorial 4 Substitution matrices and PSI-BLAST
Tutorial 4 Substitution matrices and PSI-BLAST 1 Agenda Substitution Matrices PAM - Point Accepted Mutations BLOSUM - Blocks Substitution Matrix PSI-BLAST Cool story of the day: Why should we care about
More informationAn Introduction to Sequence Similarity ( Homology ) Searching
An Introduction to Sequence Similarity ( Homology ) Searching Gary D. Stormo 1 UNIT 3.1 1 Washington University, School of Medicine, St. Louis, Missouri ABSTRACT Homologous sequences usually have the same,
More informationMultiple Sequence Alignment
Multiple Sequence Alignment BMI/CS 576 www.biostat.wisc.edu/bmi576.html Colin Dewey cdewey@biostat.wisc.edu Multiple Sequence Alignment: Tas Definition Given a set of more than 2 sequences a method for
More informationProtein Structure Prediction and Display
Protein Structure Prediction and Display Goal Take primary structure (sequence) and, using rules derived from known structures, predict the secondary structure that is most likely to be adopted by each
More informationComputational Genomics and Molecular Biology, Fall
Computational Genomics and Molecular Biology, Fall 2014 1 HMM Lecture Notes Dannie Durand and Rose Hoberman November 6th Introduction In the last few lectures, we have focused on three problems related
More informationCopyright 2000 N. AYDIN. All rights reserved. 1
Introduction to Bioinformatics Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr Multiple Sequence Alignment Outline Multiple sequence alignment introduction to msa methods of msa progressive global alignment
More informationPairwise sequence alignments
Pairwise sequence alignments Volker Flegel VI, October 2003 Page 1 Outline Introduction Definitions Biological context of pairwise alignments Computing of pairwise alignments Some programs VI, October
More informationComputational Genomics and Molecular Biology, Fall
Computational Genomics and Molecular Biology, Fall 2011 1 HMM Lecture Notes Dannie Durand and Rose Hoberman October 11th 1 Hidden Markov Models In the last few lectures, we have focussed on three problems
More informationCAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools. Giri Narasimhan
CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools Giri Narasimhan ECS 254; Phone: x3748 giri@cis.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs15.html Describing & Modeling Patterns
More informationSequence Alignment Techniques and Their Uses
Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this
More informationHidden Markov Models (HMMs) and Profiles
Hidden Markov Models (HMMs) and Profiles Swiss Institute of Bioinformatics (SIB) 26-30 November 2001 Markov Chain Models A Markov Chain Model is a succession of states S i (i = 0, 1,...) connected by transitions.
More informationChristian Sigrist. November 14 Protein Bioinformatics: Sequence-Structure-Function 2018 Basel
Christian Sigrist General Definition on Conserved Regions Conserved regions in proteins can be classified into 5 different groups: Domains: specific combination of secondary structures organized into a
More informationLecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)
Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from
More informationHidden Markov Models
Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training
More informationAdministration. ndrew Torda April /04/2008 [ 1 ]
ndrew Torda April 2008 Administration 22/04/2008 [ 1 ] Sprache? zu verhandeln (Englisch, Hochdeutsch, Bayerisch) Selection of topics Proteins / DNA / RNA Two halves to course week 1-7 Prof Torda (larger
More informationHIDDEN MARKOV MODELS FOR REMOTE PROTEIN HOMOLOGY DETECTION
From THE CENTER FOR GENOMICS AND BIOINFORMATICS Karolinska Institutet, Stockholm, Sweden HIDDEN MARKOV MODELS FOR REMOTE PROTEIN HOMOLOGY DETECTION Markus Wistrand Stockholm 2005 All previously published
More informationPROTEIN FUNCTION PREDICTION WITH AMINO ACID SEQUENCE AND SECONDARY STRUCTURE ALIGNMENT SCORES
PROTEIN FUNCTION PREDICTION WITH AMINO ACID SEQUENCE AND SECONDARY STRUCTURE ALIGNMENT SCORES Eser Aygün 1, Caner Kömürlü 2, Zafer Aydin 3 and Zehra Çataltepe 1 1 Computer Engineering Department and 2
More informationSequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment
Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the log-likelihood ratio of
More informationA profile-based protein sequence alignment algorithm for a domain clustering database
A profile-based protein sequence alignment algorithm for a domain clustering database Lin Xu,2 Fa Zhang and Zhiyong Liu 3, Key Laboratory of Computer System and architecture, the Institute of Computing
More informationTools and Algorithms in Bioinformatics
Tools and Algorithms in Bioinformatics GCBA815, Fall 2013 Week3: Blast Algorithm, theory and practice Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and Systems Biology
More informationPage 1. References. Hidden Markov models and multiple sequence alignment. Markov chains. Probability review. Example. Markovian sequence
Page Hidden Markov models and multiple sequence alignment Russ B Altman BMI 4 CS 74 Some slides borrowed from Scott C Schmidler (BMI graduate student) References Bioinformatics Classic: Krogh et al (994)
More informationPractical considerations of working with sequencing data
Practical considerations of working with sequencing data File Types Fastq ->aligner -> reference(genome) coordinates Coordinate files SAM/BAM most complete, contains all of the info in fastq and more!
More informationAn Introduction to Bioinformatics Algorithms Hidden Markov Models
Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training
More informationIntroductory course on Multiple Sequence Alignment Part I: Theoretical foundations
Sequence Analysis and Structure Prediction Service Centro Nacional de Biotecnología CSIC 8-10 May, 2013 Introductory course on Multiple Sequence Alignment Part I: Theoretical foundations Course Notes Instructor:
More informationHMMs and biological sequence analysis
HMMs and biological sequence analysis Hidden Markov Model A Markov chain is a sequence of random variables X 1, X 2, X 3,... That has the property that the value of the current state depends only on the
More informationCMPS 3110: Bioinformatics. Tertiary Structure Prediction
CMPS 3110: Bioinformatics Tertiary Structure Prediction Tertiary Structure Prediction Why Should Tertiary Structure Prediction Be Possible? Molecules obey the laws of physics! Conformation space is finite
More informationCMPS 6630: Introduction to Computational Biology and Bioinformatics. Tertiary Structure Prediction
CMPS 6630: Introduction to Computational Biology and Bioinformatics Tertiary Structure Prediction Tertiary Structure Prediction Why Should Tertiary Structure Prediction Be Possible? Molecules obey the
More informationProtein Structure Prediction using String Kernels. Technical Report
Protein Structure Prediction using String Kernels Technical Report Department of Computer Science and Engineering University of Minnesota 4-192 EECS Building 200 Union Street SE Minneapolis, MN 55455-0159
More informationCh. 9 Multiple Sequence Alignment (MSA)
Ch. 9 Multiple Sequence Alignment (MSA) - gather seqs. to make MSA - doing MSA with ClustalW - doing MSA with Tcoffee - comparing seqs. that cannot align Introduction - from pairwise alignment to MSA -
More informationAlgorithms in Bioinformatics
Algorithms in Bioinformatics Sami Khuri Department of omputer Science San José State University San José, alifornia, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Pairwise Sequence Alignment Homology
More informationSequence-specific sequence comparison using pairwise statistical significance
Graduate Theses and Dissertations Graduate College 2009 Sequence-specific sequence comparison using pairwise statistical significance Ankit Agrawal Iowa State University Follow this and additional works
More informationSara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject)
Bioinformática Sequence Alignment Pairwise Sequence Alignment Universidade da Beira Interior (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) 1 16/3/29 & 23/3/29 27/4/29 Outline
More informationHidden Markov Models and Their Applications in Biological Sequence Analysis
Hidden Markov Models and Their Applications in Biological Sequence Analysis Byung-Jun Yoon Dept. of Electrical & Computer Engineering Texas A&M University, College Station, TX 77843-3128, USA Abstract
More informationGlobal alignments - review
Global alignments - review Take two sequences: X[j] and Y[j] M[i-1, j-1] ± 1 M[i, j] = max M[i, j-1] 2 M[i-1, j] 2 The best alignment for X[1 i] and Y[1 j] is called M[i, j] X[j] Initiation: M[,]= pply
More informationCombining pairwise sequence similarity and support vector machines for remote protein homology detection
Combining pairwise sequence similarity and support vector machines for remote protein homology detection Li Liao Central Research & Development E. I. du Pont de Nemours Company li.liao@usa.dupont.com William
More informationGenome Annotation. Qi Sun Bioinformatics Facility Cornell University
Genome Annotation Qi Sun Bioinformatics Facility Cornell University Some basic bioinformatics tools BLAST PSI-BLAST - Position-Specific Scoring Matrix HMM - Hidden Markov Model NCBI BLAST How does BLAST
More informationSupporting Text 1. Comparison of GRoSS sequence alignment to HMM-HMM and GPCRDB
Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications, Cvicek et al. Supporting Text 1 Here we compare the GRoSS alignment
More informationCollected Works of Charles Dickens
Collected Works of Charles Dickens A Random Dickens Quote If there were no bad people, there would be no good lawyers. Original Sentence It was a dark and stormy night; the night was dark except at sunny
More informationBioinformatics: Secondary Structure Prediction
Bioinformatics: Secondary Structure Prediction Prof. David Jones d.jones@cs.ucl.ac.uk LMLSTQNPALLKRNIIYWNNVALLWEAGSD The greatest unsolved problem in molecular biology:the Protein Folding Problem? Entries
More information09/06/25. Computergestützte Strukturbiologie (Strukturelle Bioinformatik) Non-uniform distribution of folds. Scheme of protein structure predicition
Sequence identity Structural similarity Computergestützte Strukturbiologie (Strukturelle Bioinformatik) Fold recognition Sommersemester 2009 Peter Güntert Structural similarity X Sequence identity Non-uniform
More informationIntro Protein structure Motifs Motif databases End. Last time. Probability based methods How find a good root? Reliability Reconciliation analysis
Last time Probability based methods How find a good root? Reliability Reconciliation analysis Today Intro to proteinstructure Motifs and domains First dogma of Bioinformatics Sequence structure function
More informationA multi-source domain annotation pipeline for quantitative metagenomic and metatranscriptomic functional profiling
A multi-source domain annotation pipeline for quantitative metagenomic and metatranscriptomic functional profiling Ari Ugarte, Riccardo Vicedomini, Juliana Silva Bernardes, Alessandra Carbone 9 September,
More informationOverview Multiple Sequence Alignment
Overview Multiple Sequence Alignment Inge Jonassen Bioinformatics group Dept. of Informatics, UoB Inge.Jonassen@ii.uib.no Definition/examples Use of alignments The alignment problem scoring alignments
More informationModule: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment
Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment Introduction to Bioinformatics online course : IBT Jonathan Kayondo Learning Objectives Understand
More informationSequence Alignment: A General Overview. COMP Fall 2010 Luay Nakhleh, Rice University
Sequence Alignment: A General Overview COMP 571 - Fall 2010 Luay Nakhleh, Rice University Life through Evolution All living organisms are related to each other through evolution This means: any pair of
More informationCombining pairwise sequence similarity and support vector machines for remote protein homology detection
Combining pairwise sequence similarity and support vector machines for remote protein homology detection Li Liao Central Research & Development E. I. du Pont de Nemours Company li.liao@usa.dupont.com William
More informationCSE 549: Computational Biology. Substitution Matrices
CSE 9: Computational Biology Substitution Matrices How should we score alignments So far, we ve looked at arbitrary schemes for scoring mutations. How can we assign scores in a more meaningful way? Are
More informationSEQUENCE alignment is an underlying application in the
194 IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL. 8, NO. 1, JANUARY/FEBRUARY 2011 Pairwise Statistical Significance of Local Sequence Alignment Using Sequence-Specific and Position-Specific
More informationSequence comparison: Score matrices. Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas
Sequence comparison: Score matrices Genome 559: Introduction to Statistical and omputational Genomics Prof James H Thomas Informal inductive proof of best alignment path onsider the last step in the best
More informationScoring Matrices. Shifra Ben-Dor Irit Orr
Scoring Matrices Shifra Ben-Dor Irit Orr Scoring matrices Sequence alignment and database searching programs compare sequences to each other as a series of characters. All algorithms (programs) for comparison
More informationIn-Depth Assessment of Local Sequence Alignment
2012 International Conference on Environment Science and Engieering IPCBEE vol.3 2(2012) (2012)IACSIT Press, Singapoore In-Depth Assessment of Local Sequence Alignment Atoosa Ghahremani and Mahmood A.
More informationMoreover, the circular logic
Moreover, the circular logic How do we know what is the right distance without a good alignment? And how do we construct a good alignment without knowing what substitutions were made previously? ATGCGT--GCAAGT
More informationBIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher Summer Protein Structure Prediction I
BIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher Summer 2013 9. Protein Structure Prediction I Structure Prediction Overview Overview of problem variants Secondary structure prediction
More informationSequence analysis and Genomics
Sequence analysis and Genomics October 12 th November 23 rd 2 PM 5 PM Prof. Peter Stadler Dr. Katja Nowick Katja: group leader TFome and Transcriptome Evolution Bioinformatics group Paul-Flechsig-Institute
More informationThe Pennsylvania State University. The Graduate School. College of Engineering A COMPUTATIONAL FRAMEWORK FOR INFERRING STRUCTURE, FUNCTION,
The Pennsylvania State University The Graduate School College of Engineering A COMPUTATIONAL FRAMEWORK FOR INFERRING STRUCTURE, FUNCTION, AND EVOLUTION OF PROTEINS A Dissertation in Computer Science and
More informationLecture 2, 5/12/2001: Local alignment the Smith-Waterman algorithm. Alignment scoring schemes and theory: substitution matrices and gap models
Lecture 2, 5/12/2001: Local alignment the Smith-Waterman algorithm Alignment scoring schemes and theory: substitution matrices and gap models 1 Local sequence alignments Local sequence alignments are necessary
More informationSequence Analysis '17- lecture 8. Multiple sequence alignment
Sequence Analysis '17- lecture 8 Multiple sequence alignment Ex5 explanation How many random database search scores have e-values 10? (Answer: 10!) Why? e-value of x = m*p(s x), where m is the database
More informationSequence comparison: Score matrices
Sequence comparison: Score matrices http://facultywashingtonedu/jht/gs559_2013/ Genome 559: Introduction to Statistical and omputational Genomics Prof James H Thomas FYI - informal inductive proof of best
More informationComputational Biology
Computational Biology Lecture 6 31 October 2004 1 Overview Scoring matrices (Thanks to Shannon McWeeney) BLAST algorithm Start sequence alignment 2 1 What is a homologous sequence? A homologous sequence,
More informationOrthology Part I: concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona
Orthology Part I: concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona (tgabaldon@crg.es) http://gabaldonlab.crg.es Homology the same organ in different animals under
More informationproteins Refinement by shifting secondary structure elements improves sequence alignments
proteins STRUCTURE O FUNCTION O BIOINFORMATICS Refinement by shifting secondary structure elements improves sequence alignments Jing Tong, 1,2 Jimin Pei, 3 Zbyszek Otwinowski, 1,2 and Nick V. Grishin 1,2,3
More information1-D Predictions. Prediction of local features: Secondary structure & surface exposure
1-D Predictions Prediction of local features: Secondary structure & surface exposure 1 Learning Objectives After today s session you should be able to: Explain the meaning and usage of the following local
More informationBasic Local Alignment Search Tool
Basic Local Alignment Search Tool Alignments used to uncover homologies between sequences combined with phylogenetic studies o can determine orthologous and paralogous relationships Local Alignment uses
More informationCSE182-L7. Protein Sequence Analysis Patterns (regular expressions) Profiles HMM Gene Finding CSE182
CSE182-L7 Protein Sequence Analysis Patterns (regular expressions) Profiles HMM Gene Finding 10-07 CSE182 Bell Labs Honors Pattern matching 10-07 CSE182 Just the Facts Consider the set of all substrings
More informationIntroduction to Sequence Alignment. Manpreet S. Katari
Introduction to Sequence Alignment Manpreet S. Katari 1 Outline 1. Global vs. local approaches to aligning sequences 1. Dot Plots 2. BLAST 1. Dynamic Programming 3. Hash Tables 1. BLAT 4. BWT (Burrow Wheeler
More informationEfficient Remote Homology Detection with Secondary Structure
Efficient Remote Homology Detection with Secondary Structure 2 Yuna Hou 1, Wynne Hsu 1, Mong Li Lee 1, and Christopher Bystroff 2 1 School of Computing,National University of Singapore,Singapore 117543
More information