functional annotation preliminary results
1 Functional Annotation: Preliminary Results. March 16, 2016. Alicia Francis, Andrew Teng, Chen Guo, Devika Singh, Ellie Kim, Harshmi Shah, James Moore, Jose Jaimes, Nadav Topaz, Namrata Kalsi, Petar Penev, Tannishtha Som
2 Overview: recap; background; functional annotation; final tools; modified workflow
4 Functional Annotation: Consists of attaching biological information to genomic elements. The goal is to better understand the function of the genes, and of their respective proteins, within the organism. Information generally includes: biochemical function; biological function; regulatory functions and interactions; expression.
6 Nontypeable (NT) Haemophilus influenzae: Gram-negative bacterium. Facultatively anaerobic; requires NAD and hemin to grow. First free-living organism to have its genome sequenced. Virulence and diseases: otitis media, sinusitis, conjunctivitis, and exacerbations of chronic obstructive pulmonary disease; ear infections in children and bronchitis in adults, but it may also cause invasive disease, such as bacteremia and pneumonia. Adaptability and transformation: horizontal gene transfer. Out of 182 genes, 1699 are CDS. Source:
8 Protein and Non-Coding Functions. Protein function: structural; regulatory; transmembrane; receptor; enzyme; virulence factors; metabolic processes. Non-coding functions: CRISPRs; operons.
9 Functional Assignment. Name: gene symbol; protein name. Role: function of the protein in the cell. Associated information. Supporting evidence: domains/motifs; transmembrane regions; orthologous domains; pathways.
10 Levels of Annotation
11 Levels of Annotation 1
12 CRISPR: Clustered regularly interspaced short palindromic repeats (CRISPRs) are highly conserved short sequences (24 bps) separated by spacers of a similar length. Part of acquired immunity in prokaryotes: resistance to bacteriophage invasion. Present in ~40% of bacterial genomes. Used for subtyping through analysis of spacers, which have a high degree of polymorphism. Salmonella (Fabre et al.): spacer content was strongly correlated with both serotype and multilocus sequence typing (MLST) type. Mycobacterium tuberculosis (Gori et al.): spoligotyping, i.e. PCR amplification of a highly polymorphic direct repeat locus.
13 CRISPR: PILER-CR and CRT Consensus Results: M5964 M27986 M28745 M29197 M294 M36564 M7572 M27987 M2877 M2922 M29658 M3658 M154 M28356 M2881 M29227 M29684 M36582 M1618 M2845 M28853 M2937 M29695 M M2626 M28687 M28888 M29323 M29697 M3666 M2632 M2872 M29179 M29331 M36557 M37982
14 CRISPR: PILER-CR. Command line: pilercr -in <input fasta> -out <text file> -seq <consensus sequence>
15 CRISPR: CRT. Command line: java -cp CRT1.2-CLI.jar crt <input fasta> <output file>
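The two CRISPR command lines above can be assembled per sample in a short script. The following is a minimal sketch in Python that only builds the argument lists and runs nothing. It assumes the flags shown on the slides; the function name, the output file names and the jar location are hypothetical.

```python
from pathlib import Path


def crispr_commands(fasta, outdir, crt_jar="CRT1.2-CLI.jar"):
    """Build (but do not run) the PILER-CR and CRT command lines for one sample.

    The output file names are illustrative assumptions, not names used
    by the original pipeline.
    """
    fasta = Path(fasta)
    outdir = Path(outdir)
    sample = fasta.stem  # "sample1" for "sample1.fasta"

    pilercr = [
        "pilercr",
        "-in", str(fasta),
        "-out", str(outdir / f"{sample}.pilercr.txt"),
        "-seq", str(outdir / f"{sample}.pilercr.consensus.fasta"),
    ]
    crt = [
        "java", "-cp", crt_jar, "crt",
        str(fasta),
        str(outdir / f"{sample}.crt.txt"),
    ]
    return pilercr, crt


if __name__ == "__main__":
    for cmd in crispr_commands("sample1.fasta", "crispr_out"):
        print(" ".join(cmd))
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` once the tools are installed and on the path.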
16 Domains & Motifs. Structure and localization are fundamental for function. Domains: self-contained, cooperative folding units. Motifs: short consensus regions, crucial for the function. Using homology: same sequence same function; however, higher sequence similarity = higher probability of same function. Using ab initio methods. Source
17 InterProScan: Scans through the InterPro databases and provides annotation based on homology and Gene Ontology terms. Databases: Prosite, Coils, PIRSF, Pfam, ProDom, Superfamily, Gene3D, SMART, TIGRFAM, PRINTS. Command: ./interproscan.sh -appl (applications to include) -i (input file). Output: TSV, XML, GFF3.
18 InterProScan Results [Table: Sample | Total # Genes | InterPro Unique Annotations | % Annotated]
19 InterProScan Results [Table: Sample | Coils | Gene3D | Pfam | PIRSF | PRINTS | ProDom | PSPatterns | PSProfiles | SMART | SF | TF]. SF = Superfamily, TF = TIGRFAM.
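A "% Annotated" figure and per-database counts of this kind can be derived from InterProScan's TSV output. The sketch below is a hedged illustration, not the group's actual script. It assumes the InterProScan 5 TSV layout, in which the first column is the protein accession and the fourth is the analysis name. The function name and the toy rows are made up.

```python
from collections import defaultdict


def summarize_interproscan(tsv_text, total_genes):
    """Count annotated proteins overall and per member database.

    Assumes tab-separated InterProScan 5 output: column 1 is the protein
    accession, column 4 is the analysis (e.g. Pfam, SMART).
    """
    per_db = defaultdict(set)
    annotated = set()
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        protein, analysis = fields[0], fields[3]
        annotated.add(protein)
        per_db[analysis].add(protein)
    percent = 100.0 * len(annotated) / total_genes if total_genes else 0.0
    return {
        "annotated": len(annotated),
        "percent_annotated": percent,
        "per_database": {db: len(ids) for db, ids in per_db.items()},
    }


if __name__ == "__main__":
    # Toy rows for illustration only; not results from the project.
    toy = (
        "gene1\tmd5a\t120\tPfam\tPF00001\tdesc\t1\t50\t1e-5\tT\t16-03-2016\n"
        "gene1\tmd5a\t120\tSMART\tSM00001\tdesc\t1\t50\t1e-5\tT\t16-03-2016\n"
        "gene2\tmd5b\t200\tPfam\tPF00002\tdesc\t5\t90\t1e-9\tT\t16-03-2016\n"
    )
    print(summarize_interproscan(toy, total_genes=4))
```

A protein hit by several databases is counted once in the overall total but once per database in the breakdown, which matches the "unique annotations" framing of the table.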
20 Gram-negative Bacteria Secretory Pathway. Secretion: transport of proteins, enzymes, and toxins from the interior of the bacterial cell to its exterior. 6 types in Gram-negative bacteria: Type II and V: proteins carry signal peptides; Type I, III, IV and VI: proteins do not carry signal peptides. Proteins in the bacterial membrane also use this pathway. Membrane proteins: lipoproteins (function as virulence factors and in nutrient uptake and adhesion); transmembrane proteins. Signal peptidase I: cleavage of preproteins translocated across membranes. Signal peptidase II: cleavage of bacterial prolipoproteins.
21 Transmembrane Proteins. Alpha-helical; beta-barrels. Topology: orientation of the N-terminus; single-/multi-pass. Function. Transmembrane helices have a longer hydrophobic region than signal peptides and no cleavage site. Source
22 LipoP. Based on a hidden Markov model. Classifies genes into four classes: SpI: signal peptide (signal peptidase I); TMH: N-terminal transmembrane helix; SpII: lipoprotein signal peptide (signal peptidase II); CYT: cytoplasmic (in practice, everything else). Command: LipoP -short [Input.fasta] > [Output.gff]. Example output: (-short summarizes the best prediction for each gene)
23 LipoP Results [Table: Sample No. | SpI | SpII | TMH | CYT | Total]
24 SignalP. Predicts signal peptides and their cleavage sites. Based on neural networks. Command: signalp -t <type of organism> -f <format> [Input.faa] > <outputfile>. Example output: (short summarizes the best prediction for each gene)
25 SignalP Results [Table: Sample No. | Number of signal peptides | Total | % Signal Peptide]
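The per-sample signal peptide counts and percentages can be tallied from SignalP's short output. This is a rough sketch under one stated assumption: that the file follows the SignalP 4.x short format, where lines starting with `#` are headers and the tenth whitespace-separated column is the `Y`/`N` call. The function name and example rows are hypothetical.

```python
def count_signal_peptides(short_output):
    """Return (names with a predicted signal peptide, total proteins, percent).

    Assumes SignalP 4.x short format: '#' header lines, then one line per
    protein whose 10th column is 'Y' (signal peptide) or 'N'.
    """
    total = 0
    with_sp = []
    for line in short_output.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 10:
            continue  # not a prediction line
        total += 1
        if fields[9] == "Y":
            with_sp.append(fields[0])
    percent = 100.0 * len(with_sp) / total if total else 0.0
    return with_sp, total, percent


if __name__ == "__main__":
    # Made-up rows for illustration only.
    example = (
        "# SignalP-4.1 gram- predictions\n"
        "# name Cmax pos Ymax pos Smax pos Smean D ? Dmaxcut Networks-used\n"
        "geneA 0.794 22 0.860 22 0.985 3 0.938 0.902 Y 0.570 SignalP-noTM\n"
        "geneB 0.110 30 0.105 11 0.120 1 0.098 0.101 N 0.570 SignalP-TM\n"
    )
    print(count_signal_peptides(example))
```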
26 LipoP vs. SignalP [Table: Sample No. | LipoP Unique | SignalP Unique | Common | NonSignal | Total]
27 LipoP vs. SignalP: LipoP yields more unique signal peptide predictions than SignalP.
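A tool-versus-tool comparison like this one reduces to set arithmetic on gene identifiers. The sketch below shows one way to compute the "unique", "common" and "neither" groups; the function name and the gene IDs are illustrative assumptions.

```python
def compare_predictions(tool_a, tool_b, all_genes):
    """Split gene IDs into those called by only one tool, both, or neither."""
    a, b, everything = set(tool_a), set(tool_b), set(all_genes)
    return {
        "a_unique": a - b,
        "b_unique": b - a,
        "common": a & b,
        "neither": everything - (a | b),
    }


if __name__ == "__main__":
    # Hypothetical gene IDs, for illustration only.
    lipop = ["g1", "g2", "g3"]
    signalp = ["g2", "g3", "g4"]
    genes = ["g1", "g2", "g3", "g4", "g5"]
    for group, ids in compare_predictions(lipop, signalp, genes).items():
        print(group, sorted(ids))
```

The same function applies unchanged to any other pair of predictors, for example a signal peptide caller against a transmembrane helix caller.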
28 TMHMM. Based on a hidden Markov model approach. Prediction of transmembrane helices in proteins. Command: cat <input faa file> | /path/tmhmm -short > <output file>. Sample output:
29 TMHMM Results [Table: Sample No. | Number of TM Proteins | Total | % TMH]
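Counting transmembrane proteins from the short output is mostly a matter of reading the `PredHel` field. The sketch below assumes the TMHMM 2.0 short format, with one line per protein made of an identifier followed by `key=value` fields such as `len=`, `ExpAA=`, `PredHel=` and `Topology=`. Function names and example rows are hypothetical.

```python
def parse_tmhmm_short(text):
    """Map each protein ID to its key=value fields from TMHMM short output."""
    records = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or line.startswith("#"):
            continue
        records[fields[0]] = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
    return records


def tm_proteins(records, min_helices=1):
    """Return IDs of proteins with at least `min_helices` predicted helices."""
    return [
        name for name, info in records.items()
        if int(info.get("PredHel", 0)) >= min_helices
    ]


if __name__ == "__main__":
    # Made-up rows for illustration only.
    example = (
        "geneA\tlen=471\tExpAA=159.47\tFirst60=0.02\tPredHel=7\tTopology=o77-99i\n"
        "geneB\tlen=210\tExpAA=0.10\tFirst60=0.00\tPredHel=0\tTopology=o\n"
    )
    recs = parse_tmhmm_short(example)
    hits = tm_proteins(recs)
    print(hits, 100.0 * len(hits) / len(recs))
```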
30 LipoP vs. TMHMM [Table: Sample No. | LipoP Unique | TMHMM Unique | Common | Total]
31 LipoP vs. TMHMM: TMHMM predicts more transmembrane proteins than LipoP. This could be because the hydrophobic region of a signal peptide is mistaken for that of a TM protein. Next step: run Phobius, which claims a lower false-positive rate than either tool.
32 VFDB Virulence Factors. Virulence factors: molecules produced by bacterial pathogens that contribute to pathogenicity in the host by enabling: colonization; immunoevasion; immunosuppression; entrance to the cell; obtaining host nutrients. Useful for understanding virulence mechanisms and interactions with the host cell.
33 VFDB Output. Identification of protein sequences. Command line: blastp -db <DatabaseName> -query <InputFile> -outfmt "6 stitle qseqid sseqid sgi qcovs evalue" -out <OutputFile>. Tabular output:
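The tabular BLAST output can be reduced to one best virulence-factor hit per query protein. The sketch below assumes tab-separated rows in the column order requested by the `-outfmt` string on the slide. The E-value and query-coverage cutoffs are placeholder assumptions rather than thresholds from the presentation, and the example rows are made up.

```python
COLUMNS = ["stitle", "qseqid", "sseqid", "sgi", "qcovs", "evalue"]


def best_vfdb_hits(tabular_text, max_evalue=1e-10, min_qcovs=80.0):
    """Keep, per query, the lowest E-value hit that passes both cutoffs.

    The default cutoffs are illustrative assumptions, not project settings.
    """
    best = {}
    for line in tabular_text.splitlines():
        parts = line.split("\t")
        if len(parts) != len(COLUMNS):
            continue  # skip blank or malformed lines
        hit = dict(zip(COLUMNS, parts))
        hit["qcovs"] = float(hit["qcovs"])
        hit["evalue"] = float(hit["evalue"])
        if hit["evalue"] > max_evalue or hit["qcovs"] < min_qcovs:
            continue
        current = best.get(hit["qseqid"])
        if current is None or hit["evalue"] < current["evalue"]:
            best[hit["qseqid"]] = hit
    return best


if __name__ == "__main__":
    # Made-up rows for illustration only.
    example = (
        "adhesin example [Genus species]\tgene1\tVF0001\t0\t95\t1e-50\n"
        "toxin example [Genus species]\tgene1\tVF0002\t0\t90\t1e-20\n"
        "weak hit [Genus species]\tgene2\tVF0003\t0\t40\t1e-3\n"
    )
    for query, hit in best_vfdb_hits(example).items():
        print(query, hit["sseqid"], hit["evalue"])
```

Splitting on tabs rather than whitespace matters here because `stitle` values usually contain spaces.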
34 Levels of Annotation 2
35 Operons and Polycistronic mRNA. Operons: clusters of genes under the control of a single promoter, transcribed together into a single mRNA strand. Polycistronic mRNA: a single mRNA strand coding for many proteins.
36 Operons: OperonDB. Input: .faa and .ptt files; blastp; createoperondb.pl. The .ptt format is no longer used by NCBI. Alternative approach: download operons from OperonDB and DOOR2 for our species of interest; BLAST them against our query sequences; compare results.
37 Levels of Annotation 3
38 Pathways. What is a biological pathway? Biological pathway diagrams describe the biological reactions and interactions in a cell graphically. There are many types of biological pathways; the best known are metabolic pathways, gene-regulation pathways, and signal transduction pathways. Metabolic pathways: chemical reactions that occur in our bodies. Gene-regulation pathways: turn genes on and off. Signal transduction pathways: move a signal from a cell's exterior to its interior. Why is it important? Computational approaches that use pathway knowledge to interpret high-throughput datasets play a key role in understanding disease mechanisms from genetic studies. Pathway analysis helps scientists generate biologically meaningful hypotheses and supports more comprehensive inferences.
39 Work in Progress. Large (very) database. Overlapping methods with other selected tools. MySQL issues. Complicated user manual. Large database (3 GB+). Will install separately in the home folder and run it there.
40 Updated Pipeline
41 Exercise
42 Questions?
43 References
Philip Jones, David Binns, Hsin-Yu Chang, Matthew Fraser, Weizhong Li, Craig McAnulla, Hamish McWilliam, John Maslen, Alex Mitchell, Gift Nuka, Sebastien Pesseat, Antony F. Quinn, Amaia Sangrador-Vegas, Maxim Scheremetjew, Siew-Yit Yong, Rodrigo Lopez, and Sarah Hunter (2014). InterProScan 5: genome-scale protein function classification. Bioinformatics, Jan 2014; doi: 10.1093/bioinformatics/btu031
Krogh, A., et al. (2001). "Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes." Journal of Molecular Biology 305(3):
G.E. Tusnady and I. Simon (1998), Principles Governing Amino Acid Composition of Integral Membrane Proteins: Applications to topology prediction, J. Mol. Biol. 283,
Panwar, B., et al. (2014). Prediction and classification of ncRNAs using structural information. BMC Genomics 15:127. DOI: /
Thomas Nordahl Petersen, Soren Brunak, Gunnar von Heijne & Henrik Nielsen. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature Methods, 8:785-786, 2011
Chen L, Xiong Z, Sun L, Yang J, Jin Q. VFDB 2012 update: toward the genetic diversity and molecular evolution of bacterial virulence factors. Nucleic Acids Res. 2012 Jan;40(Database issue):D641-5. doi: 10.1093/nar/gkr989. Epub 2011 Nov 8.
Koskinen, Patrik, et al. "PANNZER: high-throughput functional annotation of uncharacterized proteins in an error-prone environment." Bioinformatics 31.10 (2015): PANNZER:
Edgar, Robert C. "PILER-CR: fast and accurate identification of CRISPR repeats." BMC Bioinformatics 8.1 (2007): 1. PILERCR:
Mi H, Lazareva-Ulitsky B, Loo R, et al. The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res. 2005;33(Database issue):D284-8.
Thomas PD, Campbell MJ, Kejariwal A, et al. PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 2003;13(9):
Mi H, Poudel S, Muruganujan A, Casagrande JT, Thomas PD. PANTHER version 10: expanded protein families and functions, and analysis tools. Nucleic Acids Res. 2016;44(D1):D336-42.
44 References
Van Eldere, Johan et al. Nontypeable Haemophilus influenzae, an under-recognised pathogen. The Lancet Infectious Diseases, Volume 14, Issue 12,
Marraffini, Luciano A. "CRISPR-Cas Immunity in Prokaryotes." Nature (2015):
Fabre, Laëtitia et al. CRISPR Typing and Subtyping for Improved Laboratory Surveillance of Salmonella Infections. Ed. Igor Mokrousov. PLoS ONE 7.5 (2012): e PMC. Web. 16 Mar. 2016.
More informationHMM applications. Applications of HMMs. Gene finding with HMMs. Using the gene finder
HMM applications Applications of HMMs Gene finding Pairwise alignment (pair HMMs) Characterizing protein families (profile HMMs) Predicting membrane proteins, and membrane protein topology Gene finding
More informationComparative RNA-seq analysis of transcriptome dynamics during petal development in Rosa chinensis
Title Comparative RNA-seq analysis of transcriptome dynamics during petal development in Rosa chinensis Author list Yu Han 1, Huihua Wan 1, Tangren Cheng 1, Jia Wang 1, Weiru Yang 1, Huitang Pan 1* & Qixiang
More informationComputational Biology: Basics & Interesting Problems
Computational Biology: Basics & Interesting Problems Summary Sources of information Biological concepts: structure & terminology Sequencing Gene finding Protein structure prediction Sources of information
More informationGenome Annotation. Qi Sun Bioinformatics Facility Cornell University
Genome Annotation Qi Sun Bioinformatics Facility Cornell University Some basic bioinformatics tools BLAST PSI-BLAST - Position-Specific Scoring Matrix HMM - Hidden Markov Model NCBI BLAST How does BLAST
More informationProtein function prediction based on sequence analysis
Performing sequence searches Post-Blast analysis, Using profiles and pattern-matching Protein function prediction based on sequence analysis Slides from a lecture on MOL204 - Applied Bioinformatics 18-Oct-2005
More informationSifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource
Sharpton et al. BMC Bioinformatics 2012, 13:264 RESEARCH ARTICLE Open Access Sifting through genomes with iterative-sequence clustering produces a large, phylogenetically diverse protein-family resource
More informationFlow of Genetic Information
presents Flow of Genetic Information A Montagud E Navarro P Fernández de Córdoba JF Urchueguía Elements Nucleic acid DNA RNA building block structure & organization genome building block types Amino acid
More informationComputational methods for predicting protein-protein interactions
Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational
More informationImproved Prediction of Signal Peptides: SignalP 3.0
doi:10.1016/j.jmb.2004.05.028 J. Mol. Biol. (2004) 340, 783 795 Improved Prediction of Signal Peptides: SignalP 3.0 Jannick Dyrløv Bendtsen 1, Henrik Nielsen 1, Gunnar von Heijne 2 and Søren Brunak 1 *
More informationSequence Alignment Techniques and Their Uses
Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this
More informationPogil Answer Key Control Of Blood Sugar Levels
POGIL ANSWER KEY CONTROL OF BLOOD SUGAR LEVELS PDF - Are you looking for pogil answer key control of blood sugar levels Books? Now, you will be happy that at this time pogil answer key control of blood
More informationSupporting online material
Supporting online material Materials and Methods Target proteins All predicted ORFs in the E. coli genome (1) were downloaded from the Colibri data base (2) (http://genolist.pasteur.fr/colibri/). 737 proteins
More informationAmino Acid Structures from Klug & Cummings. 10/7/2003 CAP/CGS 5991: Lecture 7 1
Amino Acid Structures from Klug & Cummings 10/7/2003 CAP/CGS 5991: Lecture 7 1 Amino Acid Structures from Klug & Cummings 10/7/2003 CAP/CGS 5991: Lecture 7 2 Amino Acid Structures from Klug & Cummings
More informationOutline. I. Methods. II. Preliminary Results. A. Phylogeny Methods B. Whole Genome Methods C. Horizontal Gene Transfer
Comparative Genomics Preliminary Results April 4, 2016 Juan Castro, Aroon Chande, Cheng Chen, Evan Clayton, Hector Espitia, Alli Gombolay, Walker Gussler, Ken Lee, Tyrone Lee, Hari Prasanna, Carlos Ruiz,
More informationThe human transmembrane proteome
Dobson et al. Biology Direct (2015) 10:31 DOI 10.1186/s13062-015-0061-x RESEARCH Open Access The human transmembrane proteome László Dobson, István Reményi and Gábor E. Tusnády * Abstract Background: Transmembrane
More informationIn-Silico Approach for Hypothetical Protein Function Prediction
In-Silico Approach for Hypothetical Protein Function Prediction Shabanam Khatoon Department of Computer Science, Faculty of Natural Sciences Jamia Millia Islamia, New Delhi Suraiya Jabin Department of
More informationPrinciples of Cellular Biology
Principles of Cellular Biology آشنایی با مبانی اولیه سلول Biologists are interested in objects ranging in size from small molecules to the tallest trees: Cell Basic building blocks of life Understanding
More informationIntroduction to Bioinformatics Integrated Science, 11/9/05
1 Introduction to Bioinformatics Integrated Science, 11/9/05 Morris Levy Biological Sciences Research: Evolutionary Ecology, Plant- Fungal Pathogen Interactions Coordinator: BIOL 495S/CS490B/STAT490B Introduction
More informationOverview of Research at Bioinformatics Lab
Overview of Research at Bioinformatics Lab Li Liao Develop new algorithms and (statistical) learning methods that help solve biological problems > Capable of incorporating domain knowledge > Effective,
More informationFrom gene to protein. Premedical biology
From gene to protein Premedical biology Central dogma of Biology, Molecular Biology, Genetics transcription replication reverse transcription translation DNA RNA Protein RNA chemically similar to DNA,
More information