Prediction of protein function from sequence analysis
- Jared McGee
- 5 years ago
Transcription
1 Prediction of protein function from sequence analysis Rita Casadio BIOCOMPUTING GROUP University of Bologna, Italy
2 The omic era
Genome Sequencing Projects (update: January 2010):
- Archaea: 74 species; in progress: 52
- Bacteria: 973 species; in progress: 2266 species
- Eukaryotic: complete: 23; draft assembly: 318; in progress
3 The Data Bases of Biological Sequences and Structures
- GenBank: 108,431,692 sequences; 106,533,156,756 nucleotides
  35.5 HGE!
- NR(*): 10,381,779 sequences; 3,542,056,219 residues
- SwissProt: 514,212 sequences; 180,900,945 residues
- PDB: 60,654 structures; membrane proteins <2%
(*) CDS translations+pdb+swissprot+pir+prf
Update: January 2009
>BGAL_SULSO BETA-GALACTOSIDASE Sulfolobus solfataricus.
MYSFPNSFRFGWSQAGFQSEMGTPGSEDPNTDWYKWVHDPENMAAGLVSG
DLPENGPGYWGNYKTFHDNAQKMGLKIARLNVEWSRIFPNPLPRPQNFDE
SKQDVTEVEINENELKRLDEYANKDALNHYREIFKDLKSRGLYFILNMYH
WPLPLWLHDPIRVRRGDFTGPSGWLSTRTVYEFARFSAYIAWKFDDLVDE
YSTMNEPNVVGGLGYVGVKSGFPPGYLSFELSRRHMYNIIQAHARAYDGI
KSVSKKPVGIIYANSSFQPLTDKDMEAVEMAENDNRWWFFDAIIRGEITR
GNEKIVRDDLKGRLDWIGVNYYTRTVVKRTEKGYVSLGGYGHGCERNSVS
LAGLPTSDFGWEFFPEGLYDVLTKYWNRYHLYMYVTENGIADDADYQRPY
YLVSHVYQVHRAINSGADVRGYLHWSLADNYEWASGFSMRFGLLKVDYNT
KRLYWRPSALVYREIATNGAITDEIEHLNSVPPVKPLRH
4 From Genotype to Phenotype
Genes in DNA (about 30,000 in the human genome) code for proteins...
>protein kinase acctgttgatggcgacagggactgtatgctgatct atgctgatgcatgcatgctgactactgatgtgggg gctattgacttgatgtctatc...
...with different effects depending on variability: over 20 million single mutations are known in genes.
Proteins correspond to functions... From 5000 to proteins per tissue.
Proteins interact in metabolic pathways when they are expressed.
5 STRING 8 a global view on proteins and their functional interactions in 630 organisms- Jensen et al., 2009, Nucleic Acids Research, Vol 37. The Human Interactome in STRING 22,937 proteins and 1,482,533 interactions
6 One problem of the omic era: protein functional annotation
7 The Protein Data Bank No of Proteins with known structure: 57529
8 SCOP: Structural Classification of Proteins
Domains are hierarchically classified:
- class
- fold: proteins with secondary structures in the same arrangement and with the same topological connections
- superfamily: structural and functional features suggest a common evolutionary origin
- family: proteins with identities ≥30%, or with identities <30% but with similar structures and functions
9 From the Protein Sequence to the Structure and Function space Lesk A., 2004
10 From the Protein Sequence to the Structure space
[Figure: methods arranged along a sequence identity (%) axis, relative to templates in the PDB]
- 100% to 30%: sequence comparison
- 30% to 0%: fold recognition, threading, machine-learning aided alignment
- 0% (new folds): ab initio and de novo modelling, machine-learning prediction of structural features
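The decision implied by this slide can be sketched as a small function. This is only an illustration: the 30% cutoff is taken from the slide, while the function name, the use of `None` for "no detectable template", and the returned labels are assumptions made for the example.

```python
from typing import Optional


def modelling_strategy(best_identity: Optional[float]) -> str:
    """Suggest a structure-prediction route from the best % identity
    to a template of known structure.

    best_identity is None when no template is detected at all
    (a hypothetical convention used only in this sketch).
    """
    if best_identity is None or best_identity <= 0.0:
        # New folds: nothing to copy from.
        return "ab initio / de novo modelling"
    if best_identity >= 30.0:
        # A clearly related template exists.
        return "sequence comparison"
    # Below 30% identity, structure-aware methods are needed.
    return "fold recognition / threading"


for identity in (62.0, 18.5, None):
    print(identity, "->", modelling_strategy(identity))
```

In practice the boundaries are not sharp, so a real pipeline would treat the 30% line as a guide rather than a hard switch.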
11 From the Protein Sequence to the Structure and Function space What is protein function?
12 What is a function? For enzymes: function can be defined on the basis of the catalysed molecular reaction. e.g. aspartic aminotransferase (AST)
13 In biochemistry, a transaminase or an aminotransferase is an enzyme that catalyzes a type of reaction between an amino acid and an α-keto acid. Specifically, this reaction (transamination) involves removing the amino group from the amino acid, leaving behind an α-keto acid, and transferring it to the reactant α-keto acid, converting it into an amino acid. The enzymes are important in the production of various amino acids, and measuring the concentrations of various transaminases in the blood is important in diagnosing and tracking many diseases. Transaminases require the coenzyme pyridoxal-phosphate, which is converted into pyridoxamine in the first phase of the reaction, when an amino acid is converted into a keto acid. Enzyme-bound pyridoxamine in turn reacts with pyruvate, oxaloacetate, or alpha-ketoglutarate, giving alanine, aspartic acid, or glutamic acid, respectively. The presence of elevated transaminases can be an indicator of liver damage.
14 Enzyme Commission (E.C.) classification A hierarchical classification for enzymes
15 EC 2.6 Transferring nitrogenous groups
EC 2.6.1 Transaminases
EC Aspartate transaminase
Other name(s): glutamic-oxaloacetic transaminase; glutamic-aspartic transaminase; transaminase A; AAT; AspT; 2-oxoglutarate-glutamate aminotransferase; aspartate α-ketoglutarate transaminase; aspartate aminotransferase; aspartate-2-oxoglutarate transaminase; aspartic acid aminotransferase; aspartic aminotransferase; aspartyl aminotransferase; AST; glutamate-oxalacetate aminotransferase; glutamate-oxalate transaminase; glutamic-aspartic aminotransferase; glutamic-oxalacetic transaminase; glutamic oxalic transaminase; GOT (enzyme); L-aspartate transaminase; L-aspartate-α-ketoglutarate transaminase; L-aspartate-2-ketoglutarate aminotransferase; L-aspartate-2-oxoglutarate aminotransferase; L-aspartate-2-oxoglutarate-transaminase; L-aspartic aminotransferase; oxaloacetate-aspartate aminotransferase; oxaloacetate transferase; aspartate:2-oxoglutarate aminotransferase; glutamate oxaloacetate transaminase
Systematic name: L-aspartate:2-oxoglutarate aminotransferase
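The hierarchy behind an EC number can be made concrete with a short parsing sketch. The names `ECNumber`, `parse_ec` and `shared_depth` are hypothetical helpers written for this example. The serial number is missing from the transcript above; the sketch uses EC 2.6.1.1, the number listed for aspartate transaminase in the IUBMB nomenclature, and EC 2.6.1.2 (alanine transaminase) for comparison.

```python
from typing import NamedTuple, Optional


class ECNumber(NamedTuple):
    """Four levels of an E.C. number; unspecified levels are None."""
    enzyme_class: Optional[int]
    subclass: Optional[int]
    sub_subclass: Optional[int]
    serial: Optional[int]


def parse_ec(text: str) -> ECNumber:
    """Parse strings such as 'EC 2.6', '2.6.1' or '2.6.1.-'."""
    text = text.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    levels = [int(f) if f.isdigit() else None for f in text.split(".")]
    levels += [None] * (4 - len(levels))  # pad partial numbers
    return ECNumber(*levels[:4])


def shared_depth(a: ECNumber, b: ECNumber) -> int:
    """Number of leading levels two EC numbers have in common."""
    depth = 0
    for x, y in zip(a, b):
        if x is None or x != y:
            break
        depth += 1
    return depth


aspartate_ta = parse_ec("EC 2.6.1.1")
alanine_ta = parse_ec("EC 2.6.1.2")
print(shared_depth(aspartate_ta, alanine_ta))  # both are transaminases (EC 2.6.1)
```

Two enzymes that agree on the first three levels catalyse the same type of reaction but act on different substrates, which is why the depth of agreement is a common way to score predicted enzyme function.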
16 Problems:
- Isoforms, e.g. how to differentiate the function of the cytoplasmic aspartate aminotransferase from that of the mitochondrial isoform?
- Non-enzymatic proteins
17 GO function vocabulary: The Ontologies Cellular component Biological process Molecular function
18 Gene Ontology classification: The human cytoplasmic aspartate transaminase GO: GO: GO:
19 One BIG problem of the omic era: protein functional annotation
20 Functional annotation in silico by homology search
  ADH1_SULSO MRAVRLVEIGKP--LSLQEIGVPKPKGPQVLIKVEAAGVCHSDVHMRQGRFGNLRIVE
  ADH_CLOBE  MKGFAMLGINKLG---WIEKERPVAGSYDAIVRPLAVSPCTSDIHTVFEGA
  ADH_THEBR  MKGFAMLSIGKVG---WIEKEKPAPGPFDAIVRPLAVAPCTSDIHTVFEGA
  ADH1_SOLTU MSTTVGQVIRCKAAVAWEAGKP--LVMEEVDVAPPQKMEVRLKILYTSLCHTDVYFWEAKG
  ADH2_LYCES MSTTVGQVIRCKAAVAWEAGKP--LVMEEVDVAPPQKMEVRLKILYTSLCHTDVYFWEAKG
  ADH1_ASPFL ----MSIPEMQWAQVAEQKGGP--LIYKQIPVPKPGPDEILVKVRYSGVCHTDLHALKGDW
Sequence comparison is performed with alignment programs.
Sequence identity 40 %: similar structure and function (??)
Methods for similarity searches:
- BLAST, PSI-BLAST (sequence): Altschul et al. (1990) J Mol Biol 215; Altschul et al. (1997) Nucleic Acids Res. 25
- Pfam (sequence/structure): Bateman et al. (2000) Nucleic Acids Research 28
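The sequence identity quoted on this slide is computed from an alignment. A minimal sketch follows, assuming one common convention: identical residues divided by the number of columns in which neither sequence has a gap. Other conventions (dividing by alignment length or by the shorter sequence) give different values, so the denominator is an assumption of this example, as is the function name `percent_identity`.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


# First alignment block of ADH_CLOBE and ADH_THEBR as transcribed on the slide.
clobe = "MKGFAMLGINKLG---WIEKERPVAGSYDAIVRPLAVSPCTSDIHTVFEGA"
thebr = "MKGFAMLSIGKVG---WIEKEKPAPGPFDAIVRPLAVAPCTSDIHTVFEGA"
print(f"{percent_identity(clobe, thebr):.1f}% identity over the aligned columns")
```

The value for a short fragment is not the identity of the full-length proteins; a real comparison would use the complete alignment produced by a program such as BLAST.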
21 Transfer by inheritance: Function annotation transfer from sequence through homology
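Transfer by inheritance can be sketched as: find the most similar annotated homolog and copy its annotation if the similarity is high enough. The `Hit` record, the function name, and the term labels below are hypothetical. The 40% default mirrors the figure on the homology-search slide, and the slides themselves note that the optimal threshold is not known.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Hit:
    """One database hit for a query sequence (hypothetical record)."""
    subject_id: str
    percent_identity: float
    go_terms: FrozenSet[str]


def transfer_annotation(hits: Iterable[Hit],
                        min_identity: float = 40.0) -> FrozenSet[str]:
    """Return the annotation of the best annotated hit above the threshold.

    An empty set means no function could be transferred.
    """
    candidates = [h for h in hits
                  if h.go_terms and h.percent_identity >= min_identity]
    if not candidates:
        return frozenset()
    best = max(candidates, key=lambda h: h.percent_identity)
    return best.go_terms


# Placeholder hits and term labels, for illustration only.
hits = [
    Hit("hit_1", 72.0, frozenset()),                    # similar but unannotated
    Hit("hit_2", 55.0, frozenset({"term_A", "term_B"})),
    Hit("hit_3", 31.0, frozenset({"term_C"})),          # below the threshold
]
print(sorted(transfer_annotation(hits)))
```

Taking only the single best hit is the simplest policy; alternatives such as voting over all hits above the threshold trade coverage against the risk of propagating errors.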
22
23 PDB The annotation process at UniProt
24 Open problems of inheritance through homology
- Not all UniProt files are GO annotated
- The optimal threshold value of sequence identity for function transfer is not known
- Proteins contain multiple domains
- Proteins can share common domains and not necessarily the same function
- In proteins, different combinations of shared domains lead to different biological roles
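The multi-domain problem can be illustrated by representing each protein as an ordered list of domains. The domain labels and function names here are placeholders invented for this sketch. Two proteins can share a domain, and so produce a significant local hit, without having the same domain architecture.

```python
from typing import Sequence, Set


def shared_domains(arch_a: Sequence[str], arch_b: Sequence[str]) -> Set[str]:
    """Domains present in both proteins, ignoring order and copy number."""
    return set(arch_a) & set(arch_b)


def same_architecture(arch_a: Sequence[str], arch_b: Sequence[str]) -> bool:
    """True only if both proteins have the same domains in the same order."""
    return list(arch_a) == list(arch_b)


# Placeholder architectures (N-terminus to C-terminus), not real annotations.
protein_x = ["domain_A", "domain_B"]
protein_y = ["domain_A", "domain_C", "domain_D"]

print(shared_domains(protein_x, protein_y))      # a domain is shared...
print(same_architecture(protein_x, protein_y))   # ...but the architecture differs
```

A cautious annotation transfer could require matching architectures, or restrict the transferred terms to those attributable to the shared domain.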