Today: Secondary structure, Transmembrane proteins. Last time: Domains, Hidden Markov Models. Structure prediction.

Last time
Domains
Hidden Markov Models

Today
Secondary structure
Transmembrane proteins

Structure prediction
NAD-specific glutamate dehydrogenase
Sequence: Easy. Structure: Hard.

>P24295 DHE2_CLOSY
MSKYVDRVIAEVEKKYADEPEFVQTVEEVL
SSLGPVVDAHPEYEEVALLERMVIPERVIE
FRVPWEDDNGKVHVNTGYRVQFNGAIGPYK
GGLRFAPSVNLSIMKFLGFEQAFKDSLTTL
PMGGAKGGSDFDPNGKSDREVMRFCQAFMT
ELYRHIGPDIDVPAGDLGVGAREIGYMYGQ
YRKIVGGFYNGVLTGKARSFGGSLVRPEAT
GYGSVYYVEAVMKHENDTLVGKTVALAGFG
NVAWGAAKKLAELGAKAVTLSGPDGYIYDP
EGITTEEKINYMLEMRASGRNKVQDYADKF
GVQFFPGEKPWGQKVDIIMPCATQNDVDLE
QAKKIVANNVKYYIEVANMPTTNEALRFLM
QQPNMVVAPSKAVNAGGVLVSGFEMSQNSE
RLSWTAEEVDSKLHQVMTDIHDGSAAAAER
YGLGYNLVAGANIVGFQKIADAMMAQGIAW

Columns: Structure, What it is, Diff., Qual.
Primary: Sequence, Easy, Precise
Secondary: Structure elements, Fair
Tertiary: Atomic coordinates, Hard
What's in between?

Example of secondary structure

Defining secondary structure
Element        Definition            DSSP code
Alpha helix    The classic spiral    H
Beta strand    Strands form sheets   B, E
Turn, bend     Sudden change         S, T
Coil, loop     Everything else       C, L, _
Assigned by principles. Coded in DSSP, Stride, etc.

Prediction of secondary structure
Principle: Structure affects the distribution of amino acids.
Bad news: No good explicit model for determining secondary structure.
Good news: Artificial Neural Networks give a decent implicit model.
To determine the sec. str. of residue i, look at a window around i:
R_{i-7} R_{i-6} ... R_{i-1} R_i R_{i+1} ... R_{i+6} R_{i+7}
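The window idea above can be sketched in a few lines of Python. This is only an illustration of how the input to a window-based predictor might be prepared; the function names, the padding symbol and the one-hot encoding are assumptions made for this example, not the encoding used by any particular predictor.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def windows(sequence, half_width=7, pad="-"):
    """Yield (i, window) for every residue i, with the window centred on i.

    Positions that fall outside the sequence are filled with `pad`
    (an assumption of this sketch).
    """
    padded = pad * half_width + sequence + pad * half_width
    for i in range(len(sequence)):
        yield i, padded[i:i + 2 * half_width + 1]


def one_hot(window):
    """Flatten a window into a 0/1 list, 20 entries per position.

    Padding (or any non-standard symbol) becomes 20 zeros.
    """
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in ALPHABET)
    return vector


# First 30 residues of the DHE2_CLOSY sequence from the slides
seq = "MSKYVDRVIAEVEKKYADEPEFVQTVEEVL"
all_windows = dict(windows(seq))
features = one_hot(all_windows[10])  # input vector for residue 10
```

With half_width=7 each residue is described by 15 positions, so the feature vector has 15 x 20 = 300 entries. A classifier (for example a neural network) would be trained to map such a vector to H, E or C for the central residue.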

Prediction trick
Use homologs!
1. Collect very similar sequences
2. Build profile
3. Use a predictor for profiles
Good effect in sec. str. prediction
General trick for various prediction problems.

One sequence vs a profile
One sequence: Pos 17 has a C. Pos 18 has an A.
A profile: Pos 17 is always a C. Pos 18 is rarely an A.

Prediction quality
Predictor   Accuracy
PHD         70%
PSIpred     77%
Common problem: ... EEEEHEEEE ...
Not an active research area today.

20-30% of proteins in any organism are TM.
70% of drug targets are TM proteins (Pestourie et al, 2006)
Bad news: Hard to determine structure for TM proteins. Less than 1% of the structures in PDB are TM proteins.
Good news: Regular and clear structure, perfect for HMMs!

Classic structure: rhodopsin
Sensory rhodopsin (1gue) embedded in the membrane and transducing beneath.
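Step 2 of the prediction trick ("Build profile") can be illustrated with a minimal sketch. It assumes the homologs have already been collected and aligned to the same length; the toy alignment and the function name are hypothetical, and real profile builders typically also apply sequence weighting and pseudocounts, which are left out here.

```python
from collections import Counter


def build_profile(aligned_sequences):
    """Return one {symbol: frequency} dict per alignment column.

    Gap characters, if present, are counted like any other symbol
    (a simplification of this sketch).
    """
    length = len(aligned_sequences[0])
    if any(len(s) != length for s in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        total = len(column)
        profile.append({res: n / total for res, n in counts.items()})
    return profile


# Hypothetical toy alignment of four very similar sequences
homologs = ["ACDA", "ACDG", "SCDG", "ACEG"]
profile = build_profile(homologs)

# Column 1 is "always a C", column 3 is "rarely an A"
always_c = profile[1]
rarely_a = profile[3]["A"]
```

This is the difference between "Pos 17 has a C" and "Pos 17 is always a C": the profile records how conserved each position is, and a profile-based predictor takes these frequencies as input instead of a single letter.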

Modern view

Beta barrel structure
Not studied in this course
Image created by Opabinia Regalis.
Image from Kauko-Illergård-Elofsson, 2008

Goals
Classify proteins: TM or not?
Determine TM regions
Determine TM topology

Properties of TM proteins
Transmembrane helices are hydrophobic
TM regions are 15-30 aa
Loops on cytoplasmic side are positive: positive inside rule (Gunnar von Heijne)
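The positive inside rule can be turned into a very simple topology guess once the TM regions are known. The sketch below counts lysines and arginines in the loops on each side of the membrane and puts the more positive side in the cytoplasm. The function names, the choice of K and R as the positive residues, and the toy sequences are assumptions for illustration; this is not how TMHMM or any other published predictor scores topology.

```python
POSITIVE = set("KR")  # residues treated as positively charged in this sketch


def loops(sequence, tm_segments):
    """Split the sequence into the loops before, between and after TM segments.

    tm_segments: sorted, non-overlapping (start, end) pairs,
    0-based with end exclusive.
    """
    result, previous_end = [], 0
    for start, end in tm_segments:
        result.append(sequence[previous_end:start])
        previous_end = end
    result.append(sequence[previous_end:])
    return result


def n_terminus_side(sequence, tm_segments):
    """Guess whether the N-terminus is 'in' (cytoplasmic) or 'out'."""
    loop_list = loops(sequence, tm_segments)

    def charge(segment):
        return sum(residue in POSITIVE for residue in segment)

    # Loops alternate sides: even-numbered loops share a side with the N-terminus
    same_side = sum(charge(loop) for loop in loop_list[0::2])
    other_side = sum(charge(loop) for loop in loop_list[1::2])
    if same_side == other_side:
        return "undecided"
    return "in" if same_side > other_side else "out"


# Hypothetical two-helix proteins
protein_a = "MKRK" + "L" * 20 + "DSE" + "L" * 20 + "RRKK"
protein_b = "MDSE" + "L" * 20 + "KRRK" + "L" * 20 + "DE"
topology_a = n_terminus_side(protein_a, [(4, 24), (27, 47)])
topology_b = n_terminus_side(protein_b, [(4, 24), (28, 48)])
```

Because each TM helix crosses the membrane once, fixing the side of the N-terminus fixes the side of every other loop as well.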

First attempt: TopPred
Identify the hydrophobic regions in PSN1_HUMAN.
Look at window of 21 aa.

TMHMM: Predictor using an HMM
Sonnhammer, von Heijne, Krogh, 1998

Prediction quality
Good quality
Generally correct when 3 TM regions
Common problems:
Lose a TM region
Flip in-out topology
Problem discerning signal peptides

Signal peptides
Short (15-30 aa?) peptide addressing protein to organelles
16% of the human proteome has an SP
Some SPs are cleaved from their host protein
One hydrophobic TM-segment, 7-15 aa
Special predictor for SP: SignalP
Common problem for TM predictors
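The "identify the hydrophobic regions with a window of 21 aa" step can be sketched as a sliding-window average of a hydropathy scale. The example below uses the Kyte-Doolittle scale with a plain rectangular window and a cutoff of 1.6; the scale, the window shape and the cutoff are assumptions chosen for illustration and are not claimed to match TopPred's own settings. The test sequence is a made-up toy, not PSN1_HUMAN.

```python
# Kyte-Doolittle hydropathy values
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=21):
    """Mean hydropathy of every full window along the sequence.

    Raises KeyError for non-standard residue symbols.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_regions(sequence, window=21, threshold=1.6):
    """Merge all windows with mean >= threshold into (start, end) spans.

    Spans are 0-based with end exclusive. The threshold is an assumed
    cutoff for this sketch.
    """
    regions = []
    for i, mean in enumerate(hydropathy_profile(sequence, window)):
        if mean < threshold:
            continue
        start, end = i, i + window
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], end)  # extend the previous span
        else:
            regions.append((start, end))
    return regions


# Toy sequence: one hydrophobic stretch flanked by charged residues
toy = "DEKRDEKRDE" + "LIV" * 7 + "DEKRDEKRDE"
profile = hydropathy_profile(toy)
regions = hydrophobic_regions(toy)
```

Merging overlapping windows makes the reported span wider than the hydrophobic stretch itself, which is one reason why purely window-based methods give fuzzy TM boundaries.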

Phobius: including signal peptides
Käll, Krogh, Sonnhammer, 2004

TM prediction example

Function prediction
What is gene/protein function?

Why predict structure?
Real goal (?): Function
Problem 1: What is function?
Problem 2: What data do you need?
Is protein sequence enough?
Chemical reactions?
Interactions?
Pathway activity?
Cell localization?
Activity details?

Enzyme Commission number
From 1961!
Hierarchical classification of enzymes
Specifies reactions
Example from Wikipedia:
EC 3 enzymes are hydrolases
EC 3.4 are hydrolases that act on peptide bonds
EC 3.4.11 are those hydrolases that cleave off the amino-terminal amino acid from a polypeptide
EC 3.4.11.4 are those that cleave off the amino-terminal end from a tripeptide
Too limited for Bioinformatics

Gene Ontology
Controlled vocabulary for function annotation
Non-hierarchical
"is a" and "part of" relationships between terms

GO example

GO applications
Facilitates enrichment studies
"We show that gene duplication and loss is highly constrained by the functional properties and interacting partners of genes. In particular, stress-related genes exhibit many duplications and losses, whereas growth-related genes show selection against such changes." (Wapinski et al, Nature 2007)

Baldock and Burger, Genome Biology 2005
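The hierarchy in an Enzyme Commission number is easy to make explicit in code: every prefix of the number is a broader class. The sketch below expands an EC number into its levels and counts how many levels two enzymes share. The function names are made up for this example, the descriptions are the four quoted above, and partially specified numbers such as "3.4.-.-" are deliberately rejected to keep the sketch short.

```python
# Descriptions taken from the Wikipedia example quoted in the slides
DESCRIPTIONS = {
    "EC 3": "hydrolases",
    "EC 3.4": "hydrolases that act on peptide bonds",
    "EC 3.4.11": "cleave off the amino-terminal amino acid from a polypeptide",
    "EC 3.4.11.4": "cleave off the amino-terminal end from a tripeptide",
}


def ec_levels(ec_number):
    """Return the chain of increasingly specific classes for an EC number."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if not 1 <= len(parts) <= 4 or not all(p.isdigit() for p in parts):
        raise ValueError("expected one to four dot-separated integers")
    return ["EC " + ".".join(parts[:k]) for k in range(1, len(parts) + 1)]


def shared_levels(a, b):
    """Number of leading EC levels that two EC numbers have in common."""
    count = 0
    for x, y in zip(ec_levels(a), ec_levels(b)):
        if x != y:
            break
        count += 1
    return count


levels = ec_levels("EC 3.4.11.4")
lineage = [(level, DESCRIPTIONS.get(level, "?")) for level in levels]
common = shared_levels("3.4.11.4", "3.4.21.1")
```

A strict tree like this is what "hierarchical" means here. Gene Ontology terms, by contrast, can have several parents through "is a" and "part of" relationships, so they cannot be represented by a single dotted number.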

Predicting function?
Given a gene/protein, can we predict a GO term?
Approach: Expert systems
Collect homologs
Collect orthologs
Domain and motif analysis
Study other features
Study network connections
Examples:
ProtFun (http://www.cbs.dtu.dk/services/protfun/)
FunCoup (http://funcoup.sbc.su.se/)

Predict localization
Modest goal!
Is the target... mitochondria? peroxisome? endoplasmic reticulum? golgi?
Study signal peptide
Olof Emanuelsson: TargetP (http://www.cbs.dtu.dk/services/targetp/)

Next: Computational genomics
Introduction
Genome sequencing and assembly
EST analysis