Outline: Intro, Secondary structure, Transmembrane proteins, Function, End

Last time: Domains; Hidden Markov Models

Today: Secondary structure; Transmembrane proteins

Structure prediction
NAD-specific glutamate dehydrogenase
Easy: the sequence. Hard: the structure.
>P24295 DHE2_CLOSY
MSKYVDRVIAEVEKKYADEPEFVQTVEEVL
SSLGPVVDAHPEYEEVALLERMVIPERVIE
FRVPWEDDNGKVHVNTGYRVQFNGAIGPYK
GGLRFAPSVNLSIMKFLGFEQAFKDSLTTL
PMGGAKGGSDFDPNGKSDREVMRFCQAFMT
ELYRHIGPDIDVPAGDLGVGAREIGYMYGQ
YRKIVGGFYNGVLTGKARSFGGSLVRPEAT
GYGSVYYVEAVMKHENDTLVGKTVALAGFG
NVAWGAAKKLAELGAKAVTLSGPDGYIYDP
EGITTEEKINYMLEMRASGRNKVQDYADKF
GVQFFPGEKPWGQKVDIIMPCATQNDVDLE
QAKKIVANNVKYYIEVANMPTTNEALRFLM
QQPNMVVAPSKAVNAGGVLVSGFEMSQNSE
RLSWTAEEVDSKLHQVMTDIHDGSAAAAER
YGLGYNLVAGANIVGFQKIADAMMAQGIAW
What's in between?

Secondary structure
Structure level: what it is (difficulty, quality)
Primary: sequence (easy, precise)
Secondary: structure elements (fair)
Tertiary: atomic coordinates (hard)

Example of secondary structure

Elements, definitions
Element        Definition             DSSP code
Alpha helix    The classic spiral     H
Beta strand    Strands form sheets    B, E
Turn, bend     Sudden change          S, T
Coil, loop     Everything else        C, L, _
Assigned by principles. Coded in DSSP, Stride, etc.
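The code table above can be applied mechanically to a per-residue DSSP string. The following is a minimal sketch, not DSSP itself: the class names and both function names are illustrative, and treating any code not listed on the slide as coil ("everything else") is an assumption.

```python
# Mapping copied from the slide's table; the class names are illustrative labels.
DSSP_TO_ELEMENT = {
    "H": "helix",
    "B": "strand", "E": "strand",
    "S": "turn", "T": "turn",
    "C": "coil", "L": "coil", "_": "coil",
}


def elements_from_dssp(dssp_string):
    """Translate a per-residue DSSP string into the slide's four element classes.

    Assumption: any code that is not in the table is treated as coil.
    """
    return [DSSP_TO_ELEMENT.get(code, "coil") for code in dssp_string]


def segments(labels):
    """Collapse consecutive identical labels into (label, start, end) runs.

    Positions are 0-based and the end position is inclusive.
    """
    runs = []
    for i, label in enumerate(labels):
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1], i)
        else:
            runs.append((label, i, i))
    return runs


example = elements_from_dssp("HHE_")
print(segments(example))
```

Collapsing residues into runs is usually the more readable view, since secondary structure elements are segments rather than single residues.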

Defining secondary structure

Prediction of secondary structure
Principle: Structure affects the amino acid distribution.
Bad news: No good explicit model for determining secondary structure.
Good news: Artificial neural networks give a decent implicit model.
To determine the secondary structure of residue i, look at a window around i: R_{i-7}, R_{i-6}, ..., R_{i-1}, R_i, R_{i+1}, ..., R_{i+6}, R_{i+7}
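The window idea can be sketched as the input encoding a network would see for residue i. This is only an illustration: the one-hot encoding, the extra padding slot for positions beyond the sequence ends, and the function names are assumptions, not the encoding of any particular predictor.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HALF_WINDOW = 7  # the window covers R_{i-7} .. R_{i+7}, i.e. 15 residues


def one_hot(residue):
    """Return a 21-element vector: 20 amino acids plus one padding/unknown slot."""
    vec = [0] * (len(AMINO_ACIDS) + 1)
    idx = AMINO_ACIDS.find(residue)
    vec[idx if idx >= 0 else len(AMINO_ACIDS)] = 1
    return vec


def window_features(sequence, i, half_window=HALF_WINDOW):
    """Concatenate one-hot vectors for the residues in the window around i.

    Positions that fall outside the sequence are encoded with the padding slot.
    """
    features = []
    for j in range(i - half_window, i + half_window + 1):
        residue = sequence[j] if 0 <= j < len(sequence) else "-"
        features.extend(one_hot(residue))
    return features


x = window_features("MSKYVDRVIAEVEKK", 0)
print(len(x))  # 15 positions x 21 slots = 315 inputs
```

A classifier trained on such vectors, with the observed element of residue i as the label, learns the implicit model mentioned above.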

Prediction trick: use homologs!
1. Collect very similar sequences
2. Build profile
3. Use a predictor for profiles
Good effect in sec. str. prediction. General trick for various prediction problems.
One sequence       vs.   A profile
Pos 17 has a C           Pos 17 is always a C
Pos 18 has an A          Pos 18 is rarely an A
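Step 2 can be sketched as per-column residue frequencies over a set of aligned homologs. This is a bare-bones illustration: the four sequences are made up, and real profile builders also handle gaps, sequence weighting, and pseudocounts, none of which is attempted here.

```python
from collections import Counter


def build_profile(aligned_sequences):
    """Return one {residue: frequency} dict per alignment column."""
    length = len(aligned_sequences[0])
    if any(len(s) != length for s in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        total = len(column)
        profile.append({aa: n / total for aa, n in counts.items()})
    return profile


# Made-up aligned homologs, for illustration only.
homologs = ["ACA", "ACG", "ACS", "ACT"]
profile = build_profile(homologs)
print(profile[1])  # the second column is always C
print(profile[2])  # A is only one of several residues in the third column
```

The profile carries what a single sequence cannot: whether a position is conserved or variable across the family.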

Prediction quality
Predictor    Accuracy
PHD          70%
PSIPRED      77%
Common problem: an isolated residue predicted differently from its neighbours, e.g. ...EEEEHEEEE...
Not an active research area today.
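The slide does not say how accuracy is measured. A common choice for secondary structure predictors is the fraction of residues whose state is predicted correctly, and the sketch below assumes that definition. The observed string is made up to match the slide's example of a lone H inside a strand.

```python
def per_residue_accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


predicted = "EEEEHEEEE"
observed = "EEEEEEEEE"  # hypothetical true assignment
print(per_residue_accuracy(predicted, observed))  # 8 of 9 residues agree
```

Note that a per-residue score barely penalizes the isolated H, even though it splits one strand into two segments.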

Transmembrane proteins
20-30% of proteins in any organism are TM.
70% of drug targets are TM proteins (Pestourie et al, 2006).
Bad news: Hard to determine structure for TM proteins. Less than 1% of the structures in PDB are TM structures.
Good news: Regular and clear structure, perfect for HMMs!

Classic structure: rhodopsin
Sensory rhodopsin (1gue) embedded in the membrane, with transducin beneath.

Modern view
Image from Kauko-Illergård-Elofsson, 2008

Beta barrel structure
Not studied in this course. Image created by Opabinia Regalis.

Goals
Classify proteins: TM or not?
Determine TM regions
Determine TM topology

Properties of TM proteins
Transmembrane helices are hydrophobic
TM regions are 15-30 aa
Loops on the cytoplasmic side are positively charged: the positive-inside rule (Gunnar von Heijne)
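The positive-inside rule suggests a simple way to orient a protein once its TM segments are known: count positively charged residues in the loops on each side and call the richer side "inside". The sketch below is a toy version. Counting only K and R, using whole loops rather than residues near the membrane, and the function names are all assumptions. The example sequence is made up.

```python
POSITIVE = set("KR")  # assumption: only lysine and arginine are counted


def loops_between(sequence, tm_segments):
    """Return the loops before, between, and after the TM segments.

    tm_segments holds (start, end) pairs, 0-based with the end exclusive.
    """
    loops, previous_end = [], 0
    for start, end in sorted(tm_segments):
        loops.append(sequence[previous_end:start])
        previous_end = end
    loops.append(sequence[previous_end:])
    return loops


def n_terminus_side(sequence, tm_segments):
    """Guess whether the N-terminus is inside (cytoplasmic) or outside."""
    loops = loops_between(sequence, tm_segments)
    charge = [sum(aa in POSITIVE for aa in loop) for loop in loops]
    # Loops alternate sides, so every second loop is on the N-terminal side.
    same_side = sum(charge[0::2])
    other_side = sum(charge[1::2])
    if same_side == other_side:
        return "undecided"
    return "inside" if same_side > other_side else "outside"


toy = "MKRK" + "L" * 20 + "GSDE" + "L" * 20 + "RKKR"
print(n_terminus_side(toy, [(4, 24), (28, 48)]))  # inside
```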

First attempt: TopPred
Identify the hydrophobic regions in PSN1_HUMAN.
Look at a window of 21 aa.
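A sliding-window hydrophobicity scan can be sketched as follows. This is not TopPred's own procedure: the Kyte-Doolittle scale, the plain rectangular window average, and the 1.6 cutoff are stand-ins chosen for illustration, and the test sequence is made up rather than PSN1_HUMAN.

```python
# Kyte-Doolittle hydropathy values (a swapped-in scale, see the note above).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=21):
    """Average hydropathy of each full window; entry k is centred on k + window // 2.

    Assumes the sequence uses only the 20 standard one-letter codes.
    """
    half = window // 2
    scores = []
    for centre in range(half, len(sequence) - half):
        segment = sequence[centre - half:centre + half + 1]
        scores.append(sum(KYTE_DOOLITTLE[aa] for aa in segment) / window)
    return scores


def hydrophobic_centres(sequence, window=21, threshold=1.6):
    """Sequence positions (0-based) whose window average reaches the threshold."""
    half = window // 2
    profile = hydropathy_profile(sequence, window)
    return [k + half for k, score in enumerate(profile) if score >= threshold]


toy = "D" * 10 + "L" * 21 + "D" * 10  # made-up sequence with one hydrophobic stretch
print(hydrophobic_centres(toy))
```

Runs of consecutive centres above the cutoff mark candidate TM regions. Short or borderline runs are where this kind of scan tends to lose or invent a helix.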

TMHMM: Predictor using an HMM
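To show how an HMM can label TM regions and topology at once, here is a toy four-state model decoded with the Viterbi algorithm. It is not TMHMM's architecture or parameters: the states, the three-symbol residue alphabet, and every probability below are invented for illustration.

```python
import math

# Toy model: inside loop, membrane (in to out), outside loop, membrane (out to in).
STATES = ["i", "M_io", "o", "M_oi"]
START = {"i": 0.5, "o": 0.5}
TRANS = {
    "i": {"i": 0.9, "M_io": 0.1},
    "M_io": {"M_io": 0.9, "o": 0.1},
    "o": {"o": 0.9, "M_oi": 0.1},
    "M_oi": {"M_oi": 0.9, "i": 0.1},
}
# Emissions over h (hydrophobic), + (K or R), p (anything else); values are made up.
LOOP_IN = {"h": 0.2, "+": 0.4, "p": 0.4}    # positive-inside rule
LOOP_OUT = {"h": 0.2, "+": 0.1, "p": 0.7}
MEMBRANE = {"h": 0.9, "+": 0.02, "p": 0.08}
EMIT = {"i": LOOP_IN, "o": LOOP_OUT, "M_io": MEMBRANE, "M_oi": MEMBRANE}


def symbol(aa):
    """Reduce a residue to the toy three-letter alphabet."""
    if aa in "AILMFVWC":
        return "h"
    return "+" if aa in "KR" else "p"


def log(p):
    return math.log(p) if p > 0 else float("-inf")


def viterbi(sequence):
    """Return the most probable state path as a string over i, M, o."""
    if not sequence:
        return ""
    obs = [symbol(aa) for aa in sequence]
    score = {s: log(START.get(s, 0)) + log(EMIT[s][obs[0]]) for s in STATES}
    back = []
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: score[r] + log(TRANS[r].get(s, 0)))
            new_score[s] = score[best] + log(TRANS[best].get(s, 0)) + log(EMIT[s][o])
            pointers[s] = best
        back.append(pointers)
        score = new_score
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return "".join("M" if s.startswith("M") else s for s in path)


print(viterbi("KRKR" + "L" * 20 + "SSSS"))
```

Using two membrane states forces loops to alternate between inside and outside, so the decoded path gives the topology as well as the helix positions.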

Prediction quality
Good quality
Generally correct when 3 TM regions
Common problems: losing a TM region; flipping the in-out topology; difficulty discerning signal peptides

Signal peptides
Short (15-30 aa?) peptide addressing a protein to organelles
16% of the human proteome has an SP
Some SPs are cleaved from their host protein
One hydrophobic TM segment, 7-15 aa
Special predictor for SP: SignalP
Common problem for TM predictors

Phobius: including signal peptides (Käll, Krogh, Sonnhammer, 2004)

TM prediction example

Function prediction
Why predict structure? Real goal (?): Function
Problem 1: What is function?
Problem 2: What data do you need? Is protein sequence enough?

What is gene/protein function?
Chemical reactions? Interactions? Pathway activity? Cell localization? Activity details?

Enzyme Commission number
From 1961!
Hierarchical classification of enzymes
Specifies reactions
Example from Wikipedia:
EC 3 enzymes are hydrolases
EC 3.4 are hydrolases that act on peptide bonds
EC 3.4.11 are those hydrolases that cleave off the amino-terminal amino acid from a polypeptide
EC 3.4.11.4 are those that cleave off the amino-terminal end from a tripeptide
Too limited for bioinformatics
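Because the classification is hierarchical, an EC number can be read as a path from class to specific enzyme. The sketch below, with illustrative function names, expands a number into its lineage and measures how many levels two numbers share. The second EC number in the example is chosen only to show a partial match.

```python
def ec_lineage(ec_number):
    """Expand an EC number such as '3.4.11.4' into all of its parent classes."""
    parts = ec_number.split(".")
    if not 1 <= len(parts) <= 4 or not all(parts):
        raise ValueError("expected one to four dot-separated fields")
    return [".".join(parts[:k]) for k in range(1, len(parts) + 1)]


def shared_levels(ec_a, ec_b):
    """Number of leading fields two EC numbers have in common (0 to 4)."""
    shared = 0
    for a, b in zip(ec_a.split("."), ec_b.split(".")):
        if a != b:
            break
        shared += 1
    return shared


print(ec_lineage("3.4.11.4"))                 # ['3', '3.4', '3.4.11', '3.4.11.4']
print(shared_levels("3.4.11.4", "3.4.21.1"))  # 2: both are hydrolases acting on peptide bonds
```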

Gene Ontology
Controlled vocabulary for function annotation
Non-hierarchical: a term can have several parents (a directed acyclic graph rather than a tree)
"is a" and "part of" relationships between terms
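The practical consequence of the graph structure is that a term inherits from every parent, whichever relationship links them. A minimal sketch follows. The tiny ontology is written for illustration and is not copied from the Gene Ontology, and the data layout is an assumption.

```python
# Toy ontology: term -> list of (relationship, parent). Illustrative only.
TOY_ONTOLOGY = {
    "mitochondrial membrane": [("part_of", "mitochondrion"),
                               ("is_a", "organelle membrane")],
    "mitochondrion": [("is_a", "organelle")],
    "organelle membrane": [("is_a", "membrane")],
    "organelle": [],
    "membrane": [],
}


def ancestors(term, ontology):
    """Collect every term reachable by following is_a and part_of links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in ontology.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(ancestors("mitochondrial membrane", TOY_ONTOLOGY)))
```

A gene annotated with a specific term is therefore implicitly annotated with all of that term's ancestors, which matters when counting annotations per term.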

GO example (Baldock and Burger, Genome Biology 2005)

GO applications
Facilitates enrichment studies
"We show that gene duplication and loss is highly constrained by the functional properties and interacting partners of genes. In particular, stress-related genes exhibit many duplications and losses, whereas growth-related genes show selection against such changes." (Wapinski et al, Nature 2007)
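An enrichment study asks whether a GO term shows up in a gene set more often than chance would explain. One common way to score this is a one-sided hypergeometric test, sketched below. The slides do not say which test is meant, so this choice and the parameter names are assumptions, and the numbers in the example are made up.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, annotated_in_sample):
    """P(at least annotated_in_sample annotated genes in a random sample).

    population: genes in the background set
    annotated: background genes carrying the GO term
    sample: genes in the set being studied
    annotated_in_sample: genes in that set carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total


# Made-up numbers: 10 genes, 5 with the term, and a sample of 5 that all carry it.
print(enrichment_p_value(10, 5, 5, 5))  # 1/252
```

In practice many terms are tested at once, so the p-values need a multiple-testing correction before any term is called enriched.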

Predicting function?
Given a gene/protein, can we predict a GO term?
Approach: expert systems. Collect homologs; collect orthologs; domain and motif analysis; study other features; study network connections.
Examples: ProtFun (http://www.cbs.dtu.dk/services/protfun/); FunCoup (http://funcoup.sbc.su.se/)

Predict localization
Modest goal! Is the target... mitochondria? peroxisome? endoplasmic reticulum? Golgi?
Study the signal peptide
Olof Emanuelsson: TargetP (http://www.cbs.dtu.dk/services/targetp/)

Next: Computational genomics
Introduction
Genome sequencing and assembly
EST analysis