Protein Interaction Mapping: Use of Osprey to map Survival of Motor Neuron Protein interactions

Presented by: Meg Barnhart, Computational Biosciences, Arizona State University

The Spinal Muscular Atrophy project
- Support provided by Dr. Ron Nieman, director of the ASU NMR facility, through Families of SMA and NIH research grants
- Osprey and the GRID databases were created in the Mike Tyers lab at the Samuel Lunenfeld Research Institute at Mt. Sinai Hospital in Ontario, Canada

Why study the SMN protein?
- SMA is the most common genetic cause of infantile death
- SMA results in the specific loss of motor neurons and the atrophy of voluntary muscle groups (legs, arms)
- Broad clinical spectrum: Type I (severe), Type II (intermediate), and Type III (adult)

[Figure: map of the duplicated 500 kb SMN region (markers NAIP, AG-1, C212) with exons 1, 2a, 2b, 3 to 8. SMN2: low full-length levels; about 80% of the product is alternatively spliced and lacks exon 7; no protection from SMA. The other gene copy: ~100% full-length SMN; protects from SMA.]

SMA research efforts
- SMN gene discovered in 1995
- In recent years, at least 4 putative or proven functions have been described for the SMN protein
- More than 40 primary interactions of SMN with other proteins

SMN functions
- Pre-mRNA splicing
- Transcriptional regulation
- Ribosome production
- Axon-specific RNP transport

Pre-mRNA splicing
- First established function of SMN in cells: master assembler (1997)
- Adapted from Paushkin, Gubitz et al. 2002

Proposed axon-specific function
- SMN observed moving bi-directionally along axons (Zhang, Pan et al. 2003)

Network visualization of protein interactions: Osprey
- Stand-alone application (Red Hat, OS X, Windows) or online viewer for GRID databases

The GRID: General Repository for Interaction Datasets
- Files may be interactively imported from any of the GRIDs while running Osprey

GRID datasets

Input files
Osprey accepts several types of tab-delimited text files:

File variation #1

GeneA     GeneB
FBP
Htra 2B   hnrnpg
Htra 2B   RBM
hnrnpr
hnrnpq
Gemin5

File variation #2

GeneA     GeneB     GeneA screen name   GeneB screen name
FBP                 A                   B
Htra 2B   hnrnpg    C                   D
Htra 2B   RBM       C                   E
hnrnpr              F                   A
hnrnpq              D                   A
Gemin5              G                   A

Input files
File variation #3

GeneA     GeneB     Experimental System   Source           PubMedID
FBP                 CoIP HEK293           Williams et al   10734235
Htra 2B   hnrnpg    CoIP HEK293           Hofmann et al    12165565
Htra 2B   RBM       CoIP HEK293           Hofmann et al    12165565
HnRNPR              CoIP HEK293           Rossoll et al    11773003
HnRNPQ              CoIP HEK293           Rossoll et al    11773003
Gemin5              CoIP HEK293           Gubitz et al     11714716

Input files
File variation #4

GeneA     GeneB     GeneA screen name   GeneB screen name   Experimental System   Source           PubMedID
FBP                 A                   B                   CoIP HEK293           Williams et al   10734235
Htra 2B   hnrnpg    C                   D                   CoIP HEK293           Hofmann et al    12165565
Htra 2B   RBM       C                   E                   CoIP HEK293           Hofmann et al    12165565
hnrnpr              F                   A                   CoIP HEK293           Rossoll et al    11773003
hnrnpq              D                   A                   CoIP HEK293           Rossoll et al    11773003
Gemin5              G                   A                   CoIP HEK293           Gubitz et al     11714716
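As a rough illustration of the tab-delimited layout described above, the following Python sketch reads a file in the style of variation #3. It is not Osprey's own parser. The function name `read_interactions` is hypothetical, and the sample holds only the two complete rows from the table.

```python
import csv
import io

# Two rows copied from file variation #3; fields are separated by tabs.
SAMPLE = (
    "GeneA\tGeneB\tExperimental System\tSource\tPubMedID\n"
    "Htra 2B\thnrnpg\tCoIP HEK293\tHofmann et al\t12165565\n"
    "Htra 2B\tRBM\tCoIP HEK293\tHofmann et al\t12165565\n"
)


def read_interactions(handle):
    """Return one dict per interaction row, keyed by the header names."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [dict(row) for row in reader]


# io.StringIO stands in for an open file so the sketch is self-contained.
interactions = read_interactions(io.StringIO(SAMPLE))
for row in interactions:
    print(row["GeneA"], "--", row["GeneB"], "|", row["Experimental System"])
```

Because the header row names the columns, the same function would also read variations #1, #2, and #4, returning whichever fields the file provides.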

SMN input files
- File variation #3 was most useful for SMN data
- A ProCite database was used to download papers from NCBI containing the terms SMN, SMA, survival motor neuron protein, and spinal muscular atrophy
- Abstracts were manually reviewed for interaction information
- Selected papers were read and interactions were recorded in text files

GeneA     GeneB     Experimental System   Source           PubMedID
FBP                 CoIP HEK293           Williams et al   10734235
Htra 2B   hnrnpg    CoIP HEK293           Hofmann et al    12165565
Htra 2B   RBM       CoIP HEK293           Hofmann et al    12165565

SMN input files
- Disadvantage: manual data mining and entry is a huge bottleneck!
- The SMN-Osprey network was generated by creating several files, each defined by a single type of experimental binding system
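Splitting one master table into one file per experimental system could be scripted along these lines. This is a minimal sketch, not the procedure actually used for the SMN network. The helper names are hypothetical, and the `Two-hybrid` row is a placeholder added only so that a second system appears.

```python
from collections import defaultdict

HEADER = ["GeneA", "GeneB", "Experimental System", "Source", "PubMedID"]

ROWS = [
    # Rows taken from the SMN input table.
    ["Htra 2B", "hnrnpg", "CoIP HEK293", "Hofmann et al", "12165565"],
    ["Htra 2B", "RBM", "CoIP HEK293", "Hofmann et al", "12165565"],
    # Hypothetical placeholder row, not real data.
    ["GeneX", "GeneY", "Two-hybrid", "placeholder", "0"],
]


def split_by_system(rows):
    """Group rows by the value in the Experimental System column."""
    column = HEADER.index("Experimental System")
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


def to_tab_text(rows):
    """Render a header plus rows as tab-delimited text."""
    lines = ["\t".join(HEADER)] + ["\t".join(row) for row in rows]
    return "\n".join(lines) + "\n"


# One tab-delimited text per experimental system, held in memory here;
# each value could be written to its own file for import.
files = {system: to_tab_text(rows) for system, rows in split_by_system(ROWS).items()}
for system, text in files.items():
    print(f"--- {system} ---")
    print(text, end="")
```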

User interface
- PubMed link

Superimposition of files
- Osprey can superimpose two or more user files differentiated by experimental system
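Conceptually, superimposing files amounts to merging their edge lists while remembering which experimental system supports each edge. The sketch below shows one way to do that in Python. It makes no claim about how Osprey does it internally. The function name `superimpose` and the `Two-hybrid` edge are hypothetical.

```python
def superimpose(*edge_lists):
    """Merge edge lists into {unordered gene pair: set of experimental systems}."""
    merged = {}
    for edges in edge_lists:
        for gene_a, gene_b, system in edges:
            # frozenset makes A-B and B-A the same edge.
            pair = frozenset((gene_a, gene_b))
            merged.setdefault(pair, set()).add(system)
    return merged


# Edges from the CoIP rows of the input tables.
coip_file = [
    ("Htra 2B", "hnrnpg", "CoIP HEK293"),
    ("Htra 2B", "RBM", "CoIP HEK293"),
]
# Hypothetical second file, included only to show an overlapping edge.
second_file = [
    ("hnrnpg", "Htra 2B", "Two-hybrid"),
]

network = superimpose(coip_file, second_file)
for pair, systems in network.items():
    print(sorted(pair), sorted(systems))
```

An edge that appears in both files ends up with two systems attached, which is the information needed to draw it in more than one edge color.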

SMN:GRID superimposition
- User files and GRID datasets may also be superimposed

Color indices
- Node colors may be user defined or defined by GO components
- The GO consortium (http://www.geneontology.org/) categorizes gene products in terms of their associated biological processes, cellular components, and molecular functions in a species-independent manner

Color indices, continued
- Edge colors can be user defined, defined by experimental system, or defined by source (author)
- Figures: edges colored by source (author); the same edges colored by experimental system

Network and connectivity filters
- The network may be filtered by experimental system, source, or GO process
- These filters are applicable only if the edges are colored by experimental system or source and the nodes are colored by GO process

Connectivity filters
- Connectivity filters allow the user to define a minimum number of connections a node must have in order to be displayed
- An iterative option is also available
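The effect of a connectivity filter, with and without iteration, can be sketched as follows. This is one plausible reading of the filter, not Osprey's documented algorithm. It assumes that "iterative" means connections are recounted and the filter reapplied until no further nodes drop out. The node names P1 to P5 are hypothetical.

```python
def connectivity_filter(edges, min_connections, iterative=False):
    """Return (nodes, edges) kept when each node needs at least min_connections."""
    # Ignore self-loops and treat A-B and B-A as the same edge.
    edges = {frozenset(pair) for pair in edges if pair[0] != pair[1]}
    nodes = {node for edge in edges for node in edge}
    while True:
        degree = {node: 0 for node in nodes}
        for edge in edges:
            for node in edge:
                degree[node] += 1
        kept_nodes = {n for n in nodes if degree[n] >= min_connections}
        # An edge survives only if both of its endpoints survive.
        kept_edges = {e for e in edges if e <= kept_nodes}
        if not iterative or kept_nodes == nodes:
            return kept_nodes, kept_edges
        nodes, edges = kept_nodes, kept_edges


# Hypothetical toy network: P1 is a hub, P5 hangs off P4.
TOY_EDGES = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]

single_pass_nodes, single_pass_edges = connectivity_filter(TOY_EDGES, 2)
iterated_nodes, iterated_edges = connectivity_filter(TOY_EDGES, 2, iterative=True)
print(sorted(single_pass_nodes))
print(sorted(iterated_nodes))
```

In the toy network a single pass removes only P5. The iterative pass then also removes P4, because it is left with a single connection once P5 is gone.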

Network layouts: concentric circle

Network layouts: spoked dual ring

SMN: spoked dual ring layout

SMN interaction layouts

SMN interaction layouts - GO

Node report
Table view (columns: YORF, Gene, Description, GO Component, GO Function, GO Process, GO Special)

YORF: LOC10248
Gene: RPP20 POP7 RPP2
Description: Homo sapiens, POP7 (processing of precursor, S. cerevisiae) homolog, clone MGC:1986 IMAGE:3138336, mRNA, complete cds.
GO Component: nucleolar ribonuclease P complex; nucleus
GO Function: ribonuclease P activity; hydrolase activity
GO Process: tRNA processing; RNA processing; metabolism

YORF: LOC79760
Gene: GEMIN7 FLJ13956
Description: Homo sapiens, hypothetical protein FLJ13956, clone MGC:14121 IMAGE:4053402, mRNA, complete cds.
GO Component: spliceosome complex
GO Function: NONE
GO Process: nuclear mRNA splicing, via spliceosome; RNA processing; metabolism
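The GO Process column of a node report lends itself to simple functional grouping. The Python sketch below inverts the two records from the node report into a process-to-genes map and lists the processes shared by more than one gene. It is illustrative only. The function name `group_by_process` is hypothetical, and only the primary gene names RPP20 and GEMIN7 are used.

```python
from collections import defaultdict

# GO Process terms copied from the two node report records.
NODE_REPORT = {
    "RPP20": ["tRNA processing", "RNA processing", "metabolism"],
    "GEMIN7": [
        "nuclear mRNA splicing, via spliceosome",
        "RNA processing",
        "metabolism",
    ],
}


def group_by_process(report):
    """Invert {gene: [processes]} into {process: set of genes}."""
    groups = defaultdict(set)
    for gene, processes in report.items():
        for process in processes:
            groups[process].add(gene)
    return dict(groups)


groups = group_by_process(NODE_REPORT)
# Processes annotated to more than one gene suggest a functional cluster.
shared = sorted(term for term, genes in groups.items() if len(genes) > 1)
print(shared)
```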

Manual functional clustering

Back to cartooning: too much information, too little information, or both?
- Cartoon images are widely used to visualize and document protein interactions
- Figures are often accepted as research results
- Too little information: interactions are viewed only in an isolated context, and the picture is limited to information deemed relevant by a particular research group
- Too much information: sequence of binding, oligomerization states, binding orientation, etc. are implied but not proven

Osprey advantages
- Protein-protein interactions are shown without implying binding order or binding sites
- Flexible interface allows rapid manipulation of the network, allowing immediate and creative formation of hypotheses regarding interactions

Osprey disadvantages
- Manual extraction of data
- No automated deposition tools for interaction information
- Multiple interaction databases; the main ones are Osprey (GRID) and BIND
- Lack of standards for interaction quality
- Inability to record temporal or PTM dependence of interactions

Obstacles or opportunities?
- As shown with SMN, it is rapidly becoming necessary to be able to view protein interactions on the proteome scale, with a flexible and non-misleading graphic interface
- In this new field, there are several exciting potential projects for students of the computational biosciences
- A few ideas: a GUI for normalized data entry into standardized protein interaction DBs, decision-tree or neural-network (NN) data mining of PubMed for relevant interaction papers, n-dimensional interaction analysis