Supporting Information

Manning et al. 10.1073/pnas.0801314105

SI Text

RM1 Motif. This Monosiga-specific motif of 22 aa is repeated 8–13 times in the extracellular regions of two RTKAs, one RTKG, and 4 other Monosiga predicted proteins (see.com for sequences and domain analysis). Fourteen of those proteins have N-terminal predicted signal peptides, but none have likely transmembrane regions, and only one has additional known domains (EF hands). While these gene predictions are preliminary, this suggests that some of these proteins might be secreted and possibly interact with the RTKs via homophilic adhesion. No clear examples of the domain repeat are found outside of Monosiga, though there are some scattered, weakly similar sequences, particularly in bacterial surface proteins, and profile-profile analysis with prc shows a weak overlap between RM1 and part of the eukaryotic Recep_L_domain. The logo view of the alignment of all Monosiga RM1 motifs (Fig. S4) shows a partially conserved, repeated LxxL pattern within the motif, which appears to be the main feature shared in these weak hits.

RM2. This 8-aa domain is found in the cytoplasmic tail of four of the nine RTKBs and is repeated six times in RTKB2. It has not been found elsewhere in Monosiga or in any published sequence. There is some substructure within the domain, including four conserved tyrosines followed by acidic residues (Fig. S5). These score highly by Scansite prediction (http://scansite.mit.edu) both as Src phosphorylation sites and as SH2 binding sites. This domain overlaps the MR motif seen in RTKB2, but because of the substructure within the domain, the MR phase differs from that of RM2.

RM1-LRR. The RM1 motif emerged from a MEME search and was found to partially overlap the Pfam LRR (leucine-rich repeat) domain, so it appears to be a Monosiga-specific extension of that domain. RM1-LRR annotations refer to the merged domain.

a. As with RM1, we found a variant LDL receptor type A repeat using SMART and Pfam models and extended it with a Monosiga-specific sequence. Unlike in many other proteins, this domain is found only once per gene, and it is specific both to Monosiga and to RTKs.

-Related Domains. A number of weakly scoring (Hyalin Repeat) domain hits resolved into three major subclasses of this domain (2, 3, 4), with distinct patterns of conservation within the domain but also considerable sequence variation, indels, and partial hits within each domain, so this classification should be used with caution. 2 domains are most common in s, while 3 is found predominantly in SH2 proteins.

1. Obenauer JC, Cantley LC, Yaffe MB (2003) Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res 31:3635–3641.
2. King N, Carroll SB (2001) A receptor tyrosine kinase from choanoflagellates: Molecular insights into early animal evolution. Proc Natl Acad Sci USA 98:15032–15037.

Manning et al. www.pnas.org/cgi/content/short/0801314105 1 of 13
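The two sequence features noted in the SI Text, the repeated LxxL pattern in RM1 and the tyrosines followed by acidic residues in RM2, can be illustrated with a simple pattern scan. The following is a minimal sketch only, not the procedure used for the analysis (which relied on MEME, prc, and Scansite). The function name find_motifs and the toy sequence are hypothetical, and the sequence is not a real Monosiga motif.

```python
import re

# Hypothetical toy sequence for illustration only; NOT a real RM1 or RM2 motif.
TOY_SEQUENCE = "ALSGLTALPDLAYEDGSYDEL"


def find_motifs(pattern, sequence):
    """Return (0-based start, matched text) for every match of `pattern`.

    The pattern is wrapped in a lookahead so that overlapping matches,
    such as consecutive LxxL units sharing a leucine, are all reported.
    """
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# LxxL: leucine, any two residues, leucine.
lxxl_hits = find_motifs("L..L", TOY_SEQUENCE)

# Tyrosine immediately followed by an acidic residue (Asp or Glu).
y_acidic_hits = find_motifs("Y[DE]", TOY_SEQUENCE)

print("LxxL hits:", lxxl_hits)
print("Y + acidic hits:", y_acidic_hits)
```

A scan of this kind only flags candidate positions. It does not score them the way a profile or a Scansite matrix would, so any hits would still need to be checked against the alignment logos.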

Fig S1. Domain architecture of all Monosiga TKs.

Fig S1. Continued. Panels: Receptor Tyrosine Kinases (families RTKC, RTKD, RTKE, RTKF, FGTK, and LRTK); labeled domains include RM1-LRR, PA26, FN3, and ANF receptor.

Fig S1. Continued. Panels: Receptor Tyrosine Kinases (families RTKG, RTKH, RTKJ, RTKK, RTKL, and RTKM) and Unclassified Tyrosine Kinases (UTK family); labeled domains include RM1, RM1-LRR, receptor L, FN3, SH2, ANF receptor, SAM, pbh1, and MFS.

Fig S2. Domain architecture for all Monosiga PTP, SH2 and PTB domain containing proteins. SH2 domains in s and PTPs are listed under those headings.

Fig S2. Continued.

Fig S3. HMM logo comparison of Monosiga TKs with those of human, Drosophila, and C. elegans.

Fig S4. Logo view of RM1 motif.

Fig S5. Logo view of RM2 motif.

Table S1. Accessory domains and motifs in Monosiga TKs

Name | No. genes (families) | Copies/gene | Related to/description | Human TKs with domain

Extracellular motifs and domains
RM1 | 3 (RTKA, G) | 8–13 | Unique to choanoflagellates | -
 | 11 (RTKC, E) | 3 | Family of domains, related to Ig, FN3 | (Ig): FGFR, Trk, VR, Tie, Axl, PDGFR, CCK4
L | 25 | 1 | Similar to part of LDL receptor A motif |
Recep_L_domain | 5 (RTKA, G, J, UTK) | 1–2 | Fragment of domain found in, Insulin receptors | R, InsR
 | 9 (FGTK) | 3 2 | Alpha-Integrin repeat motif | -
/CA- | 1 (RTKB-D, H-J) | 1–9 | Epidermal Growth Factor repeats | Tie, Eph, ALK
LRR | 11 (FGTK, LRTK, RTKE, L) | 1–4 | Leucine Rich Repeat | Trk
 | 21 (RTKB-E, J, M) | | Rich in C and CxxC. Weakly similar to TNFR, furin, GCC2 repeats | (Furin) R, InsR
ANF_receptor | 2 (UTK, RTKC) | 1 | Ligand binding domain of RGCs, which contain an inactive domain | RGC
FN3 | 5 (RTKC, RTKH, UTK) | 1–2 | Fibronectin Type 3 domain | Axl, Eph, InsR, Sev, Tie

Intracellular motifs and domains
SH2 | 14 (SFK, FVTK, 1 CTKA, 3 UTK) | 1 | pTyr binding | Src, Tec, Abl, Csk, Fer, Syk
SH3 | 8 (SFK) | 1 | Binds PxxP motifs | Src, Tec, Abl, Csk
PTB | 9 (HMTK) | 1–4 | Peptide and pTyr binding | -
FYVE | 2 (FVTK) | 1 | Zinc finger implicated in lipid binding | -
CAP GLY | 9 (RTKC) | 1 | Cytoskeleton-associated (19 copies in genome, including one PTP) | -
PH | 2 (CTKA, Tec) | 1 | Binds to lipids and signaling proteins | Tec
CH | 1 (CTKB) | 1 | Calponin Homology. Actin-binding and signaling roles, also seen in many SH2-containing proteins | -
C2 | 1 (Src) | 1 | Ca-dependent lipid association, maybe a substitution for missing myristoylation site | -
SAM | 1 (UTK) | 1 | Sterile Alpha Motif, also seen in many SH2-containing adaptors | ACK
RM2 (MR (3)) | 4 (RTKB) | 1–6 | Novel motif, C-terminal of domain. 3 conserved tyrosine residues include conserved Src-like phosphorylation/SH2 binding motif. | -

Other Supporting Information Files
Dataset S1 (PDF)
Dataset S2 (XLS)