Bioinformatics Practical for Biochemists Andrei Lupas, Birte Höcker, Steffen Schmidt WS 2013/14 03. Sequence Features
Targeting proteins
  - A signal peptide targets proteins to the secretory pathway.
  - The N-terminal sequence is recognized while the peptide is still being synthesized on the ribosome.
  Günter Blobel, 1999, nobelprize.org
Signal Peptide Prediction Sequence Logo of eukaryotic signal peptides Nielsen et al. (2007)
Signal Peptide Prediction - SignalP http://www.cbs.dtu.dk/services/signalp/ http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=53329bb900004a2504297a29&wait=20
Transmembrane Helices
  - unusually long stretch of hydrophobic residues (>18 hydrophobic amino acids)
  - hydrophobic interaction with the lipids in the membrane
  - orientation of the helix / topology of the protein: look at the loops, where R & K are mainly found on the cytoplasmic side ("positive-inside rule")
  - example (PDB-id: 1c3w): TGRPEWIWLALGTALMGLGTLYFLVKGMGVSDPDAKKFYAITTLVPAIAFTMYLSMLLGYGL
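The "long hydrophobic stretch" criterion can be tried out with a sliding-window hydropathy scan. This is a minimal sketch, not how TMHMM works: the Kyte-Doolittle scale, the 19-residue window, the 1.6 cutoff and the function names are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed scale; other scales exist).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_stretches(seq, window=19, threshold=1.6):
    """Return merged (start, end) spans, 0-based and half-open, in which
    the window-averaged hydropathy reaches the threshold."""
    spans = []
    for i in range(len(seq) - window + 1):
        avg = sum(KD[aa] for aa in seq[i:i + window]) / window
        if avg >= threshold:
            if spans and i <= spans[-1][1]:
                # Window overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans


def count_positive(segment):
    """Count Lys and Arg, the residues used by the positive-inside rule."""
    return sum(segment.count(aa) for aa in "KR")


if __name__ == "__main__":
    # Fragment shown on the slide (PDB-id: 1c3w).
    seq = "TGRPEWIWLALGTALMGLGTLYFLVKGMGVSDPDAKKFYAITTLVPAIAFTMYLSMLLGYGL"
    spans = hydrophobic_stretches(seq)
    print("candidate TM spans:", spans)
    # Compare the K/R content of the segments flanking the first span.
    if spans:
        start, end = spans[0]
        print("K+R before:", count_positive(seq[:start]))
        print("K+R after: ", count_positive(seq[end:]))
```

Comparing the K+R counts of the loops on either side of a candidate helix gives a rough guess at which side faces the cytoplasm. A real topology predictor combines this signal over all loops.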
Transmembrane Helices TMHMM http://www.cbs.dtu.dk/services/tmhmm/
  - accuracy of predicting TM helices: high (> 90%)
  - accuracy of predicting the topology: > 75%
  Sonnhammer et al. (1998)
Transmembrane Helices TMHMM http://www.cbs.dtu.dk/services/tmhmm/ http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi?jobid=5332a5e70000681f58ebac3b&wait=20
Secondary Structure
amino acid preferences

        α-helix   β-strand   β-turn
  Glu   1.59      0.52       1.01
  Ala   1.41      0.72       0.92
  Leu   1.34      1.22       0.57
  Met   1.30      1.14       0.52
  Gln   1.27      0.98       0.84
  Lys   1.23      0.69       1.07
  Arg   1.21      0.84       0.90
  His   1.05      0.80       0.81
  Val   0.90      1.87       0.41
  Ile   1.09      1.67       0.47
  Tyr   0.74      1.45       0.76
  Cys   0.66      1.40       0.54
  Trp   1.02      1.35       0.65
  Phe   1.16      1.33       0.59
  Thr   0.76      1.17       0.90
  Gly   0.43      0.58       1.77
  Asn   0.76      0.48       1.34
  Pro   0.34      0.31       1.32
  Ser   0.57      0.96       1.22
  Asp   0.99      0.39       1.24

William (1987) Biochim Biophys Acta
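To see how such preferences could be turned into a crude per-residue guess, the values can be averaged over a short window and the state with the highest mean chosen. This is only a sketch in the spirit of propensity-based methods. The 5-residue window and the function name are assumptions, and modern predictors such as PSIPRED work very differently.

```python
# Preference values from the table above, as (helix, strand, turn),
# keyed by one-letter amino acid code.
PROPENSITY = {
    "E": (1.59, 0.52, 1.01), "A": (1.41, 0.72, 0.92),
    "L": (1.34, 1.22, 0.57), "M": (1.30, 1.14, 0.52),
    "Q": (1.27, 0.98, 0.84), "K": (1.23, 0.69, 1.07),
    "R": (1.21, 0.84, 0.90), "H": (1.05, 0.80, 0.81),
    "V": (0.90, 1.87, 0.41), "I": (1.09, 1.67, 0.47),
    "Y": (0.74, 1.45, 0.76), "C": (0.66, 1.40, 0.54),
    "W": (1.02, 1.35, 0.65), "F": (1.16, 1.33, 0.59),
    "T": (0.76, 1.17, 0.90), "G": (0.43, 0.58, 1.77),
    "N": (0.76, 0.48, 1.34), "P": (0.34, 0.31, 1.32),
    "S": (0.57, 0.96, 1.22), "D": (0.99, 0.39, 1.24),
}
STATES = ("helix", "strand", "turn")


def window_preference(seq, window=5):
    """For each residue, average the three preferences over a centred
    window (truncated at the termini) and return the preferred state."""
    half = window // 2
    calls = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half):i + half + 1]
        means = [
            sum(PROPENSITY[aa][k] for aa in segment) / len(segment)
            for k in range(3)
        ]
        calls.append(STATES[means.index(max(means))])
    return calls


if __name__ == "__main__":
    for residue, call in zip("AEELLKKVIVYVNGSPG",
                             window_preference("AEELLKKVIVYVNGSPG")):
        print(residue, call)
```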
Secondary Structure buried β-sheet PDB: 1kgs, Buckler et al. (2002), Structure
Secondary Structure amphiphilic partially buried α-helix PDB: 1kgs, Buckler et al. (2002), Structure
Secondary Structure amphiphilic β-strand PDB: 1jat, VanDemark et al. (2001), Cell
Secondary Structure collagen
Secondary Structure Prediction Toolkit Quick2D
Secondary Structure Prediction Toolkit Ali2D
Disordered Regions
  Today, programs predict that about 40% of all human proteins contain at least one intrinsically disordered segment of 30 amino acids or more, and that some 25% are likely to be disordered from beginning to end.
  - lack of hydrophobic residues
  - often with overrepresentation of a few amino acids
  http://www.nature.com/news/2011/110309/full/471151a/box/1.html
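The two sequence properties named above (few hydrophobic residues, a composition dominated by a few amino acids) can be measured directly. The sketch below is not a disorder predictor like IUPRED. The hydrophobic residue set and the function names are assumptions, and Shannon entropy is used here simply as one possible measure of compositional bias.

```python
import math
from collections import Counter

# Assumed definition of "hydrophobic"; other groupings are equally valid.
HYDROPHOBIC = set("AVILMFWC")


def hydrophobic_fraction(seq):
    """Fraction of residues that belong to the assumed hydrophobic set."""
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq)


def shannon_entropy(seq):
    """Compositional entropy in bits; low values mean that a few amino
    acids dominate the segment."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


if __name__ == "__main__":
    # Acidic stretch from the nucleoplasmin tail shown later in this deck.
    acidic = "EEEDEGEAEGEEEEEEEEDQE"
    # Hydrophobic helix fragment from the 1c3w example.
    membrane = "WIWLALGTALMGLGTLYFLVK"
    for name, segment in (("acidic tail", acidic), ("TM fragment", membrane)):
        print(name,
              "hydrophobic fraction = %.2f" % hydrophobic_fraction(segment),
              "entropy = %.2f bits" % shannon_entropy(segment))
```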
Secondary Structure Prediction Toolkit - Ali2D
Disorder Prediction IUPRED - http://iupred.enzim.hu/
Short Linear Motifs
Eukaryotic Linear Motifs (ELM) / Short Linear Motifs (SLiM)
  Hunt T (1990): These motifs are linear, in the sense that three-dimensional organization is not required to bring distant segments of the molecule together to make the recognizable unit. The conservation of these motifs varies: some are highly conserved while others allow substitutions that retain only a certain pattern of charge across the motif.
Short Linear Motifs
Characteristics
  - 3-11 amino acids long
  - poorly conserved / evolve fast
  - 1-3 amino acids in the motifs are hot spots
  - ~80% in disordered regions
  - relatively low affinity to the interacting partner (1-150 µM)
  - interaction via induced fit
Short Linear Motifs
Function
  - protein-protein interactions
  - post-translational modifications, e.g. phosphorylation
  - proteolytic cleavage/processing sites
  - KEN / D box in cell cycle: degradation signals
  - subcellular targeting sites, e.g. NES (nuclear export signal)
  - modulation of interactions: fine tuning
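Because these motifs are short and linear, they are commonly written as regular expressions and searched for directly in the sequence. The sketch below uses deliberately simplified patterns (KEN for the KEN box, RxxL for the D box). They are assumptions for illustration and not the curated expressions of the ELM resource, and the function name and test sequence are made up.

```python
import re

# Simplified, illustrative patterns (assumptions, not the ELM definitions).
MOTIFS = {
    "KEN_box": r"KEN",
    "D_box": r"R..L",
}


def find_motifs(seq, motifs=MOTIFS):
    """Return (name, 1-based start, matched text) for every occurrence,
    including overlapping ones, sorted by position."""
    hits = []
    for name, pattern in motifs.items():
        # A lookahead with a capture group reports overlapping matches.
        for m in re.finditer("(?=(%s))" % pattern, seq):
            hits.append((name, m.start() + 1, m.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical sequence, made up for the example.
    for hit in find_motifs("AAKENAARTTLA"):
        print(hit)
```

Patterns this short match by chance very often, so a hit is only a candidate. Checking whether it lies in a predicted disordered region, as most known motifs do, is one way to filter the hits.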
Short Linear Motifs
Nuclear Localization Signal (NLS)
  Importin-beta (1qgk; blue) recognizes nuclear pores and moves through them. It wraps around the end of importin-alpha (1ee5; green), an adaptor molecule that connects importin-beta with the cargo, here nucleoplasmin (1k5j; yellow), a chaperone important in nucleosome assembly. All interactions are mediated by linear motifs in unstructured segments (bipartite nuclear localization signals).
  David Goodsell, http://www.rcsb.org/pdb/101/motm.do?momid=85
ELM Resources elm.eu.org NUPL_XENLA
NLS in nucleoplasmin
Quick2D secondary structure prediction for nucleoplasmin, showing the unstructured C-terminal tail and the bipartite nuclear localization motif.
  Sequence (two blocks, as displayed):
    MASTVSNTSKLEKPVSLIWGCELNEQNKTFEFKVEDDEEKCEHQLALRTVCLGDKAKDEFHIVEIVTQEEGAEKSVPIATLKPSILPMATMVGIELTPPVTFRLKAGSG
    PLYISGQHVAMEEDYSWAEEEDEGEAEGEEEEEEEEDQESPPKAVKRPAATKKAGQAKKKKLDKEDESSEEDSPTKKGKGAGRGRKPAAKK
  [Figure: per-residue prediction tracks SS PSIPRED, SS JNET, DO DISOPRED2, DO IUPRED, SO Prof (Rost), SO JNET. In the second block, DISOPRED2 and IUPRED mark most residues as disordered.]
  Legend:
    SS = secondary structure (H = alpha-helix, E = beta-sheet)
    DO = disorder (D)
    SO = solvent accessibility (B = buried; a buried residue has at most 25% of its surface exposed to the solvent)
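The bipartite motif can be located in the C-terminal tail with a simple pattern: two basic residues, a spacer, then a cluster of basic residues. The regular expression below is a simplified assumption (spacer of 10-12 residues, at least three K/R in the second cluster), not the curated ELM definition, and the names are made up for the sketch.

```python
import re

# Second sequence block of nucleoplasmin as shown on the slide.
TAIL = ("PLYISGQHVAMEEDYSWAEEEDEGEAEGEEEEEEEEDQESPPKAVKRPAATKKAGQAKKKK"
        "LDKEDESSEEDSPTKKGKGAGRGRKPAAKK")

# Assumed, simplified bipartite NLS pattern:
# two basic residues, 10-12 residue spacer, three basic residues.
BIPARTITE_NLS = re.compile(r"[KR]{2}.{10,12}[KR]{3}")


def find_bipartite_nls(seq):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in BIPARTITE_NLS.finditer(seq)]


if __name__ == "__main__":
    for start, match in find_bipartite_nls(TAIL):
        print(start, match)
```

On this tail the first hit is the segment KRPAATKKAGQAKKKK, that is, KR, a spacer and the lysine cluster.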
DDX6 & Scd6 / EDC3 Interaction
  [Figure: domain organization]
  - EDC3: LSM, FDF, FDK, YjeF
  - DDX6/Me31B: DEAD-box helicase
  - Scd6/Tral: LSM, FDF
DDX6 & Scd6 / EDC3 Interaction PDB: 2wax, Tritschler et al. (2009), Mol Cell