Expanded View Figures

Molecular Systems Biology. Evolutionary divergence of mouse APA. Mei-Sheng Xiao et al. Molecular Systems Biology 12: © 2016 The Authors

[Figure EV1, panels A–F. Panels A, B: nucleotide percentage (%) of A, T, C and G by position relative to the cleavage site. Panel C: frequency of PAS motif (%) for filtered_pAs, raw_pAs and pseudo_pAs. Panel D: alignment window around the cleavage site and Venn diagram of BL_pAs, BL_SP_pAs and SP_pAs. Panels E, F: cumulative distributions of log2 (read counts) for BL_pAs, BL_SP_pAs and SP_pAs.]

Figure EV1. Features of identified pAs clusters.
A, B The nucleotide composition around the cleavage sites identified in C57BL/6J and SPRET/EiJ, respectively.
C The frequency of 13 known PAS motifs within 100 nt upstream of pAs identified in SPRET/EiJ. pseudo_pAs represents pAs identified by using reads without nongenomic T (Materials and Methods). raw_pAs and filtered_pAs represent the pAs determined by the PASS reads with and without further filtering, respectively (Materials and Methods). The x-axis shows the different types of PAS motifs, and the y-axis shows the percentage of pAs with the specific motif in the upstream region.
D Up: We retained only the pAs clusters of which the region (−124 nt to 24 nt) flanking their cleavage sites could be reciprocally aligned between the genomes of C57BL/6J and SPRET/EiJ using LiftOver. Down: the Venn diagram shows the number of pAs that were identified in both strains (shared) or only in one strain.
E, F The cumulative distribution function (CDF) of the number of 3′ mRNA-seq reads mapped to shared, C57BL/6J-specific, and SPRET/EiJ-specific pAs from the C57BL/6J- and SPRET/EiJ-derived sample, respectively.
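The motif-frequency measure in panel C can be illustrated with a minimal sketch. It assumes DNA-alphabet sequences covering the region upstream of each cleavage site and uses only two well-known PAS hexamers (AATAAA and ATTAAA) as stand-ins, because the caption does not list the 13 motifs. The function name `motif_frequency` and the toy sequences are hypothetical.

```python
def motif_frequency(upstream_seqs, motifs):
    """Return {motif: percentage of sequences containing it at least once}."""
    total = len(upstream_seqs)
    result = {}
    for motif in motifs:
        hits = sum(1 for seq in upstream_seqs if motif in seq.upper())
        result[motif] = 100.0 * hits / total if total else 0.0
    return result


# Hypothetical upstream regions, shortened for readability
# (the caption uses the 100 nt upstream of each pAs).
toy_upstream = [
    "GGCTAATAAAGTCTG",
    "CCATTAAAGGTTCAG",
    "GGGCCCGGGCCCGGG",
    "TTAATAAACCATTAAAG",
]

# Illustrative subset only; not the 13 motifs used in the figure.
freqs = motif_frequency(toy_upstream, ["AATAAA", "ATTAAA"])
print(freqs)
```

Each sequence is counted once per motif regardless of how many times the motif occurs, which matches a "percentage of pAs with the specific motif" reading of the y-axis.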

[Figure EV2, panels A–F. Panels A, B: Rep2 vs. Rep1 (log2 read counts), r = 0.98 in both panels. Panels C, D: 3′ mRNA-seq (log2 read counts) vs. RNA-seq (log2 RPKM), r = 0.72 and r = 0.76. Panels E, F: same axes, r = 0.79 and r = 0.82.]

Figure EV2. Quality control of 3′READS and 3′ mRNA-seq.
A, B Reproducibility of 3′ mRNA-seq results between two replicates in C57BL/6J and SPRET/EiJ, respectively. The scatterplots compare the number of reads mapped to each pAs between the two replicates.
C, D The correlation of gene expression level estimated by RNA-Seq (x-axis) and 3′ mRNA-seq (y-axis) in C57BL/6J and SPRET/EiJ samples, respectively.
E, F The correlation of gene expression level estimated by RNA-Seq (x-axis) and 3′ mRNA-seq (y-axis) in the two mouse strains, but only for genes with a single pAs isoform.
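The r values reported in these scatterplots are correlation coefficients on log2-transformed values. A minimal sketch of that computation follows, assuming Pearson correlation and a pseudocount of 1 before taking log2. Both choices are assumptions, since the caption does not state them, and the read counts are made up.

```python
import math


def log2_counts(counts, pseudocount=1):
    """log2-transform read counts; the pseudocount is an assumption."""
    return [math.log2(c + pseudocount) for c in counts]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical read counts per pAs in two replicates
rep1 = [1, 3, 7, 15]
rep2 = [3, 7, 15, 31]
r = pearson_r(log2_counts(rep1), log2_counts(rep2))
print(round(r, 3))
```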

[Figure EV3. Scatterplot of Rep2 vs. Rep1 (allelic pAs usage difference), r = 0.90.]

Figure EV3. Scatterplot comparing the allelic difference of pAs usages measured in the two independent replicate experiments.

[Figure EV4, panels A–C. Each panel shows the CDF of the difference of MFE (BL − SP) for BL>SP_pAs (940), SP>BL_pAs (936) and control_pAs (2502). P values for BL>SP_pAs and SP>BL_pAs, in extraction order: 0.21 and 0.73; 0.01 and 0.08; 0.72 and 0.38.]

Figure EV4. Predicted RNA secondary structure in the flanking region of pAs.
The cumulative distribution function (CDF) of allelic difference in MFE of the distal upstream region (A: AUE), proximal downstream region (B: CDE), and distal downstream region (C: ADE) from pAs with usage biased toward C57BL/6J (BL, blue line), SPRET/EiJ (SP, red line), or without biases (control, gray). The statistical significance of the difference between each biased group and the control group was determined by the Kolmogorov–Smirnov test.
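The comparison between a biased group and the control group rests on the two-sample Kolmogorov–Smirnov statistic, the largest vertical gap between two empirical CDFs. The sketch below computes only that statistic with the standard library. It does not compute a P value (a library routine such as SciPy's `scipy.stats.ks_2samp` would be one option for that), and the MFE differences are made-up numbers.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a sample as a function of x."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect_right(data, x) / n


def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    f_a, f_b = ecdf(a), ecdf(b)
    # Both CDFs are step functions that change only at observed values,
    # so the maximum gap is attained at one of those values.
    return max(abs(f_a(x) - f_b(x)) for x in set(a) | set(b))


# Hypothetical allelic MFE differences for a biased group and a control group
biased = [1, 2, 3, 4]
control = [3, 4, 5, 6]
d_stat = ks_statistic(biased, control)
print(d_stat)
```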

[Figure EV5. PARS score (y-axis) by base (x-axis) around TSS (15939), start codon (5934), stop codon (10628) and pAs (27514); a further panel compares all pAs (27514) with distal pAs (1047).]

Figure EV5. RNA secondary structure in TSS, start codon, stop codon and pAs.
PARS scores (y-axis) were plotted on each base (x-axis) within the pAs flanking region. Black and red lines represent the PARS scores calculated based on all pAs and only the most distal pAs, respectively. The gray bar marks the upstream region (−30 nt to −15 nt) known to contain pAs signal.

Figure EV6. Motifs impact pAs usage.
A The same as Fig 4, but after removing all the pAs containing the canonical pAs motif AAUAAA in the 100 nt upstream region.
B Scatterplot comparing the allelic difference in hexamer frequency in the downstream region (0–100 nt) between two groups of pAs, one with usage biased toward C57BL/6J (BL, x-axis) and one toward SPRET/EiJ (SP, y-axis). Each gray dot represents one specific hexamer, and the dot size represents the frequency of the hexamer.
C–E Boxplots showing that the repressive effect of a poly(U) tract on pAs usage depends on its length. Bg indicates that both alleles have the same length of poly(U) tract (6 Us) in the 100 nt upstream region. U5/U4/U3 indicates that only one allele has an intact 6 Us, whereas the other allele has only 5/4/3 Us (C). The y-axis represents the allelic difference in pAs usage. In panels (D and E), the length of the poly(U) tract increases to 7 (D) and 8 (E), respectively. The Mann–Whitney U-test was used to determine the statistical significance (P < 0.05). The solid horizontal bars represent the median, the box ranges the 75th and 25th percentiles, and the upper and lower bars the maximum and minimum values, respectively.
F Scatterplot comparing the frequency of all hexamers in the 100-nt region upstream of cleavage sites between trans-regulatory pAs (x-axis) and control pAs without parental divergence (y-axis).

[Figure EV6, panels A–F. Panels A, B: hexamer frequency difference (SP − BL) vs. hexamer frequency difference (BL − SP); labelled hexamers include UUUUUU. Panels C–E: pAs usage change for Bg(U6), U5, U4, U3; for Bg(U7), U6, U5, U4, U3; and for Bg(U8), U7, U6, U5, U4. Panel F: hexamer frequency (Control) vs. hexamer frequency (Trans).]
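The allelic difference in hexamer frequency plotted in Figure EV6 can be sketched as a sliding-window count per allele followed by a subtraction. This is an illustration under stated assumptions rather than the authors' procedure: it uses raw overlapping counts, a DNA alphabet (T in place of U), and hypothetical function names and sequences.

```python
from collections import Counter


def hexamer_counts(seq, k=6):
    """Count every overlapping k-mer (k = 6 by default) in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def allelic_difference(bl_seq, sp_seq):
    """Return {hexamer: BL count minus SP count} for hexamers that differ."""
    bl = hexamer_counts(bl_seq)
    sp = hexamer_counts(sp_seq)
    # Counter returns 0 for hexamers absent from one allele.
    return {h: bl[h] - sp[h] for h in set(bl) | set(sp) if bl[h] != sp[h]}


# Hypothetical allelic regions differing by one substitution in a T tract
bl_region = "TTTTTTGC"
sp_region = "TTTCTTGC"
diff = allelic_difference(bl_region, sp_region)
print(diff)
```

A single substitution changes up to six overlapping hexamers per allele, which is why one variant inside a U tract can shift several dots in such a scatterplot at once.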


More information

Z score indicates how far a raw score deviates from the sample mean in SD units. score Mean % Lower Bound

Z score indicates how far a raw score deviates from the sample mean in SD units. score Mean % Lower Bound 1 EDUR 8131 Chat 3 Notes 2 Normal Distribution and Standard Scores Questions Standard Scores: Z score Z = (X M) / SD Z = deviation score divided by standard deviation Z score indicates how far a raw score

More information

SURVEY AND SUMMARY Multiple roles of the coding sequence 5 end in gene expression regulation

SURVEY AND SUMMARY Multiple roles of the coding sequence 5 end in gene expression regulation Published online 12 December 2014 Nucleic Acids Research, 2015, Vol. 43, No. 1 13 28 doi: 10.1093/nar/gku1313 SURVEY AND SUMMARY Multiple roles of the coding sequence 5 end in gene expression regulation

More information

Tools and Algorithms in Bioinformatics

Tools and Algorithms in Bioinformatics Tools and Algorithms in Bioinformatics GCBA815, Fall 2015 Week-4 BLAST Algorithm Continued Multiple Sequence Alignment Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and

More information

Using Base Pairing Probabilities for MiRNA Recognition. Yet Another SVM for MiRNA Recognition: yasmir

Using Base Pairing Probabilities for MiRNA Recognition. Yet Another SVM for MiRNA Recognition: yasmir 0. Using Base Pairing Probabilities for MiRNA Recognition Yet Another SVM for MiRNA Recognition: yasmir Daniel Pasailă, Irina Mohorianu, Liviu Ciortuz Department of Computer Science Al. I. Cuza University,

More information

Solutions exercises of Chapter 7

Solutions exercises of Chapter 7 Solutions exercises of Chapter 7 Exercise 1 a. These are paired samples: each pair of half plates will have about the same level of corrosion, so the result of polishing by the two brands of polish are

More information

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: m Eukaryotic mrna processing Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: Cap structure a modified guanine base is added to the 5 end. Poly-A tail

More information

Transcription Factor Binding Site Positioning in Yeast: Proximal Promoter Motifs Characterize TATA-Less Promoters

Transcription Factor Binding Site Positioning in Yeast: Proximal Promoter Motifs Characterize TATA-Less Promoters Transcription Factor Binding Site Positioning in Yeast: Proximal Promoter Motifs Characterize TATA-Less Promoters Ionas Erb 1, Erik van Nimwegen 2 * 1 Bioinformatics and Genomics program, Center for Genomic

More information

Supporting Information

Supporting Information Supporting Information Das et al. 10.1073/pnas.1302500110 < SP >< LRRNT > < LRR1 > < LRRV1 > < LRRV2 Pm-VLRC M G F V V A L L V L G A W C G S C S A Q - R Q R A C V E A G K S D V C I C S S A T D S S P E

More information

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

More information

The Union and Intersection for Different Configurations of Two Events Mutually Exclusive vs Independency of Events

The Union and Intersection for Different Configurations of Two Events Mutually Exclusive vs Independency of Events Section 1: Introductory Probability Basic Probability Facts Probabilities of Simple Events Overview of Set Language Venn Diagrams Probabilities of Compound Events Choices of Events The Addition Rule Combinations

More information

6 Introduction to Population Genetics

6 Introduction to Population Genetics 70 Grundlagen der Bioinformatik, SoSe 11, D. Huson, May 19, 2011 6 Introduction to Population Genetics This chapter is based on: J. Hein, M.H. Schierup and C. Wuif, Gene genealogies, variation and evolution,

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

UE Praktikum Bioinformatik

UE Praktikum Bioinformatik UE Praktikum Bioinformatik WS 08/09 University of Vienna 7SK snrna 7SK was discovered as an abundant small nuclear RNA in the mid 70s but a possible function has only recently been suggested. Two independent

More information

Biology. Biology. Slide 1 of 26. End Show. Copyright Pearson Prentice Hall

Biology. Biology. Slide 1 of 26. End Show. Copyright Pearson Prentice Hall Biology Biology 1 of 26 Fruit fly chromosome 12-5 Gene Regulation Mouse chromosomes Fruit fly embryo Mouse embryo Adult fruit fly Adult mouse 2 of 26 Gene Regulation: An Example Gene Regulation: An Example

More information

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF EVOLUTION Evolution is a process through which variation in individuals makes it more likely for them to survive and reproduce There are principles to the theory

More information

Practical Bioinformatics

Practical Bioinformatics 5/2/2017 Dictionaries d i c t i o n a r y = { A : T, T : A, G : C, C : G } d i c t i o n a r y [ G ] d i c t i o n a r y [ N ] = N d i c t i o n a r y. h a s k e y ( C ) Dictionaries g e n e t i c C o

More information

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot

More information

How to detect paleoploidy?

How to detect paleoploidy? Genome duplications (polyploidy) / ancient genome duplications (paleopolyploidy) How to detect paleoploidy? e.g. a diploid cell undergoes failed meiosis, producing diploid gametes, which selffertilize

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Genetic Engineering and Creative Design

Genetic Engineering and Creative Design Genetic Engineering and Creative Design Background genes, genotype, phenotype, fitness Connecting genes to performance in fitness Emergent gene clusters evolved genes MIT Class 4.208 Spring 2002 Evolution

More information

RNA Abstract Shape Analysis

RNA Abstract Shape Analysis ourse: iegerich RN bstract nalysis omplete shape iegerich enter of Biotechnology Bielefeld niversity robert@techfak.ni-bielefeld.de ourse on omputational RN Biology, Tübingen, March 2006 iegerich ourse:

More information

Sara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject)

Sara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) Bioinformática Sequence Alignment Pairwise Sequence Alignment Universidade da Beira Interior (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) 1 16/3/29 & 23/3/29 27/4/29 Outline

More information

6 Introduction to Population Genetics

6 Introduction to Population Genetics Grundlagen der Bioinformatik, SoSe 14, D. Huson, May 18, 2014 67 6 Introduction to Population Genetics This chapter is based on: J. Hein, M.H. Schierup and C. Wuif, Gene genealogies, variation and evolution,

More information

Tools and Algorithms in Bioinformatics

Tools and Algorithms in Bioinformatics Tools and Algorithms in Bioinformatics GCBA815, Fall 2013 Week3: Blast Algorithm, theory and practice Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and Systems Biology

More information