Supplemental Information for Pramila et al.

Periodic Normal Mixture Model (PNM)
The data sets alpha30 and alpha38 were analyzed with PNM (Lu et al. 2004). The first two time points were deleted to alleviate block/release artifacts. Data points at 105 min were deleted from both data sets because of unsatisfactory hybridization. From each data set, the 1000 least variable genes and genes with more than 25% missing data were discarded. The remaining genes were centered and normalized to mean 0 and standard deviation 1. The gene expression profiles were fitted with a linear combination of six sinusoidal functions. The gene-specific Fourier decomposition coefficients and the cell cycle rate were estimated iteratively until the 100 genes with the smallest fitting residuals and the estimated cell cycle rate became stable.

Permutation Based Statistical Method (PBM)

Alpha30 and alpha38 data were combined for PBM analysis with three other data sets previously used for identifying periodic transcripts in the budding yeast genome (Cho et al. 1998; Spellman et al. 1998). These data sets involve three different induced synchrony protocols: two temperature-sensitive mutations that arrest the cells in G1 or late M phase at 37 °C (cdc28 and cdc15, respectively) and the mating pheromone alpha factor. This method ranks each transcript by combining two permutation-based statistical tests, one for periodicity and one for magnitude of oscillation (de Lichtenberg et al. 2005). The scoring penalizes genes that display only one property, i.e. high-amplitude fluctuations with no periodicity or very low-amplitude periodic oscillation. The method also computes a gene-specific number called the "peak time" for each transcript in each data set, describing when in the cell cycle the gene is maximally expressed. To compare the peak timing across experiments, the time scales are transformed to percent of the cell cycle by dividing by the cell cycle span calculated by Zhao et al. (2001) for each type of synchrony, and the different experiments are then aligned relative to each other (de Lichtenberg et al. 2005). Zero was set based on the peak time of 19 M/G1-specific genes (de Lichtenberg et al. 2005). In addition, a combined peak time is calculated as a weighted average from all five data sets, and the error associated with that calculation is also provided. Peak times are expressed as percent of the cell cycle, and zero is set at the M/G1 boundary. Ranks, peak times for individual data sets, and the weighted average peak time with its associated error are provided at our website.

Promoter region alignments

Upstream sequences from seven yeast species were collected from two previous studies (Cliften et al. 2003; Kellis et al. 2003). We wanted to generate high-quality alignments of these upstream regions, and were willing to discard unalignable sequences in order to achieve this goal. Accordingly, we designed an iterative procedure that produces upstream region alignments with pairwise percent sequence identity above 40%. The procedure first removes leading single-sequence columns from the alignment, which occur frequently because the upstream regions are often of widely varying length. Thereafter, if any sequence matches poorly to the rest of the alignment, that sequence is removed, and the alignment is re-computed. The resulting collection of alignments is available upon request.

Yeast phylogeny

We inferred a phylogenetic tree among the seven yeast species from alignments of the coding sequences for three proteins. We selected the Mcm proteins, and used only those that occur unambiguously in all seven species: MCM2, CDC47 and CDC54. The concatenated alignment, consisting of 3201 columns, was analyzed using fastDNAml (olsen:fastdnaml) with the default parameters.

PhyME

We searched for motifs using PhyME (Sinha et al. 2004), which takes into account aligned orthologous sequences as well as the phylogenetic species tree. For each gene, we extracted 800 bp upstream of the start codon from each sensu stricto species (cerevisiae, mikatae, kudriavzevii, bayanus, and paradoxus). These sequences were then aligned using LAGAN (Brudno et al. 2003), which is distributed with PhyME. For a given group of genes with similar peak times, we searched the corresponding alignments for the top three motifs, using the following options: revcompw ot 0.3 maxsites 6.

Motiph

For the purposes of this study, we implemented a program called Motiph that scans a multiple alignment for occurrences of a given motif, taking into account the phylogenetic tree relating the species in the alignment. Motiph was motivated by the notion of phylogenetic shadowing (Boffelli et al. 2003), and the Motiph program is thus similar to Monkey (Moses et al. 2004). Given an alignment, a motif matrix and a tree, Motiph calculates for each position in the alignment the probability of the given motif, taking into account the phylogenetic tree.
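The per-position calculation described here can be sketched with Felsenstein's pruning recursion. The following is a minimal illustration, not the Motiph implementation: the substitution model (an F81-style model whose equilibrium frequencies are the motif column or the background), the tree encoding, the treatment of gaps as missing data, and all function names are assumptions made for the example. The default rates of 1 and 1.2 are the functional and non-functional rates quoted in this section for the log-odds score.

```python
import math

NUC = "ACGT"

def transition(pi, t, rate):
    """Assumed F81-style matrix: P[i][j] = e^(-rate*t) * [i == j] + (1 - e^(-rate*t)) * pi[j]."""
    keep = math.exp(-rate * t)
    return [[keep * (i == j) + (1.0 - keep) * pi[j] for j in range(4)]
            for i in range(4)]

def partial(node, column, pi, rate):
    """Return L[x] = P(observed leaves below node | node carries nucleotide x)."""
    if isinstance(node, str):                  # leaf: a species name
        base = column[node]
        if base not in NUC:                    # gap or unknown: treated as missing data
            return [1.0] * 4
        return [1.0 if NUC[x] == base else 0.0 for x in range(4)]
    like = [1.0] * 4
    for child, t in node:                      # internal node: [(child, branch_length), ...]
        P = transition(pi, t, rate)
        below = partial(child, column, pi, rate)
        for x in range(4):
            like[x] *= sum(P[x][y] * below[y] for y in range(4))
    return like

def column_prob(tree, column, pi, rate):
    """Probability of one alignment column, summed over all internal-node assignments,
    with the distribution pi placed at the root of the tree."""
    root = partial(tree, column, pi, rate)
    return sum(pi[x] * root[x] for x in range(4))

def log_odds(tree, columns, motif, background, f_rate=1.0, n_rate=1.2):
    """Log-odds of motif versus background for one window of alignment columns.
    Assumes strictly positive motif and background probabilities."""
    score = 0.0
    for col, pi in zip(columns, motif):
        score += math.log(column_prob(tree, col, pi, f_rate))
        score -= math.log(column_prob(tree, col, background, n_rate))
    return score

if __name__ == "__main__":
    # Hypothetical two-species tree and one-column motif, for illustration only.
    tree = [("cer", 0.1), ("par", 0.1)]
    motif = [[0.85, 0.05, 0.05, 0.05]]
    background = [0.25, 0.25, 0.25, 0.25]
    print(log_odds(tree, [{"cer": "A", "par": "A"}], motif, background))
```

Summing over internal-node assignments by recursion keeps the cost linear in the number of species, instead of enumerating every evolutionary history explicitly.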
This probability is the sum over all possible evolutionary histories (i.e., all possible assignments of nucleotides to the internal nodes of the tree), with the given motif at the root of the tree. Motiph reports a log-odds score, in which the numerator is this probability (computed using a functional evolutionary rate of 1), and the denominator is a similar probability computed using a motif of background probabilities and a non-functional evolutionary rate of 1.2.

Supplemental Fig. 1

In Figure S1A, the percentage of identified known periodic genes (127 in total) is plotted against the percentage of total genes as we lower the posterior probability of periodicity calculated by PNM (Lu et al. 2004). Both the alpha30 and the alpha38 data sets identify known periodic transcripts slightly better than the first alpha factor data set (Spellman et al. 1998). Figure S1B demonstrates that adding microarray data sets improves the rate at which known periodic genes are identified. These results also show that it is beneficial to include data sets generated with different synchronization methods. The combination of all five available data sets (PNM5) performs best. PNM2 uses the alpha30 and alpha38 data. PNM3 adds the third alpha factor set (Spellman et al. 1998). PNM4 adds the cdc28 arrest synchrony (Cho et al. 1998), and PNM5 analyzes those four plus the cdc15 data set (Spellman et al. 1998). Diamond and cross symbols indicate the positions on each plot representing the probability thresholds of and 0.95, respectively.

Supplemental Fig. 2
Figure S2 plots the enrichment of known periodic genes within the ranked lists of genes identified by PBM5 and PNM5, as in Fig. S1. This result is consistent with the previous comparison made between PNM and PBM using three data sets (de Lichtenberg et al. 2005).

Table S1: Position-specific probability matrix used to search for the presence of Hcm1 binding sites. Rows are positions, and columns correspond to A, C, G and T.

Supplemental Fig. 3
Fig. S3. Cell cycle microarray data for 180 genes that have conserved Hcm1 binding sites in their promoters and are ranked as periodic by both PNM5 and PBM5 were extracted from the alpha30 data set and visualized in PRISM (Wu and Noble 2004). Each row represents one gene, named to the right. The transcript profiles are ordered by their average peak time, which is also indicated (peak(combined)). Peaks are magenta, troughs are cyan. Common names and short descriptions were downloaded from the Saccharomyces Genome Database.

References

Boffelli, D., McAuliffe, J., Ovcharenko, D., Lewis, K.D., Ovcharenko, I., Pachter, L., and Rubin, E.M. 2003. Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science 299:

Brudno, M., Do, C.B., Cooper, G.M., Kim, M.F., Davydov, E., Green, E.D., Sidow, A., and Batzoglou, S. 2003. LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res 13:

Cho, R.J., Campbell, M.J., Winzeler, E.A., Steinmetz, L., Conway, A., Wodicka, L., Wolfsberg, T.G., Gabrielian, A.E., Landsman, D., Lockhart, D.J., and Davis, R.W. 1998. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell 2:

Cliften, P., Sudarsanam, P., Desikan, A., Fulton, L., Fulton, B., Majors, J., Waterston, R., Cohen, B.A., and Johnston, M. 2003. Finding functional features in Saccharomyces genomes by phylogenetic footprinting. Science 301:

de Lichtenberg, U., Jensen, L.J., Fausboll, A., Jensen, T.S., Bork, P., and Brunak, S. 2005. Comparison of computational methods for the identification of cell cycle-regulated genes. Bioinformatics 21:

Kellis, M., Patterson, N., Endrizzi, M., Birren, B., and Lander, E.S. 2003. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423:

Lu, X., Zhang, W., Qin, Z.S., Kwast, K.E., and Liu, J.S. 2004. Statistical resynchronization and Bayesian detection of periodically expressed genes. Nucleic Acids Res 32:

Moses, A.M., Chiang, D.Y., Pollard, D.A., Iyer, V.N., and Eisen, M.B. 2004. MONKEY: identifying conserved transcription-factor binding sites in multiple alignments using a binding site-specific evolutionary model. Genome Biol 5: R98.

Sinha, S., Blanchette, M., and Tompa, M. 2004. PhyME: a probabilistic algorithm for finding motifs in sets of orthologous sequences. BMC Bioinformatics 5: 170.

Spellman, P.T., Sherlock, G., Zhang, M.Q., Iyer, V.R., Anders, K., Eisen, M.B., Brown, P.O., Botstein, D., and Futcher, B. 1998. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9:

Wu, W. and Noble, W.S. 2004. Genomic data visualization on the Web. Bioinformatics 20:

Zhao, L.P., Prentice, R., and Breeden, L. 2001. Statistical modeling of large microarray data sets to identify stimulus-response profiles. Proc. Natl. Acad. Sci. USA 98:
More informationMultiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:
Multiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:50 5001 5 Multiple Sequence Alignment The first part of this exposition is based on the following sources, which are recommended reading:
More informationHMMs and biological sequence analysis
HMMs and biological sequence analysis Hidden Markov Model A Markov chain is a sequence of random variables X 1, X 2, X 3,... That has the property that the value of the current state depends only on the
More informationComputational Genomics. Uses of evolutionary theory
Computational Genomics 10-810/02 810/02-710, Spring 2009 Model-based Comparative Genomics Eric Xing Lecture 14, March 2, 2009 Reading: class assignment Eric Xing @ CMU, 2005-2009 1 Uses of evolutionary
More informationSparse regularization for functional logistic regression models
Sparse regularization for functional logistic regression models Hidetoshi Matsui The Center for Data Science Education and Research, Shiga University 1-1-1 Banba, Hikone, Shiga, 5-85, Japan. hmatsui@biwako.shiga-u.ac.jp
More informationNucleosome Switching
Nucleosome Switching David J. Schwab, Robijn F. Bruinsma, Joseph Rudnick Department of Physics and Astronomy, University of California, Los Angeles, Los Angeles, CA, 90024 & Jonathan Widom Department of
More informationEvolution by duplication
6.095/6.895 - Computational Biology: Genomes, Networks, Evolution Lecture 18 Nov 10, 2005 Evolution by duplication Somewhere, something went wrong Challenges in Computational Biology 4 Genome Assembly
More informationSupplementary material to Whitney, K. D., B. Boussau, E. J. Baack, and T. Garland Jr. in press. Drift and genome complexity revisited. PLoS Genetics.
Supplementary material to Whitney, K. D., B. Boussau, E. J. Baack, and T. Garland Jr. in press. Drift and genome complexity revisited. PLoS Genetics. Tree topologies Two topologies were examined, one favoring
More informationOutline. Sequence-comparison methods. Buzzzzzzzz. Why compare sequences? Gerard Kleywegt Uppsala University
MB330 - January, 2006 Sequence-comparison methods erard Kleywegt Uppsala University Outline! Why compare sequences?! Dotplots! airwise sequence alignments &! Multiple sequence alignments! rofile methods!
More informationBIOINFORMATICS. Geometry of gene expression dynamics. S. A. Rifkin 1 and J. Kim INTRODUCTION SYSTEM
BIOINFORMATICS Vol. 18 no. 9 2002 Pages 1176 1183 Geometry of gene expression dynamics S. A. Rifkin 1 and J. Kim 1, 2, 3, 1 Department of Ecology and Evolutionary Biology, PO Box 208106, 2 Department of
More informationEffects of Gap Open and Gap Extension Penalties
Brigham Young University BYU ScholarsArchive All Faculty Publications 200-10-01 Effects of Gap Open and Gap Extension Penalties Hyrum Carroll hyrumcarroll@gmail.com Mark J. Clement clement@cs.byu.edu See
More informationIn the first paper, a new marginal approach is introduced for highdimensional cell-cycle microarray data with no replicates.
AN ABSTRACT OF THE DISSERTATION OF Guei-Feng Tsai for the degree of Doctor of Philosophy in Statistics presented on August 29, 2005. Title: Semiparametric Marginal and Mixed Models for Longitudinal Data
More informationMCMC: Markov Chain Monte Carlo
I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov
More informationHandling Rearrangements in DNA Sequence Alignment
Handling Rearrangements in DNA Sequence Alignment Maneesh Bhand 12/5/10 1 Introduction Sequence alignment is one of the core problems of bioinformatics, with a broad range of applications such as genome
More informationSUPPLEMENTARY INFORMATION
Supplementary information S1 (box). Supplementary Methods description. Prokaryotic Genome Database Archaeal and bacterial genome sequences were downloaded from the NCBI FTP site (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/)
More informationShrinkage-Based Similarity Metric for Cluster Analysis of Microarray Data
Shrinkage-Based Similarity Metric for Cluster Analysis of Microarray Data Vera Cherepinsky, Jiawu Feng, Marc Rejali, and Bud Mishra, Courant Institute, New York University, 5 Mercer Street, New York, NY
More informationSuperstability of the yeast cell-cycle dynamics: Ensuring causality in the presence of biochemical stochasticity
Journal of Theoretical Biology 245 (27) 638 643 www.elsevier.com/locate/yjtbi Superstability of the yeast cell-cycle dynamics: Ensuring causality in the presence of biochemical stochasticity Stefan Braunewell,
More informationBasic Local Alignment Search Tool
Basic Local Alignment Search Tool Alignments used to uncover homologies between sequences combined with phylogenetic studies o can determine orthologous and paralogous relationships Local Alignment uses
More information