Bioinformatics. Transcriptome

Bioinformatics: Transcriptome
Jacques van Helden (Jacques.van.Helden@ulb.ac.be)
Université Libre de Bruxelles, Belgium
Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe)
http://www.bigre.ulb.ac.be/


Measuring the expression of all the genes of a genome
In 1997, DeRisi and co-workers developed a method to measure the transcription level of all the genes of a genome. The method compares the mRNA concentration of each gene between two experimental conditions.
- Green channel: reference
- Red channel: test
The intensity of a spot indicates the average concentration of the corresponding mRNA in the two samples. The color of a spot indicates regulation:
- Red: up-regulated in the test relative to the reference condition
- Green: down-regulated
DeRisi et al. (1997). Science 278: 680-686

DNA chip technology
Workflow: cell culture, tissue, ...; RNA extraction; synthesis of fluorescent cDNA (sample 1 and sample 2, each RNA to cDNA); hybridization on the DNA chip.
Reading a spot:
- Brightness: quantity
- Color: specificity (yellowish: not specific; reddish: sample 1-specific; greenish: sample 2-specific)
Source: DeRisi et al., Science 1997

Scanning result (slide from Peter Sterk)

Complete microarray
Source: DeRisi et al. (1997). Science 278(5338): 680-686

DNA chips: raw measurements
- Red intensity
- Red background
- Green intensity
- Green background
Intensity - background = level of expression
Red: experimental conditions; green: control

DNA chips: useful metrics
The level of regulation is represented by the ratio
$$ r = \frac{\mathrm{red} - \mathrm{red.bg}}{\mathrm{green} - \mathrm{green.bg}} $$
- $r > 1$: up-regulated
- $r < 1$: down-regulated
The log-ratio provides a more convenient statistic (we will see why during the course). $\log_2$ is even more convenient because the scale is intuitive:
$$ R = \log_2 \left( \frac{\mathrm{red} - \mathrm{red.bg}}{\mathrm{green} - \mathrm{green.bg}} \right) $$
- $R < 0$: down-regulated
- $R > 0$: up-regulated
- $R > 1$: up-regulated by more than a factor of 2
- $R > 2$: up-regulated by more than a factor of 4
- $R > w$: up-regulated by more than a factor of $2^w$
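
The two steps described on these slides (background subtraction, then the log2 ratio) can be sketched in a few lines of Python. This is only an illustration: the function name `log2_ratio`, the choice to return `None` for spots at or below background, and the intensity values are assumptions made for the example, not part of the original course material.

```python
import math

def log2_ratio(red, red_bg, green, green_bg):
    """Return R = log2((red - red_bg) / (green - green_bg)).

    Returns None when either background-corrected signal is not
    positive, because the ratio is then undefined (an assumption of
    this sketch; real pipelines handle such spots in various ways).
    """
    test_signal = red - red_bg        # red channel: test condition
    ref_signal = green - green_bg     # green channel: reference
    if test_signal <= 0 or ref_signal <= 0:
        return None
    return math.log2(test_signal / ref_signal)

# Hypothetical spot: corrected signals 800 (test) and 200 (reference),
# i.e. a ratio r = 4, so R = 2 (up-regulated by a factor of 4).
print(log2_ratio(red=900, red_bg=100, green=300, green_bg=100))
```

Swapping the two channels gives R = -2, which shows why the log scale is convenient: up- and down-regulation by the same factor are symmetric around 0.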

Time series
At each time point, the expression level is compared to the control (log-ratio).
Example: nitrogen depletion

ORF      Gene      30 min     1 h     2 h     4 h     8 h    12 h     1 d     2 d     3 d     5 d
YAL001C  TFC3        0.33    0.37    0.18    0.06    0.32    0.09   -0.09    0.02   -0.01   -0.06
YAL002W  VPS8        0.22    0.29   -0.04    -0.3    0.18    0.04     0.3   -0.06   -0.07   -0.21
YAL003W  EFB1        -1.6   -1.43   -0.84   -0.25   -1.22   -0.92   -1.22   -0.97   -0.84   -0.89
YAL004W  YAL004W    -0.81   -0.63   -1.43   -0.85   -0.88   -0.68   -0.59   -1.01   -0.68   -0.71
YAL005C  SSA1       -0.81   -0.74   -1.36   -0.97   -0.94   -0.86   -1.56   -1.89   -1.89   -1.06
YAL007C  ERP2       -0.42    -0.3   -0.38   -0.22   -0.81   -1.15    -1.6   -1.89   -2.06    -2.4
YAL008W  FUN14       0.23    0.25   -0.01   -0.42    0.58    0.52    0.34    0.23     0.7    0.24
YAL009W  SPO7        -0.1   -0.01   -0.13    0.02    0.15    0.26    0.59    0.26    0.38     1.1
YAL010C  MDM10       0.39    0.26    0.18   -0.14    0.31    0.28    0.11   -0.04    0.11   -0.14
YAL011W  YAL011W    -0.23   -0.06     0.2   -0.04    0.19   -0.17    0.31   -0.23    0.25   -0.23
YAL012W  CYS3        1.94    1.62    0.32    0.04   -2.06   -2.64   -2.84   -2.47   -2.47   -2.12
YAL013W  DEP1       -0.94   -0.22   -0.64   -0.74   -0.32    -0.3    0.24   -0.62   -0.45   -0.69
YAL014C  YAL014C    -0.01    0.28   -0.45   -0.58   -0.06    0.16    0.73    0.54    0.61    0.11
YAL015C  NTG1        0.37    0.56    0.01   -0.14   -0.15   -0.14   -0.25   -0.07    0.16    0.03
...

Source: Gasch et al. (2000). Molecular Biology of the Cell 11: 4241-4257

Examples of experimental conditions
- Presence/absence of a metabolite: galactose vs glucose
- Transcription factor mutants: Yap1p over-expression, TUP1 deletion
- Massive environmental changes: rich versus minimal medium, diauxic shift (7 time points during the shift)
- Cell differentiation: sporulation, mating type
- Cell cycle

Temporal profiles of expression
DeRisi et al. measured the expression level of all the genes at 7 time points during the diauxic shift. The figure shows that groups of genes have similar expression profiles.
- Some of these groups contain genes with similar function (e.g. coding for ribosomal proteins).
- Some of these groups have a common regulatory element in their promoter (e.g. stress response element).
DeRisi et al. (1997). Science 278: 680-686
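
Saying that two genes have "similar expression profiles" requires a similarity measure. One common choice is the Pearson correlation coefficient between the two log-ratio profiles. The slides do not state which measure was used for this figure, so the sketch below is only an illustration. As stand-in data it uses three rows of the nitrogen depletion time series (Gasch et al., 2000) rather than the diauxic shift data, and the function name `pearson` is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of time points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / (spread_x * spread_y)

# Log-ratio profiles copied from the nitrogen depletion time series
# (10 time points, from 30 min to 5 days).
ssa1 = [-0.81, -0.74, -1.36, -0.97, -0.94, -0.86, -1.56, -1.89, -1.89, -1.06]
erp2 = [-0.42, -0.3, -0.38, -0.22, -0.81, -1.15, -1.6, -1.89, -2.06, -2.4]
spo7 = [-0.1, -0.01, -0.13, 0.02, 0.15, 0.26, 0.59, 0.26, 0.38, 1.1]

print("SSA1 vs ERP2:", round(pearson(ssa1, erp2), 2))
print("ERP2 vs SPO7:", round(pearson(erp2, spo7), 2))
```

A value close to 1 means the two profiles rise and fall together, and a value close to -1 means they move in opposite directions. Note that the correlation ignores the amplitude of the response.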

Cell cycle
In 1998, Spellman and colleagues measured the expression of all yeast genes during the cell cycle. They detected 800 genes showing periodic fluctuations of expression. These genes can be sorted according to their peak of expression, in order to group the genes induced during the different phases of the cell cycle (G1, S, G2, M).
Spellman et al. (1998). Molecular Biology of the Cell 9: 3273-3297
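
The sorting step can be sketched as follows. This covers only the ordering of genes by the time point of their maximal expression. The slide does not describe how periodic genes were detected, and that step is not attempted here. The function name `sort_by_peak`, the gene names and the profile values are all hypothetical.

```python
def sort_by_peak(profiles):
    """Order gene names by the index of their maximal expression value.

    profiles: dict mapping a gene name to a list of log-ratios,
    one value per time point (same time points for every gene).
    """
    def peak_index(gene):
        values = profiles[gene]
        return values.index(max(values))   # first time point of the maximum
    return sorted(profiles, key=peak_index)

# Hypothetical profiles over four consecutive time points.
toy_profiles = {
    "gene_late":  [0.1, 0.2, 0.9, 1.5],
    "gene_early": [1.2, 0.3, -0.4, -0.8],
    "gene_mid":   [-0.5, 1.1, 0.4, -0.2],
}

print(sort_by_peak(toy_profiles))
# ['gene_early', 'gene_mid', 'gene_late']
```

With real cell-cycle data the peak time would then be mapped to a phase (G1, S, G2, M), which requires knowing when each phase occurs in the synchronized culture.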

Gene expression data: hierarchical clustering
[Figure: clustered expression profiles. Figure labels: Alpha, cdc15, cdc28, Elu; MCM, CLB2, SIC1, MAT, CLN2, Y', MET]
In the image, genes are clustered according to their expression profiles, using Michael Eisen's software cluster (Eisen et al., PNAS 1998, 95: 14863-8).
Strengths
- The profiles and the clusters are visible together
- Familiar to biologists (frequently used for phylogeny)
Weaknesses
- Isomorphism: the branches at each node of the tree can be permuted, so the vertical distance between genes does not reflect the real distance
- Where to set the cluster boundaries?
- The tree does not reflect the combinatorial aspect of regulation
Spellman et al. (1998). Mol Biol Cell 9(12), 3273-97.
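
To make the idea of a cluster tree concrete, here is a minimal sketch of agglomerative (bottom-up) hierarchical clustering with average linkage. It is not the implementation used by the cluster software. For brevity it uses the Euclidean distance between profiles, which is a simplifying assumption, and a correlation-based distance could be substituted. The function name `average_linkage_tree`, the gene names and the values are hypothetical.

```python
import math
from itertools import combinations

def average_linkage_tree(profiles):
    """Build a binary tree (nested tuples of gene names) by repeatedly
    merging the two clusters with the smallest average pairwise distance."""
    # Each cluster is a pair: (tree so far, list of member gene names).
    clusters = [(name, [name]) for name in profiles]

    def cluster_distance(members_a, members_b):
        # Average linkage: mean Euclidean distance over all cross pairs.
        total = sum(math.dist(profiles[a], profiles[b])
                    for a in members_a for b in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: cluster_distance(clusters[ij[0]][1],
                                                   clusters[ij[1]][1]))
        (tree_i, members_i), (tree_j, members_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), members_i + members_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical profiles: two induced genes and two repressed genes.
toy_profiles = {
    "up1":   [0.1, 0.5, 1.0, 1.5],
    "up2":   [0.0, 0.4, 1.1, 1.4],
    "down1": [1.2, 0.6, 0.1, -0.5],
    "down2": [1.0, 0.7, 0.0, -0.6],
}

tree = average_linkage_tree(toy_profiles)
print(tree)
```

The two induced genes end up under one node and the two repressed genes under another. Swapping the two children of any node gives an equally valid drawing of the same tree, which is the isomorphism weakness listed on the slide.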

Gasch (2000): gene response to environmental changes
Gasch et al. (2000) measured the transcriptional response of yeast genes to various environmental changes.
- 173 microarrays
- ~6000 genes per microarray

Classification of cancer types
Microarrays are also used to select genes that serve as molecular signatures to classify cancer types. These genes can then be used to establish a diagnosis for new patients.
Golub et al. (1999). Science 286: 531-537