Sparse canonical correlation analysis, with applications to genomic data
|
|
- Stanley Bailey
- 6 years ago
- Views:
Transcription
1 Sparse canonical correlation analysis, with applications to genomic data Daniela M. Witten and Robert Tibshirani June 15, 2009
2 The framework Suppose that we have n observations on p 1 + p 2 variables, and the variables are naturally partitioned into two groups of p 1 and p 2 variables, respectively.
3 The framework Suppose that we have n observations on p 1 + p 2 variables, and the variables are naturally partitioned into two groups of p 1 and p 2 variables, respectively. Let X 1 R n p 1 correspond to the first set of variables, and let X 2 R n p 2 correspond to the second set of variables.
4 The framework Suppose that we have n observations on p 1 + p 2 variables, and the variables are naturally partitioned into two groups of p 1 and p 2 variables, respectively. Let X 1 R n p 1 correspond to the first set of variables, and let X 2 R n p 2 correspond to the second set of variables. Assume that the columns of X 1 and X 2 have been standardized to have mean zero and standard deviation one.
5 The data
6 Canonical correlation analysis Canonical correlation analysis (CCA) is a classical method in statistics.
7 Canonical correlation analysis Canonical correlation analysis (CCA) is a classical method in statistics. We can seek w 1 R p 1 and w 2 R p 2 that maximize correlation between X 1 w 1 and X 2 w 2 ; that is, maximize w1,w 2 w T 1 X T 1 X 2 w 2 subject to w T 1 X T 1 X 1 w 1 = w T 2 X T 2 X 2 w 2 = 1.
8 Canonical correlation analysis Canonical correlation analysis (CCA) is a classical method in statistics. We can seek w 1 R p 1 and w 2 R p 2 that maximize correlation between X 1 w 1 and X 2 w 2 ; that is, maximize w1,w 2 w T 1 X T 1 X 2 w 2 subject to w T 1 X T 1 X 1 w 1 = w T 2 X T 2 X 2 w 2 = 1. CCA is not appropriate when p 1, p 2 n or p 1, p 2 n.
9 The data when p 1, p 2 n
10 Sparse Canonical Correlation Analysis maximize w1,w 2 w T 1 X T 1 X 2 w 2 subject to w T 1 X T 1 X 1 w 1 = w T 2 X T 2 X 2 w 2 = 1. We impose L 1 constraints onto w 1 and w 2 : that is, w 1 1 c 1 and w 2 1 c 2. We assume that X T 1 X 1 = I and X T 2 X 2 = I.
11 Sparse Canonical Correlation Analysis The sparse CCA criterion is maximize w1,w 2 w T 1 X T 1 X 2 w 2 subject to w 1 2 1, w 2 2 1, w 1 1 c 1, w 2 1 c 2.
12 Sparse Canonical Correlation Analysis The sparse CCA criterion is maximize w1,w 2 w T 1 X T 1 X 2 w 2 subject to w 1 2 1, w 2 2 1, w 1 1 c 1, w 2 1 c 2. For c 1 and c 2 small, this results in w 1 and w 2 sparse: many of the elements of w 1 and w 2 will exactly equal zero.
13 Sparse CCA Algorithm 1. Begin with an intial value for w 2 R p Iterate until convergence: w 1 argmax w1 w1 T X T 1 X 2 w 2 subject to w 1 2 1, w 1 1 c 1. w 2 argmax w2 w1 T X T 1 X 2 w 2 subject to w 2 2 1, w 2 1 c 2.
14 Sparse CCA Algorithm 1. Begin with an intial value for w 2 R p Iterate until convergence: w 1 argmax w1 w1 T X T 1 X 2 w 2 subject to w 1 2 1, w 1 1 c 1. w 2 argmax w2 w1 T X T 1 X 2 w 2 subject to w 2 2 1, w 2 1 c 2. Each update takes the form w 1 S(XT 1 X 2w 2, 1 ) S(X T 1 X 2w 2, 1 ) 2 where 1 0 is chosen so that w 1 1 = c 1. Here, S is the soft-thresholding operator: S(x, a) = sgn(x)( x a) +.
15 Genomic Data Recently, it has become increasingly common for researchers to use multiple assays to obtain measurements on a single set of patient samples. For instance, a data set might consist of gene expression and DNA copy number measurements on the same set of samples.
16 Genomic Data Recently, it has become increasingly common for researchers to use multiple assays to obtain measurements on a single set of patient samples. For instance, a data set might consist of gene expression and DNA copy number measurements on the same set of samples. The Question: Can we identify sets of genes whose expression is correlated with regions of copy number change?
17 DNA copy number change
18 DNA copy number change
19 Gene expression
20 Gene expression + copy number data
21 Gene expression + copy number data We want to find regions of DNA copy number change that are highly correlated with the expression of a set of genes.
22 Gene expression + copy number data We want to find regions of DNA copy number change that are highly correlated with the expression of a set of genes. That is, we want w 1 R p 1, w 2 R p 2 Cor(X 1 w 1, X 2 w 2 ) is high. sparse such that
23 Gene expression + copy number data We want to find regions of DNA copy number change that are highly correlated with the expression of a set of genes. That is, we want w 1 R p 1, w 2 R p 2 Cor(X 1 w 1, X 2 w 2 ) is high. sparse such that We ll make a statement like 0.3*(gene 1 expression) + 0.2*(gene 2 expression) - 4*(gene 3 expression) is highly correlated with genomic loss on part of chromosome 3.
24 Idea To find regions of copy number change on chromosome 1 that are correlated with gene expression sets anywhere on the genome, run sparse CCA using copy number data on chromosome 1 and expression measurements for all genes.
25 Data set We used a publicly-available breast cancer data set of Chin et al. (2006). 89 samples with breast cancer gene expression measurements DNA copy number measurements.
26 Copy number change on chromosome 1 1
27 Genes correlated w/copy # change on chrom. 1 i Gene Chromosome Weight 1 jumping translocation breakpoint translocated promoter region (to activated MET oncogene) glyceronephosphate O-acyltransferase NADH dehydrogenase (ubiquinone) Fe-S protein nucleoporin 133kD geranylgeranyl diphosphate synthase rab3 GTPase-activating protein, non-catalytic subunit (150kD) peroxisomal biogenesis factor 11B phosphatidylinositol glycan, class C tubulin-specific chaperone e protoporphyrinogen oxidase tuftelin papillary renal cell carcinoma (translocation-associated) splicing factor 3b, subunit 4, 49kD UDP-Gal:betaGlcNAc beta 1,4- galactosyltransferase hypothetical protein FLJ hypothetical protein HSPC mitochondrial ribosomal protein L HSPC003 protein hypothetical protein FLJ CGI-78 protein chromosome 1 open reading frame hypothetical protein My
28 Results, continued 1. Similar results for other chromosomes.
29 Results, continued 1. Similar results for other chromosomes. 2. We can use a permutation approach to estimate a p-value for the w 1 and w 2 obtained.
30 Results, continued 1. Similar results for other chromosomes. 2. We can use a permutation approach to estimate a p-value for the w 1 and w 2 obtained. 3. We described the identification of cis effects. To identify trans effects, run sparse CCA using copy number data on chromosome 1 and expression measurements for all genes NOT on chromosome 1.
31 Conclusions A method for the integrative analysis of genomic data sets. An extension of sparse CCA leads to sparse multiple CCA, for the case of K > 2 data sets on a single set of observations. If an outcome measurement (e.g. survival time, cancer subtype) is available for each observation, than sparse supervised CCA can be used. Software available in R PMA package and Excel AddIn Correlate (implemented by Sam Gross)
32 References 1. Chin et al. (2006) Genomic and transcriptional aberrations linked to breast cancer pathophysiologies, Cancer Cell Parkhomenko, Trichtler, and Beyene (2009) Sparse canonical correlation analysis with application to genomic data integration, Statistical Applications in Genetics and Molecular Biology 8 Article Waaijenborg, Versewel de Witt Hamer, and Zwinderman (2008) Quantifying the association between gene expression and DNA markers by penalized canonical correlation analysis, Statistical Applications in Genetics and Molecular Biology 7 Article Witten, Tibshirani, and Hastie (2009) A penalized matrix decomposition, with applications to sparse canonical correlation analysis and principal components, Biostatistics. 5. Witten and Tibshirani (2009) Extensions of sparse canonical correlation analysis, with applications to genomic data, Statistical Applications in Genetics and Molecular Biology 8 Article 28.
Correlate. A method for the integrative analysis of two genomic data sets
Correlate A method for the integrative analysis of two genomic data sets Sam Gross, Balasubramanian Narasimhan, Robert Tibshirani, and Daniela Witten February 19, 2010 Introduction Sparse Canonical Correlation
More informationA penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis
Biostatistics (2009), 10, 3, pp. 515 534 doi:10.1093/biostatistics/kxp008 Advance Access publication on April 17, 2009 A penalized matrix decomposition, with applications to sparse principal components
More informationA penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis
A penalized matrix decomposition 1 A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis DANIELA M. WITTEN Department of Statistics, Stanford
More informationSparse statistical modelling
Sparse statistical modelling Tom Bartlett Sparse statistical modelling Tom Bartlett 1 / 28 Introduction A sparse statistical model is one having only a small number of nonzero parameters or weights. [1]
More informationGroup lasso for genomic data
Group lasso for genomic data Jean-Philippe Vert Mines ParisTech / Curie Institute / Inserm Machine learning: Theory and Computation workshop, IMA, Minneapolis, March 26-3, 22 J.P Vert (ParisTech) Group
More informationA Framework for Feature Selection in Clustering
A Framework for Feature Selection in Clustering Daniela M. Witten and Robert Tibshirani October 10, 2011 Outline Problem Past work Proposed (sparse) clustering framework Sparse K-means clustering Sparse
More informationSparse Additive Functional and kernel CCA
Sparse Additive Functional and kernel CCA Sivaraman Balakrishnan* Kriti Puniyani* John Lafferty *Carnegie Mellon University University of Chicago Presented by Miao Liu 5/3/2013 Canonical correlation analysis
More informationComputational Genetics Winter 2013 Lecture 10. Eleazar Eskin University of California, Los Angeles
Computational Genetics Winter 2013 Lecture 10 Eleazar Eskin University of California, Los ngeles Pair End Sequencing Lecture 10. February 20th, 2013 (Slides from Ben Raphael) Chromosome Painting: Normal
More informationUnivariate shrinkage in the Cox model for high dimensional data
Univariate shrinkage in the Cox model for high dimensional data Robert Tibshirani January 6, 2009 Abstract We propose a method for prediction in Cox s proportional model, when the number of features (regressors)
More informationThe lasso: some novel algorithms and applications
1 The lasso: some novel algorithms and applications Newton Institute, June 25, 2008 Robert Tibshirani Stanford University Collaborations with Trevor Hastie, Jerome Friedman, Holger Hoefling, Gen Nowak,
More informationGO ID GO term Number of members GO: translation 225 GO: nucleosome 50 GO: calcium ion binding 76 GO: structural
GO ID GO term Number of members GO:0006412 translation 225 GO:0000786 nucleosome 50 GO:0005509 calcium ion binding 76 GO:0003735 structural constituent of ribosome 170 GO:0019861 flagellum 23 GO:0005840
More informationNewly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:
m Eukaryotic mrna processing Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: Cap structure a modified guanine base is added to the 5 end. Poly-A tail
More informationO 3 O 4 O 5. q 3. q 4. Transition
Hidden Markov Models Hidden Markov models (HMM) were developed in the early part of the 1970 s and at that time mostly applied in the area of computerized speech recognition. They are first described in
More informationClassification Ensemble That Maximizes the Area Under Receiver Operating Characteristic Curve (AUC)
Classification Ensemble That Maximizes the Area Under Receiver Operating Characteristic Curve (AUC) Eunsik Park 1 and Y-c Ivan Chang 2 1 Chonnam National University, Gwangju, Korea 2 Academia Sinica, Taipei,
More informationChapter 15 Active Reading Guide Regulation of Gene Expression
Name: AP Biology Mr. Croft Chapter 15 Active Reading Guide Regulation of Gene Expression The overview for Chapter 15 introduces the idea that while all cells of an organism have all genes in the genome,
More informationMultiple Choice Review- Eukaryotic Gene Expression
Multiple Choice Review- Eukaryotic Gene Expression 1. Which of the following is the Central Dogma of cell biology? a. DNA Nucleic Acid Protein Amino Acid b. Prokaryote Bacteria - Eukaryote c. Atom Molecule
More informationStandardization and Singular Value Decomposition in Canonical Correlation Analysis
Standardization and Singular Value Decomposition in Canonical Correlation Analysis Melinda Borello Johanna Hardin, Advisor David Bachman, Reader Submitted to Pitzer College in Partial Fulfillment of the
More informationStructural Learning and Integrative Decomposition of Multi-View Data
Structural Learning and Integrative Decomposition of Multi-View Data, Department of Statistics, Texas A&M University JSM 2018, Vancouver, Canada July 31st, 2018 Dr. Gen Li, Columbia University, Mailman
More informationTuesday 9/6/2018 Mike Mueckler
Tuesday 9/6/2018 Mike Mueckler mmueckler@wustl.edu Intracellular Targeting of Nascent Polypeptides Mitochondria are the Sites of Oxidative ATP Production Sugars Triglycerides Figure 14-10 Molecular Biology
More informationSome Statistical Models and Algorithms for Change-Point Problems in Genomics
Some Statistical Models and Algorithms for Change-Point Problems in Genomics S. Robin UMR 518 AgroParisTech / INRA Applied MAth & Comput. Sc. Journées SMAI-MAIRCI Grenoble, September 2012 S. Robin (AgroParisTech
More informationFrom Gene to Protein
From Gene to Protein Gene Expression Process by which DNA directs the synthesis of a protein 2 stages transcription translation All organisms One gene one protein 1. Transcription of DNA Gene Composed
More informationGenomes and Their Evolution
Chapter 21 Genomes and Their Evolution PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from
More informationClustering and classification with applications to microarrays and cellular phenotypes
Clustering and classification with applications to microarrays and cellular phenotypes Gregoire Pau, EMBL Heidelberg gregoire.pau@embl.de European Molecular Biology Laboratory 1 Microarray data patients
More informationBME 5742 Biosystems Modeling and Control
BME 5742 Biosystems Modeling and Control Lecture 24 Unregulated Gene Expression Model Dr. Zvi Roth (FAU) 1 The genetic material inside a cell, encoded in its DNA, governs the response of a cell to various
More informationVariable Selection in Structured High-dimensional Covariate Spaces
Variable Selection in Structured High-dimensional Covariate Spaces Fan Li 1 Nancy Zhang 2 1 Department of Health Care Policy Harvard University 2 Department of Statistics Stanford University May 14 2007
More informationFlow of Genetic Information
presents Flow of Genetic Information A Montagud E Navarro P Fernández de Córdoba JF Urchueguía Elements Nucleic acid DNA RNA building block structure & organization genome building block types Amino acid
More informationSparse inverse covariance estimation with the lasso
Sparse inverse covariance estimation with the lasso Jerome Friedman Trevor Hastie and Robert Tibshirani November 8, 2007 Abstract We consider the problem of estimating sparse graphs by a lasso penalty
More informationThe University of Jordan. Accreditation & Quality Assurance Center. COURSE Syllabus
The University of Jordan Accreditation & Quality Assurance Center COURSE Syllabus 1 Course title Principles of Genetics and molecular biology 2 Course number 0501217 3 Credit hours (theory, practical)
More informationI519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB
I519 Introduction to Bioinformatics, 2015 Genome Comparison Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Whole genome comparison/alignment Build better phylogenies Identify polymorphism
More information1. In most cases, genes code for and it is that
Name Chapter 10 Reading Guide From DNA to Protein: Gene Expression Concept 10.1 Genetics Shows That Genes Code for Proteins 1. In most cases, genes code for and it is that determine. 2. Describe what Garrod
More informationLearning Network from High-dimensional Array Data
1 Learning Network from High-dimensional Array Data Li Hsu, Jie Peng and Pei Wang Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA Department of Statistics, University
More informationFused elastic net EPoC
Fused elastic net EPoC Tobias Abenius November 28, 2013 Data Set and Method Use mrna and copy number aberration (CNA) to build a network using an extension of our EPoC method (Jörnsten et al., 2011; Abenius
More informationThe lasso: some novel algorithms and applications
1 The lasso: some novel algorithms and applications Robert Tibshirani Stanford University ASA Bay Area chapter meeting Collaborations with Trevor Hastie, Jerome Friedman, Ryan Tibshirani, Daniela Witten,
More informationWhat is the central dogma of biology?
Bellringer What is the central dogma of biology? A. RNA DNA Protein B. DNA Protein Gene C. DNA Gene RNA D. DNA RNA Protein Review of DNA processes Replication (7.1) Transcription(7.2) Translation(7.3)
More informationMeasuring TF-DNA interactions
Measuring TF-DNA interactions How is Biological Complexity Achieved? Mediated by Transcription Factors (TFs) 2 Regulation of Gene Expression by Transcription Factors TF trans-acting factors TF TF TF TF
More informationThe Mitochondrion. Definition Structure, ultrastructure Functions
The Mitochondrion Definition Structure, ultrastructure Functions Organelle definition Etymology of the name Carl Benda (1903): (mitos) thread; (khondrion) granule. Light microscopy identification First
More informationChapters 12&13 Notes: DNA, RNA & Protein Synthesis
Chapters 12&13 Notes: DNA, RNA & Protein Synthesis Name Period Words to Know: nucleotides, DNA, complementary base pairing, replication, genes, proteins, mrna, rrna, trna, transcription, translation, codon,
More informationSemi-Penalized Inference with Direct FDR Control
Jian Huang University of Iowa April 4, 2016 The problem Consider the linear regression model y = p x jβ j + ε, (1) j=1 where y IR n, x j IR n, ε IR n, and β j is the jth regression coefficient, Here p
More informationResearch Statement on Statistics Jun Zhang
Research Statement on Statistics Jun Zhang (junzhang@galton.uchicago.edu) My interest on statistics generally includes machine learning and statistical genetics. My recent work focus on detection and interpretation
More informationMultimodal Deep Learning for Predicting Survival from Breast Cancer
Multimodal Deep Learning for Predicting Survival from Breast Cancer Heather Couture Deep Learning Journal Club Nov. 16, 2016 Outline Background on tumor histology & genetic data Background on survival
More informationCanonical Correlation Analysis with Kernels
Canonical Correlation Analysis with Kernels Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Computational Diagnostics Group Seminar 2003 Mar 10 1 Overview
More informationComputational Biology: Basics & Interesting Problems
Computational Biology: Basics & Interesting Problems Summary Sources of information Biological concepts: structure & terminology Sequencing Gene finding Protein structure prediction Sources of information
More informationREVIEW SESSION. Wednesday, September 15 5:30 PM SHANTZ 242 E
REVIEW SESSION Wednesday, September 15 5:30 PM SHANTZ 242 E Gene Regulation Gene Regulation Gene expression can be turned on, turned off, turned up or turned down! For example, as test time approaches,
More informationLinear Regression (1/1/17)
STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression
More informationFrom gene to protein. Premedical biology
From gene to protein Premedical biology Central dogma of Biology, Molecular Biology, Genetics transcription replication reverse transcription translation DNA RNA Protein RNA chemically similar to DNA,
More informationLecture 10: Logistic Regression
BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics Lecture 10: Logistic Regression Jie Wang Department of Computational Medicine & Bioinformatics University of Michigan 1 Outline An
More informationS1 Gene ontology (GO) analysis of the network alignment results
1 Supplementary Material for Effective comparative analysis of protein-protein interaction networks by measuring the steady-state network flow using a Markov model Hyundoo Jeong 1, Xiaoning Qian 1 and
More informationAdvanced Topics in RNA and DNA. DNA Microarrays Aptamers
Quiz 1 Advanced Topics in RNA and DNA DNA Microarrays Aptamers 2 Quantifying mrna levels to asses protein expression 3 The DNA Microarray Experiment 4 Application of DNA Microarrays 5 Some applications
More informationLeast Angle Regression, Forward Stagewise and the Lasso
January 2005 Rob Tibshirani, Stanford 1 Least Angle Regression, Forward Stagewise and the Lasso Brad Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani Stanford University Annals of Statistics,
More informationLearning with Singular Vectors
Learning with Singular Vectors CIS 520 Lecture 30 October 2015 Barry Slaff Based on: CIS 520 Wiki Materials Slides by Jia Li (PSU) Works cited throughout Overview Linear regression: Given X, Y find w:
More informationBiology 2018 Final Review. Miller and Levine
Biology 2018 Final Review Miller and Levine bones blood cells elements All living things are made up of. cells If a cell of an organism contains a nucleus, the organism is a(n). eukaryote prokaryote plant
More informationInferring Transcriptional Regulatory Networks from Gene Expression Data II
Inferring Transcriptional Regulatory Networks from Gene Expression Data II Lectures 9 Oct 26, 2011 CSE 527 Computational Biology, Fall 2011 Instructor: Su-In Lee TA: Christopher Miles Monday & Wednesday
More informationReconstructing Mitochondrial Evolution?? Morphological Diversity. Mitochondrial Diversity??? What is your definition of a mitochondrion??
Reconstructing Mitochondrial Evolution?? What is your definition of a mitochondrion?? Morphological Diversity Mitochondria as we all know them: Suprarenal gland Liver cell Plasma cell Adrenal cortex Mitochondrial
More informationMultiple Change-Point Detection and Analysis of Chromosome Copy Number Variations
Multiple Change-Point Detection and Analysis of Chromosome Copy Number Variations Yale School of Public Health Joint work with Ning Hao, Yue S. Niu presented @Tsinghua University Outline 1 The Problem
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2013 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Meghana Kshirsagar (mkshirsa), Yiwen Chen (yiwenche) 1 Graph
More informationRegression Model In The Analysis Of Micro Array Data-Gene Expression Detection
Jamal Fathima.J.I 1 and P.Venkatesan 1. Research Scholar -Department of statistics National Institute For Research In Tuberculosis, Indian Council For Medical Research,Chennai,India,.Department of statistics
More informationومن أحياها Translation 1. Translation 1. DONE BY :Maen Faoury
Translation 1 DONE BY :Maen Faoury 0 1 ومن أحياها Translation 1 2 ومن أحياها Translation 1 In this lecture and the coming lectures you are going to see how the genetic information is transferred into proteins
More informationSTATE UNIVERSITY OF NEW YORK COLLEGE OF TECHNOLOGY CANTON, NEW YORK. COURSE OUTLINE BIOL 310 The Human Genome. Prepared By: Ron Tavernier
STATE UNIVERSITY OF NEW YORK COLLEGE OF TECHNOLOGY CANTON, NEW YORK COURSE OUTLINE BIOL 310 The Human Genome Prepared By: Ron Tavernier SCHOOL OF SCIENCE, HEALTH & CRIMINAL JUSTICE SCIENCE DEPARTMENT May
More informationGeneralized Power Method for Sparse Principal Component Analysis
Generalized Power Method for Sparse Principal Component Analysis Peter Richtárik CORE/INMA Catholic University of Louvain Belgium VOCAL 2008, Veszprém, Hungary CORE Discussion Paper #2008/70 joint work
More informationGeneralized Linear Models (1/29/13)
STA613/CBB540: Statistical methods in computational biology Generalized Linear Models (1/29/13) Lecturer: Barbara Engelhardt Scribe: Yangxiaolu Cao When processing discrete data, two commonly used probability
More informationM i t o c h o n d r i a
M i t o c h o n d r i a Dr. Diala Abu-Hassan School of Medicine dr.abuhassand@gmail.com Mitochondria Function: generation of metabolic energy in eukaryotic cells Generation of ATP from the breakdown of
More information- mutations can occur at different levels from single nucleotide positions in DNA to entire genomes.
February 8, 2005 Bio 107/207 Winter 2005 Lecture 11 Mutation and transposable elements - the term mutation has an interesting history. - as far back as the 17th century, it was used to describe any drastic
More informationGraphical Model Selection
May 6, 2013 Trevor Hastie, Stanford Statistics 1 Graphical Model Selection Trevor Hastie Stanford University joint work with Jerome Friedman, Rob Tibshirani, Rahul Mazumder and Jason Lee May 6, 2013 Trevor
More informationAP Bio Module 16: Bacterial Genetics and Operons, Student Learning Guide
Name: Period: Date: AP Bio Module 6: Bacterial Genetics and Operons, Student Learning Guide Getting started. Work in pairs (share a computer). Make sure that you log in for the first quiz so that you get
More informationName: SBI 4U. Gene Expression Quiz. Overall Expectation:
Gene Expression Quiz Overall Expectation: - Demonstrate an understanding of concepts related to molecular genetics, and how genetic modification is applied in industry and agriculture Specific Expectation(s):
More informationTESTING SIGNIFICANCE OF FEATURES BY LASSOED PRINCIPAL COMPONENTS. BY DANIELA M. WITTEN 1 AND ROBERT TIBSHIRANI 2 Stanford University
The Annals of Applied Statistics 2008, Vol. 2, No. 3, 986 1012 DOI: 10.1214/08-AOAS182 Institute of Mathematical Statistics, 2008 TESTING SIGNIFICANCE OF FEATURES BY LASSOED PRINCIPAL COMPONENTS BY DANIELA
More informationSUPPLEMENTARY MATERIAL. The table lists the 437 proteins proposed to be myristoylated in A. thaliana. The AGI
SUPPLEMENTARY MATERIAL Table s1: the N-myristoylome in A. thaliana The table lists the 437 proteins proposed to be myristoylated in A. thaliana. The AGI entries were re-annotated according to the most
More informationPenalized classification using Fisher s linear discriminant
J. R. Statist. Soc. B (2011) 73, Part 5, pp. 753 772 Penalized classification using Fisher s linear discriminant Daniela M. Witten University of Washington, Seattle, USA and Robert Tibshirani Stanford
More informationQuantitative Genetics & Evolutionary Genetics
Quantitative Genetics & Evolutionary Genetics (CHAPTER 24 & 26- Brooker Text) May 14, 2007 BIO 184 Dr. Tom Peavy Quantitative genetics (the study of traits that can be described numerically) is important
More informationSTATS 306B: Unsupervised Learning Spring Lecture 13 May 12
STATS 306B: Unsupervised Learning Spring 2014 Lecture 13 May 12 Lecturer: Lester Mackey Scribe: Jessy Hwang, Minzhe Wang 13.1 Canonical correlation analysis 13.1.1 Recap CCA is a linear dimensionality
More informationScale in the biological world
Scale in the biological world 2 A cell seen by TEM 3 4 From living cells to atoms 5 Compartmentalisation in the cell: internal membranes and the cytosol 6 The Origin of mitochondria: The endosymbion hypothesis
More informationIndirect multivariate response linear regression
Biometrika (2016), xx, x, pp. 1 22 1 2 3 4 5 6 C 2007 Biometrika Trust Printed in Great Britain Indirect multivariate response linear regression BY AARON J. MOLSTAD AND ADAM J. ROTHMAN School of Statistics,
More informationLinear models for the joint analysis of multiple. array-cgh profiles
Linear models for the joint analysis of multiple array-cgh profiles F. Picard, E. Lebarbier, B. Thiam, S. Robin. UMR 5558 CNRS Univ. Lyon 1, Lyon UMR 518 AgroParisTech/INRA, F-75231, Paris Statistics for
More informationBioinformatics. Transcriptome
Bioinformatics Transcriptome Jacques.van.Helden@ulb.ac.be Université Libre de Bruxelles, Belgique Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe) http://www.bigre.ulb.ac.be/ Bioinformatics
More informationAn Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models
Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS023) p.3938 An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Vitara Pungpapong
More informationChapter 17. From Gene to Protein. Biology Kevin Dees
Chapter 17 From Gene to Protein DNA The information molecule Sequences of bases is a code DNA organized in to chromosomes Chromosomes are organized into genes What do the genes actually say??? Reflecting
More informationAnalysis of MALDI-TOF Data: from Data Preprocessing to Model Validation for Survival Outcome
Analysis of MALDI-TOF Data: from Data Preprocessing to Model Validation for Survival Outcome Heidi Chen, Ph.D. Cancer Biostatistics Center Vanderbilt University School of Medicine March 20, 2009 Outline
More informationUNIT 5. Protein Synthesis 11/22/16
UNIT 5 Protein Synthesis IV. Transcription (8.4) A. RNA carries DNA s instruction 1. Francis Crick defined the central dogma of molecular biology a. Replication copies DNA b. Transcription converts DNA
More informationMolecular Biology of the Cell
Alberts Johnson Lewis Morgan Raff Roberts Walter Molecular Biology of the Cell Sixth Edition Chapter 6 (pp. 333-368) How Cells Read the Genome: From DNA to Protein Copyright Garland Science 2015 Genetic
More informationBiology Notes 2. Mitosis vs Meiosis
Biology Notes 2 Mitosis vs Meiosis Diagram Booklet Cell Cycle (bottom corner draw cell in interphase) Mitosis Meiosis l Meiosis ll Cell Cycle Interphase Cell spends the majority of its life in this phase
More information16 The Cell Cycle. Chapter Outline The Eukaryotic Cell Cycle Regulators of Cell Cycle Progression The Events of M Phase Meiosis and Fertilization
The Cell Cycle 16 The Cell Cycle Chapter Outline The Eukaryotic Cell Cycle Regulators of Cell Cycle Progression The Events of M Phase Meiosis and Fertilization Introduction Self-reproduction is perhaps
More informationProteomics. 2 nd semester, Department of Biotechnology and Bioinformatics Laboratory of Nano-Biotechnology and Artificial Bioengineering
Proteomics 2 nd semester, 2013 1 Text book Principles of Proteomics by R. M. Twyman, BIOS Scientific Publications Other Reference books 1) Proteomics by C. David O Connor and B. David Hames, Scion Publishing
More informationDirichlet process Bayesian clustering with the R package PReMiuM
Dirichlet process Bayesian clustering with the R package PReMiuM Dr Silvia Liverani Brunel University London July 2015 Silvia Liverani (Brunel University London) Profile Regression 1 / 18 Outline Motivation
More informationStatistical testing. Samantha Kleinberg. October 20, 2009
October 20, 2009 Intro to significance testing Significance testing and bioinformatics Gene expression: Frequently have microarray data for some group of subjects with/without the disease. Want to find
More informationPrinciples of Genetics
Principles of Genetics Snustad, D ISBN-13: 9780470903599 Table of Contents C H A P T E R 1 The Science of Genetics 1 An Invitation 2 Three Great Milestones in Genetics 2 DNA as the Genetic Material 6 Genetics
More informationSequence analysis and comparison
The aim with sequence identification: Sequence analysis and comparison Marjolein Thunnissen Lund September 2012 Is there any known protein sequence that is homologous to mine? Are there any other species
More informationData visualization and clustering: an application to gene expression data
Data visualization and clustering: an application to gene expression data Francesco Napolitano Università degli Studi di Salerno Dipartimento di Matematica e Informatica DAA Erice, April 2007 Thanks to
More informationFuzzy Clustering of Gene Expression Data
Fuzzy Clustering of Gene Data Matthias E. Futschik and Nikola K. Kasabov Department of Information Science, University of Otago P.O. Box 56, Dunedin, New Zealand email: mfutschik@infoscience.otago.ac.nz,
More informationMathematics, Genomics, and Cancer
School of Informatics IUB April 6, 2009 Outline Introduction Class Comparison Class Discovery Class Prediction Example Biological states and state modulation Software Tools Research directions Math & Biology
More informationCHAPTER4 Translation
CHAPTER4 Translation 4.1 Outline of Translation 4.2 Genetic Code 4.3 trna and Anticodon 4.4 Ribosome 4.5 Protein Synthesis 4.6 Posttranslational Events 4.1 Outline of Translation From mrna to protein
More informationReading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype
Lecture Series 7 From DNA to Protein: Genotype to Phenotype Reading Assignments Read Chapter 7 From DNA to Protein A. Genes and the Synthesis of Polypeptides Genes are made up of DNA and are expressed
More informationTable 5. Genes unique to G. thermodenitrificans NG80-2 Gene ID Gene name Gene product COG functional category
Table 5. Genes unique to G. thermodenitrificans NG80-2 GT0030 gt30 Methionine--tRNA ligase/methionyl-trna synthetase COG0143, Translation GT0033 gt33 Unknown GT0106 gt106 Ribosomal protein L3 COG0087,
More informationPCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani
PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given
More informationStudent Learning Outcomes: Nucleus distinguishes Eukaryotes from Prokaryotes
9 The Nucleus Student Learning Outcomes: Nucleus distinguishes Eukaryotes from Prokaryotes Explain general structures of Nuclear Envelope, Nuclear Lamina, Nuclear Pore Complex Explain movement of proteins
More informationFitness constraints on horizontal gene transfer
Fitness constraints on horizontal gene transfer Dan I Andersson University of Uppsala, Department of Medical Biochemistry and Microbiology, Uppsala, Sweden GMM 3, 30 Aug--2 Sep, Oslo, Norway Acknowledgements:
More informationObjective 3.01 (DNA, RNA and Protein Synthesis)
Objective 3.01 (DNA, RNA and Protein Synthesis) DNA Structure o Discovered by Watson and Crick o Double-stranded o Shape is a double helix (twisted ladder) o Made of chains of nucleotides: o Has four types
More informationLARGE NUMBERS OF EXPLANATORY VARIABLES. H.S. Battey. WHAO-PSI, St Louis, 9 September 2018
LARGE NUMBERS OF EXPLANATORY VARIABLES HS Battey Department of Mathematics, Imperial College London WHAO-PSI, St Louis, 9 September 2018 Regression, broadly defined Response variable Y i, eg, blood pressure,
More information9/11/18. Molecular and Cellular Biology. 3. The Cell From Genes to Proteins. key processes
Molecular and Cellular Biology Animal Cell ((eukaryotic cell) -----> compare with prokaryotic cell) ENDOPLASMIC RETICULUM (ER) Rough ER Smooth ER Flagellum Nuclear envelope Nucleolus NUCLEUS Chromatin
More informationHonors Biology Reading Guide Chapter 11
Honors Biology Reading Guide Chapter 11 v Promoter a specific nucleotide sequence in DNA located near the start of a gene that is the binding site for RNA polymerase and the place where transcription begins
More informationTopographic Mapping and Dimensionality Reduction of Binary Tensor Data of Arbitrary Rank
Topographic Mapping and Dimensionality Reduction of Binary Tensor Data of Arbitrary Rank Peter Tiňo, Jakub Mažgút, Hong Yan, Mikael Bodén Topographic Mapping and Dimensionality Reduction of Binary Tensor
More information