GWAS: Genotype-Phenotype Association. CMSC858P Spring 2012, Hector Corrada Bravo, University of Maryland.
GWAS

Genome-wide association studies
- Scans for SNPs (or other structural variants) that show association with some phenotype
- categorical phenotypes: age-related macular degeneration
- continuous phenotypes (QTL): blood pressure
- Commonly: 10^3 samples, 10^6 SNPs

logistic regression
- Binary outcome, disease/no disease: $\theta(x) = \Pr\{y = 1 \mid x\}$
- Predictors (genotypes): $x$
- Estimate log odds ratio; $f$ is linear:

$$f(x) = \log \frac{\theta(x)}{1 - \theta(x)} = \beta_0 + \beta_1 x$$

Encoding genotype data
- We usually think of major/minor alleles, where the minor allele occurs at a lower frequency in the population (e.g., 5%)
- haplotype:
  - minor allele: AA, Aa -> x=0; aa -> x=1
  - major allele: AA, Aa -> x=1; aa -> x=0
  - both: two variables x1, x2 (e.g., aa -> x1=1, x2=1), etc.
- genotype (dosage): AA -> x=0; Aa -> x=1; aa -> x=2
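The dosage encoding and the linear log-odds model above can be sketched in a few lines of Python. This is only an illustration: the function names and the coefficient values `BETA0` and `BETA1` are hypothetical, not taken from the slides.

```python
import math

def dosage(genotype: str) -> int:
    """Dosage encoding: number of copies of the minor allele 'a'."""
    return genotype.count("a")

def log_odds(x: float, beta0: float, beta1: float) -> float:
    """Linear predictor f(x) = beta0 + beta1 * x."""
    return beta0 + beta1 * x

def disease_probability(x: float, beta0: float, beta1: float) -> float:
    """theta(x) = Pr{y = 1 | x}, obtained by inverting the log odds."""
    return 1.0 / (1.0 + math.exp(-log_odds(x, beta0, beta1)))

# Hypothetical coefficients, chosen only for illustration.
BETA0, BETA1 = -2.0, 0.5

for genotype in ("AA", "Aa", "aa"):
    x = dosage(genotype)
    print(genotype, x, round(disease_probability(x, BETA0, BETA1), 3))
```

In a real analysis the coefficients would be estimated from case/control data, one SNP at a time.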
Interpretation
- Odds of outcome for, e.g., genotype AA:

$$\frac{P(Y = 1 \mid X = 0)}{P(Y = 0 \mid X = 0)} = e^{\beta_0}$$

- Odds of outcome for, e.g., genotype Aa:

$$\frac{P(Y = 1 \mid X = 1)}{P(Y = 0 \mid X = 1)} = e^{\beta_0 + \beta_1}$$

- Odds ratio:

$$\frac{P(Y = 1 \mid X = 1) / P(Y = 0 \mid X = 1)}{P(Y = 1 \mid X = 0) / P(Y = 0 \mid X = 0)} = e^{\beta_1}$$

GWAS
- Discovering association: how unexpected is this odds ratio?
- Expensive and pervasive...
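As a quick numerical check of the interpretation above, the following sketch computes the odds for X = 0 and X = 1 under a logistic model and confirms that their ratio equals e^{β1}. The coefficient values `b0` and `b1` are hypothetical.

```python
import math

def prob(x: float, b0: float, b1: float) -> float:
    """P(Y = 1 | X = x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(p: float) -> float:
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients for illustration only.
b0, b1 = -1.5, 0.4

odds_AA = odds(prob(0, b0, b1))   # genotype AA, X = 0
odds_Aa = odds(prob(1, b0, b1))   # genotype Aa, X = 1
odds_ratio = odds_Aa / odds_AA

print(odds_AA, math.exp(b0))      # both equal e^{b0}
print(odds_ratio, math.exp(b1))   # both equal e^{b1}
```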
Published Genome-Wide Associations through 12/2010: 1212 published GWA at p < 5x10^-8 for 210 traits (NHGRI GWA Catalog)

[Figure: NHGRI GWA Catalog map of published associations, with trait legend; slide annotation: "Most diseases here"]

GWAS
- Testing for marginal effects is limited
- Epistasis, interactions
- Environment/risk factors, unaccounted dependencies
- Not all SNPs are created equal (annotation)
Examining the relative influence of familial, genetic, and environmental covariate information in flexible risk models
Héctor Corrada Bravo (a), Kristine E. Lee (b), Barbara E. K. Klein (b), Ronald Klein (b), Sudha K. Iyengar (c), and Grace Wahba (d)
(a) Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205
(b) Department of Ophthalmology and Visual Science, University of Wisconsin, Madison, WI 53706
(c) Departments of Epidemiology and Biostatistics, Genetics, and Ophthalmology, Case Western Reserve University, Cleveland, OH 44106
(d) Departments of Statistics, Biostatistics and Medical Informatics, and Computer Sciences, University of Wisconsin, Madison, WI 53706
Contributed by Grace Wahba, March 9, 2009 (sent for review February 22, 2009)

Environment/risk factors, unaccounted dependencies
- How to incorporate subject dependence

Splines and BDES
- History of Smoothing Spline (SS) models for analyzing BDES data [Wahba et al. 1998a,b, 1999, 2000, 2002, 2006]
- In particular, SS-ANOVA model of pigmentary abnormalities [Ann. Statistics 28 (2000)]

SS-ANOVA [Ann. Statistics 28 (2000)]
Model for pigmentary abnormalities (PA), female BDES I subjects:

$$f(t) = \mu + f_1(\text{sysbp}) + f_2(\text{chol}) + f_{12}(\text{sysbp}, \text{chol}) + d_{\text{age}} \cdot \text{age} + d_{\text{bmi}} \cdot \text{bmi} + d_{\text{horm}} I_1(\text{horm}) + d_{\text{hist}} I_2(\text{hist})$$

where $I_1(\text{horm})$ indicates hormone replacement (yes/no) and $I_2(\text{hist})$ indicates a history of heavy drinking.

SS-ANOVA
[Figure: estimated probability versus cholesterol, one curve per sysbp level, in panels by bmi (24.6, 28, 32.2) and age (55, 66, 73)]
- nonlinear protective effect of cholesterol
SS-ANOVA (w/ ARMS2)
- Recent results linking variation in specific genetic regions and AMD (age-related macular degeneration)
- In particular, CFH and LOC387715 (ARMS2) genes

SS-ANOVA
[Figure: estimated probability versus cholesterol, one curve per sysbp level, in panels by snp2 genotype (11, 12, 22) and age (48.5, 59.5, 69.5, 80.5)]
- protective effect gone

Pedigrees
[Figure: pedigree diagram; legend: male, female, PA present, PA absent]

Pedigree Distance
- Use Malécot's kinship coefficient ($\varphi$): for subjects i and j, the probability that randomly chosen alleles, one from each subject, are identical by descent
- e.g. parent-offspring: 1/4
- e.g. siblings: 1/4
- Pedigree distance: $(1 - 2\varphi)$
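The kinship values quoted above (1/4 for parent-offspring and for siblings) can be reproduced with the standard recursive definition of the kinship coefficient on a pedigree. The sketch below is an illustration only: the toy pedigree `PEDIGREE` and all names are hypothetical, and it assumes non-inbred founders listed before their descendants.

```python
from functools import lru_cache

# Hypothetical toy pedigree: individual -> (father, mother), or None for founders.
# Parents must be listed before their children.
PEDIGREE = {
    "dad": None,
    "mom": None,
    "kid1": ("dad", "mom"),
    "kid2": ("dad", "mom"),
}
ORDER = {name: rank for rank, name in enumerate(PEDIGREE)}

@lru_cache(maxsize=None)
def kinship(i: str, j: str) -> float:
    """Probability that random alleles from i and j are identical by descent."""
    # Make i the later-listed individual, so i cannot be an ancestor of j.
    if ORDER[i] < ORDER[j]:
        i, j = j, i
    parents = PEDIGREE[i]
    if i == j:
        if parents is None:
            return 0.5  # non-inbred founder
        return 0.5 * (1.0 + kinship(*parents))
    if parents is None:
        return 0.0  # a founder is unrelated to anyone who is not a descendant
    father, mother = parents
    return 0.5 * (kinship(father, j) + kinship(mother, j))

print(kinship("dad", "kid1"))   # parent-offspring
print(kinship("kid1", "kid2"))  # full siblings
```

Recursing on the later-listed individual is what keeps the rule valid: an individual's alleles are traced back to its parents only when it cannot itself be an ancestor of the other subject.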
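Relationship Graph
[Figure: example pedigree and the corresponding relationship graph, with a table of distance by relationship: sibs, avuncular, first-cousins, unrelated]

Metric embeddings
- We will extend the SS-ANOVA model with an encoding of this relationship graph
- Interpretation: embedding gives relationship pseudo-attributes over which smooth functions can be estimated

Extensions

$$\begin{aligned}
f(t) = \mu &+ d_{\text{SNP1},1} I(X_1 = 12) + d_{\text{SNP1},2} I(X_1 = 22) + d_{\text{SNP2},1} I(X_2 = 12) + d_{\text{SNP2},2} I(X_2 = 22) \\
&+ f_1(\text{sysbp}) + f_2(\text{chol}) + f_{12}(\text{sysbp}, \text{chol}) + d_{\text{age}} \cdot \text{age} + d_{\text{bmi}} \cdot \text{bmi} + d_{\text{horm}} I_1(\text{horm}) + d_{\text{hist}} I_2(\text{hist}) + d_{\text{smoke}} I_3(\text{smoke}) \\
&+ h(z(t))
\end{aligned}$$

- first row: SNP data
- second row: environmental covariates
- third row: pedigree data

Comparison to Covariate-Only Model
[Figure: percent change in mean AUC w.r.t. the C-only model, for the models C only, S only, S+C, P only, S+P, C+P, S+C+P (S = SNP data, C = environmental covariates, P = pedigree data)]
[Corrada Bravo, et al., PNAS 2009]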
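The model comparison above is reported as a percent change in mean AUC relative to the covariate-only model. A minimal sketch of that metric follows, using the rank-based (Mann-Whitney) definition of AUC. The scores and labels are made-up illustration data, not results from the paper, and the function names are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control
    (ties count one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))

def percent_change(auc_model, auc_baseline):
    """Percent change in AUC relative to a baseline model."""
    return 100.0 * (auc_model - auc_baseline) / auc_baseline

# Made-up predictions from two hypothetical models on the same subjects.
labels = [1, 0, 1, 0, 1, 0]
covariate_only = [0.6, 0.5, 0.4, 0.3, 0.7, 0.6]
full_model = [0.8, 0.4, 0.6, 0.3, 0.9, 0.5]

print(percent_change(auc(full_model, labels), auc(covariate_only, labels)))
```

In practice the AUC would be averaged over cross-validation folds before taking the percent change, which is what "mean AUC" suggests.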
Epistasis
- Testing marginal effects is limited
- We want to test interactions (epistasis)
- Modeling is straightforward: add non-linear interaction terms to logistic regression model
- Computationally, it's a problem: we started with 10^6 SNPs...

RAPID detection of gene-gene interactions in genome-wide association studies
Dumitru Brinza (1), Matthew Schultz (2), Glenn Tesler (3) and Vineet Bafna (4)
(1) Life Technologies, Foster City, CA; (2) Graduate Bioinformatics Program, (3) Department of Mathematics and (4) Department of Computer Science and Engineering, Institute for Genomic Medicine, University of California, San Diego, CA, USA
Bioinformatics, Vol. 26, Original Paper (Genetics and population analysis), doi:10.1093/bioinformatics/btq529. Advance Access publication September 24, 2010. Associate Editor: Jeffrey Barrett

A filtering approach:
- Discover possible interactions quickly
- Test good candidates completely

RAPID
If two SNPs (x and y) associate with disease (d) then at least one of the following must hold:
1. x associates with d
2. y associates with d
3. x associates with y in cases
4. x associates with y in controls
RAPID finds SNPs where 3 holds.

RAPID
Look at cases only, and define a vector for each SNP as:

$$v_x(a) = \frac{x_a - P_x}{\sqrt{n P_x (1 - P_x)}}, \qquad x_a \in \{0, 1\}$$

where $a$ indexes the $n$ cases and $P_x$ is the proportion of 1s for SNP $x$.
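To make the computational problem on this page concrete, the sketch below counts the pairwise tests implied by 10^6 SNPs and writes out a logistic model with one interaction term. The product term `b12 * x1 * x2` is an assumption here (a common way to add an interaction), and all names are hypothetical.

```python
import math

N_SNPS = 10**6

# Number of distinct SNP pairs that an exhaustive pairwise scan would test.
n_pairs = math.comb(N_SNPS, 2)
print(n_pairs)

def interaction_log_odds(x1, x2, b0, b1, b2, b12):
    """Log odds with two marginal effects and one product interaction term."""
    return b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2
```

With roughly 5 x 10^11 pairs, fitting one such model per pair is what motivates a filter-then-test strategy such as RAPID.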
RAPID

$$\mathrm{dist}(v_x, v_y) = \sqrt{2 - 2\sqrt{\chi^2_{x,y} / n}}$$

- $\chi^2_{x,y}$: association between x and y
- Statistical association is now a geometric problem

RAPID
- Use random projections to find possible interacting pairs:

$$\mathrm{Hash}(x, r, B) = \left\lfloor \frac{v_x \cdot r}{B} \right\rfloor$$

- Do this repeatedly, to avoid false positives
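The link between the chi-square statistic and Euclidean distance can be checked numerically. The sketch below assumes binary SNP codes and the standardization v_x(a) = (x_a - P_x) / sqrt(n P_x (1 - P_x)); under that assumption the squared distance equals 2 - 2 sqrt(chi^2 / n) for positively associated SNPs. The bucketing function is an assumed random-projection hash (a floor of the projection divided by a bucket width), and the data and names are hypothetical.

```python
import math
import random

def snp_vector(x):
    """Standardize a 0/1 SNP vector so that it has unit Euclidean norm."""
    n = len(x)
    p = sum(x) / n  # proportion of 1s
    scale = math.sqrt(n * p * (1 - p))
    return [(xi - p) / scale for xi in x]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def chi_square_2x2(x, y):
    """Pearson chi-square statistic for the 2x2 table of two 0/1 vectors."""
    n = len(x)
    counts = [[0, 0], [0, 0]]
    for xi, yi in zip(x, y):
        counts[xi][yi] += 1
    rows = [sum(counts[0]), sum(counts[1])]
    cols = [counts[0][0] + counts[1][0], counts[0][1] + counts[1][1]]
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = rows[i] * cols[j] / n
            stat += (counts[i][j] - expected) ** 2 / expected
    return stat

def bucket(v, r, width):
    """Assumed hash: bucket index of the projection of v onto direction r."""
    return math.floor(sum(a * b for a, b in zip(v, r)) / width)

# Made-up genotypes for two positively associated SNPs in n = 10 cases.
x = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
y = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
vx, vy = snp_vector(x), snp_vector(y)

lhs = squared_distance(vx, vy)
rhs = 2 - 2 * math.sqrt(chi_square_2x2(x, y) / len(x))
print(lhs, rhs)

rng = random.Random(0)
r = [rng.gauss(0.0, 1.0) for _ in x]  # one random projection direction
print(bucket(vx, r, 0.5), bucket(vy, r, 0.5))
```

Strongly associated SNPs are close in this geometry, so they tend to share a bucket; repeating with several random directions reduces chance collisions, which matches the "do this repeatedly" step.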
Interactions/Epistasis
- A MAJOR problem
- Inherently computational and statistical
- We are nowhere close
- We will be inundated with data (sequencing)

Learning a Prior on Regulatory Potential from eQTL Data
Su-In Lee (1), Aimée M. Dudley (2), David Drubin (3), Pamela A. Silver (3), Nevan J. Krogan (4), Dana Pe'er (5), Daphne Koller (1)
(1) Computer Science Department, Stanford University, Stanford, California, United States of America
(2) Institute for Systems Biology, Seattle, Washington, United States of America
(3) Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
(4) Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California, United States of America
(5) Department of Biological Sciences, Columbia University, New York, New York, United States of America

SNP annotation
- Outcome is gene expression (eQTL)
- The goal is to learn regulatory programs
- Potential for a mutation to have an effect on expression depends on SNP features

Regulatory programs
Linear regression of gene expression ($y$) on SNPs ($x$):

$$y_{m,g} = w_{m,1} x_1 + w_{m,2} x_2 + \dots + w_{m,N} x_N + \varepsilon, \quad \text{for all } g\text{'s in module } m$$
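Regulatory Potential
- SNP features $f_{n,k}$:

$$\Pr(\text{SNP } n \text{ causes variation in expression levels of genes}) = \mathrm{sigmoid}\Big(\sum_k \beta_k f_{n,k}\Big), \qquad \mathrm{sigmoid}(t) = \frac{1}{1 + \exp(-t)}$$

- Yeast dataset

Hierarchical model
- Regulatory programs: linear regression of gene expression on SNPs:

$$y_{m,g} = w_{m,1} x_1 + w_{m,2} x_2 + \dots + w_{m,N} x_N + \varepsilon, \quad \text{for all } g\text{'s in module } m$$

- Prior on the weights:

$$\Pr(w_r) \propto \exp(-C_r |w_r|)$$

Estimation

$$\text{minimize} \quad \sum_{\text{module } m} \Big[ \sum_{\text{gene } g} \Big( y_{m,g} - \sum_{\text{regulator } r} w_{m,r} x_r \Big)^2 + \sum_{\text{regulator } r} C_r |w_{m,r}| + D \sum_{\text{regulator } r} w_{m,r}^2 \Big] + E \sum_k \beta_k^2$$

$$C_r = C_1 \Pr(\text{Regulator } r \text{ is causal}) + C_0 \, [1 - \Pr(\text{Regulator } r \text{ is causal})]$$

- The first term is the expression fit
- SNP selection uses regulatory potential
- Parameters estimated iteratively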
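A small sketch of how the regulatory potential feeds into the sparsity penalty: the potential is a sigmoid of a weighted sum of SNP features, and the per-regulator L1 weight interpolates between two constants according to that probability. The feature values, the coefficients, and the names `c1` and `c0` are hypothetical illustrations, not values from the paper.

```python
import math

def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))

def regulatory_potential(features, betas) -> float:
    """Pr(SNP is causal) = sigmoid(sum_k beta_k * f_k)."""
    return sigmoid(sum(b * f for b, f in zip(betas, features)))

def l1_penalty_weight(p_causal: float, c1: float, c0: float) -> float:
    """C_r = c1 * Pr(causal) + c0 * (1 - Pr(causal))."""
    return c1 * p_causal + c0 * (1.0 - p_causal)

# Hypothetical binary SNP features (e.g. "in coding region", "conserved")
# and hypothetical feature weights.
features = [1.0, 0.0, 1.0]
betas = [1.2, 0.4, -0.3]

p = regulatory_potential(features, betas)
print(p, l1_penalty_weight(p, c1=0.1, c0=5.0))
```

With `c1` smaller than `c0`, a SNP with high regulatory potential is penalized less and is therefore more likely to be selected as a regulator, which is one way to read "SNP selection uses regulatory potential".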
Where are SNPs with largest regulatory potential?
More informationPCA and admixture models
PCA and admixture models CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar, Alkes Price PCA and admixture models 1 / 57 Announcements HW1
More informationGBLUP and G matrices 1
GBLUP and G matrices 1 GBLUP from SNP-BLUP We have defined breeding values as sum of SNP effects:! = #$ To refer breeding values to an average value of 0, we adopt the centered coding for genotypes described
More informationIntroduction to QTL mapping in model organisms
Introduction to QTL mapping in model organisms Karl Broman Biostatistics and Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman Backcross P 1 P 2 P 1 F 1 BC 4
More information2003 National Name Exchange Annual Report
2003 National Name Exchange Annual Report Executive Summary 28 th annual meeting Hilton, University of Florida Conference Center April 16, 2004 Hosted by the University of Florida http://www.grad.washington.edu/nameexch/national/
More informationThe phenotype of this worm is wild type. When both genes are mutant: The phenotype of this worm is double mutant Dpy and Unc phenotype.
Series 2: Cross Diagrams - Complementation There are two alleles for each trait in a diploid organism In C. elegans gene symbols are ALWAYS italicized. To represent two different genes on the same chromosome:
More informationThe Sum of Standardized Residuals: Goodness of Fit Test for Binary Response Model
The Sum of Standardized Residuals: Goodness of Fit Test for Binary Response Model Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of
More informationLogistic Regression Review Fall 2012 Recitation. September 25, 2012 TA: Selen Uguroglu
Logistic Regression Review 10-601 Fall 2012 Recitation September 25, 2012 TA: Selen Uguroglu!1 Outline Decision Theory Logistic regression Goal Loss function Inference Gradient Descent!2 Training Data
More informationFriday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo
Friday Harbor 2017 From Genetics to GWAS (Genome-wide Association Study) Sept 7 2017 David Fardo Purpose: prepare for tomorrow s tutorial Genetic Variants Quality Control Imputation Association Visualization
More informationPredicting Protein Functions and Domain Interactions from Protein Interactions
Predicting Protein Functions and Domain Interactions from Protein Interactions Fengzhu Sun, PhD Center for Computational and Experimental Genomics University of Southern California Outline High-throughput
More informationCombining dependent tests for linkage or association across multiple phenotypic traits
Biostatistics (2003), 4, 2,pp. 223 229 Printed in Great Britain Combining dependent tests for linkage or association across multiple phenotypic traits XIN XU Program for Population Genetics, Harvard School
More informationLatent Variable models for GWAs
Latent Variable models for GWAs Oliver Stegle Machine Learning and Computational Biology Research Group Max-Planck-Institutes Tübingen, Germany September 2011 O. Stegle Latent variable models for GWAs
More information