Calculation of IBD probabilities
|
|
- Russell Carroll
- 5 years ago
- Views:
Transcription
1 Calculation of IBD probabilities David Evans and Stacey Cherny University of Oxford Wellcome Trust Centre for Human Genetics
2 This Session IBD vs IBS Why is IBD important? Calculating IBD probabilities Lander-Green Algorithm MERLIN) Single locus probabilities Hidden Markov Model Other ways of calculating IBD status Elston-Stewart Algorithm MCMC approaches MERLIN Practical Example IBD determination Information content mapping SNPs vs micro-satellite markers?
3 Aim of Gene Mapping Experiments Identify variants that control interesting traits Susceptibility to human disease Phenotypic variation in the population The hypothesis Individuals sharing these variants will be more similar for traits they control The difficulty Testing ~0 million variants is impractical
4 Identity-by-Descent IBD) Two alleles are IBD if they are descended from the same ancestral allele If a stretch of chromosome is IBD among a set of individuals, ALL variants within that stretch will also be shared IBD markers, QTLs, disease genes) Allows surveys of large amounts of variation even when a few polymorphisms measured
5 A Segregating Disease Allele +/+ +/mut +/mut +/+ +/mut +/mut +/+ +/mut +/+ All affected individuals IBD for disease causing mutation
6 Segregating Chromosomes MARKER DISEASE LOCUS Affected individuals tend to share adjacent areas of chromosome IBD
7 Marker Shared Among Affecteds /2 3/4 3/4 /3 2/4 /4 4/4 4/4 /4 4 allele segregates with disease
8 Why is IBD sharing important? /2 3/4 IBD sharing forms the basis of nonparametric linkage statistics 3/4 /3 2/4 /4 4/4 4/4 /4 Affected relatives tend to share marker alleles close to the disease locus IBD more often than chance
9 Linkage between QTL and marker QTL Marker IBD 0 IBD IBD 2
10 NO Linkage between QTL and marker Marker
11 IBD vs IBS Identical by Descent and Identical by State Identical by state only
12 Example: IBD in Siblings Consider a mating between mother AB x father CD: Sib AC AD BC BD Sib 2 AC AD BC 0 2 BD 0 2 IBD 0 : : 2 25% : 50% : 25%
13 IBD can be trivial / 2 / 2 IBD0 / 2 / 2
14 Two Other Simple Cases / 2 / 2 / 2 / 2 IBD2 / / 2 2 / 2 2 /
15 A little more complicated / 2 2 / 2 IBD 50% chance) / 2 / 2 IBD2 50% chance)
16 And even more complicated IBD? / /
17 Bayes Theorem j A j B P A j P A i B P A i P B P A i B P A i P B P B B A i P ) ) ) ) ) ) ) ), P ) ) A i
18 Bayes Theorem for IBD Probabilities j j IBD G P j IBD P i IBD G P i IBD P G P i IBD G P i IBD P G P G i G i IBD P ) ) ) ) ) ) ) ) ), PIBD )
19 PMarker Genotype IBD State) Sib Sib 2 k0 k k2 A A A A p 4 p 3 p 2 A A A A 2 2p 3 p 2 p 2 p 2 0 A A A 2 A 2 p 2 p A A 2 A A 2p 3 p 2 p 2 p 2 0 A A 2 A A 2 4p 2 p 2 2 p p 2 2p p 2 A A 2 A 2 A 2 2p p 3 2 p p A 2 A 2 A A p 2 p A 2 A 2 A 2p p p 2 p 3 A A p 4 p 3 p 2 2 A A 2 A 2 Pobserving genotypes / k alleles IBD)
20 Worked Example p 0.5 / /
21 Worked Example / / 9 4 ) 4 ) ) 2 ) 9 ) 4 ) ) 4 2) 8 ) 6 0) P G p G IBD P P G p G IBD P P G p G IBD P p p p P G p IBD P G p IBD P G p IBD P G p
22 For ANY PEDIGREE the inheritance pattern at every point in the genome can be completely described by a binary inheritance vector: vx) p, m, p 2, m 2,,p n,m n ) whose coordinates describe the outcome of the 2n paternal and maternal meioses giving rise to the n non-founders in the pedigree p i m i ) is 0 if the grandpaternal allele transmitted p i m i ) is if the grandmaternal allele is transmitted a / b c / d p p 2 m m 2 vx) [0,0,,] a / c b / d
23 Inheritance Vector In practice, it is not possible to determine the true inheritance vector at every point in the genome, rather we represent partial information as a probability distribution over the 2 2n possible inheritance vectors a b 2 a c a c p m p 2 m 2 a b b b Inheritance vector Prior Posterior /6 /8 000 /6 /8 000 / / /6 /8 00 /6 /8 00 /6 0 0 / /6 /8 00 /6 /8 00 /6 0 0 / /6 /8 0 /6 /8 0 /6 0 /6 0
24 Computer Representation Define inheritance vector v l Each inheritance vector indexed by a different memory location Likelihood for each gene flow pattern Conditional on observed genotypes at location l 2 2n elements!!! At each marker location l L L L L L L L L L L L L L L L L
25 a) bit-indexed array L L 2 L L 2 L L 2 L L 2 b) packed tree L L 2 L L 2 L L 2 L L 2 c) sparse tree Legend Node with zero likelihood Node identical to sibling L L 2 L L 2 Likelihood for this branch Abecasis et al 2002) Nat Genet 30:97-0
26 Multipoint IBD IBD status may not be able to be ascertained with certainty because e.g. the mating is not informative, parental information is not available IBD information at uninformative loci can be made more precise by examining nearby linked loci
27 Multipoint IBD a / b / c / d 2 / IBD 0 a / c b / d / / 2 IBD 0 or IBD?
28 Complexity of the Problem in Larger Pedigrees 2n meioses in pedigree with n nonfounders Each meiosis has 2 possible outcomes Therefore 2 2n possibilities for each locus For each genetic locus One location for each of m genetic markers Distinct, non-independent meiotic outcomes Up to 4 nm distinct outcomes!!!
29 Example: Sib-pair Genotyped at 0 Markers Inheritance vector m 0 Marker 2 2xn ) m 2 2 x 2 ) possible paths!!!
30 Lander-Green Algorithm The inheritance vector at a locus is conditionally independent of the inheritance vectors at all preceding loci given the inheritance vector at the immediately preceding locus Hidden Markov chain ) The conditional probability of an inheritance vector v i+ at locus i+, given the inheritance vector v i at locus i is θ ij -θ i ) 2n-j where θ is the recombination fraction and j is the number of changes in elements of the inheritance vector transition probabilities ) Example: Locus [0000] Locus 2 [000] Conditional probability θ) 3 θ
31 m Total Likelihood Q T Q 2 T 2 T m- Q m Q i P[0000] P[000] P[] T i -θ) 4 -θ) 3 θ θ 4 -θ) 3 θ -θ) 4 -θ)θ 3 θ 4 -θ)θ 3 -θ) 4 2 2n x 2 2n diagonal matrix of single locus probabilities at locus i 2 2n x 2 2n matrix of transitional probabilities between locus i and locus i+ ~0 x 2 2 x 2 ) 2 operations 2560 for this case!!!
32 PIBD) 2 at Marker Three Inheritance vector m 0 Marker L[IBD 2 at marker 3] / L[ALL] L[0000] + L[00] + L[00] + L[] ) / L[ALL]
33 PIBD) 2 at arbitrary position on the chromosome Inheritance vector m 0 Marker L[0000] + L[00] + L[00] + L[] ) / L[ALL]
34 Further speedups Trees summarize redundant information Portions of inheritance vector that are repeated Portions of inheritance vector that are constant or zero Use sparse-matrix by vector multiplication Regularities in transition matrices Use symmetries in divide and conquer algorithm Idury & Elston, 997)
35 Lander-Green Algorithm Summary Factorize likelihood by marker Complexity m e n Large number of markers e.g. dense SNP data) Relatively small pedigrees MERLIN, GENEHUNTER, ALLEGRO etc
36 Elston-Stewart Algorithm Factorize likelihood by individual Complexity n e m Small number of markers Large pedigrees With little inbreeding VITESSE etc
37 Other methods Number of MCMC methods proposed ~Linear on # markers ~Linear on # people Hard to guarantee convergence on very large datasets Many widely separated local minima E.g. SIMWALK, LOKI
38 MERLIN-- Multipoint Engine for Rapid Likelihood Inference
39 Capabilities Linkage Analysis NPL and K&C LOD Variance Components Haplotypes Most likely Sampling All IBD and info content Error Detection Most SNP typing errors are Mendelian consistent Recombination No. of recombinants per family per interval can be controlled Simulation
40 MERLIN Website Reference FAQ Source Tutorial Linkage Haplotyping Simulation Error detection IBD calculation Binaries
41 Test Case Pedigrees
42 Timings Marker Locations Top Generation Genotyped A x000) B C D Genehunter 38s 37s 8m6s * Allegro 8s 2m7s 3h54m3s * Merlin s 8s 3m55s * Top Generation Not Genotyped A x000) B C D Genehunter 45s m54s * * Allegro 8s m08s h2m38s * Merlin 3s 25s 5m50s *
43 Intuition: Approximate Sparse T Dense maps, closely spaced markers Small recombination fractions θ Reasonable to set θ k with zero Produces a very sparse transition matrix Consider only elements of v separated by <k recombination events At consecutive locations
44 Additional Speedup Time Memory Exact 40s 00 MB No recombination <s 4 MB recombinant 2s 7 MB 2 recombinants 5s 54 MB Genehunter 2. 6min 024MB Keavney et al 998) ACE data, 0 SNPs within gene, 4-8 individuals per family
45 Input Files Pedigree File Relationships Genotype data Phenotype data Data File Describes contents of pedigree file Map File Records location of genetic markers
46 Example Pedigree File <contents of example.ped> 0 0 x 3 3 x x x 4 4 x x x 2 x x x 4 3 x x <end of example.ped> Encodes family relationships, marker and phenotype information
47 Example Data File <contents of example.dat> T some_trait_of_interest M some_marker M another_marker <end of example.dat> Provides information necessary to decode pedigree file
48 Data File Field Codes Code M A T C Z Description Marker Genotype Affection Status. Quantitative Trait. Covariate. Zygosity.
49 Example Map File <contents of example.map> CHROMOSOME MARKER POSITION 2 D2S D2S <end of example.map> Indicates location of individual markers, necessary to derive recombination fractions between them
50 Worked Example p 0.5 P IBD 0 G) P IBD G) P IBD 2 G) / / merlin d example.dat p example.ped m example.map --ibd
51 Application: Information Content Mapping Information content: Provides a measure of how well a marker set approaches the goal of completely determining the inheritance outcome Based on concept of entropy E -ΣP i log 2 P i where P i is probability of the ith outcome I E x) Ex)/E 0 Always lies between 0 and Does not depend on test for linkage Scales linearly with power
52 Application: Information Content Mapping Simulations sib-pairs with/out parental genotypes) micro-satellite per 0cM ABI) microsatellite per 3cM decode) SNP per 0.5cM Illumina) SNP per 0.2 cm Affymetrix) Which panel performs best in terms of extracting marker information? Do the results depend upon the presence of parental genotypes? merlin d file.dat p file.ped m file.map --information --step --markernames
53 SNPs vs Microsatellites with parents Information Content Densities SNP microsat 0.2 cm 3 cm 0.5 cm 0 cm Position cm) SNPs + parents microsat + parents
54 SNPs vs Microsatellites without parents Information Content Densities SNP Densities microsat 0.2 cm SNP 3 microsat cm cm cm 0 cm 3 cm 0.5 cm 0 cm SNPs - parents microsat - parents Position cm)
Calculation of IBD probabilities
Calculation of IBD probabilities David Evans University of Bristol This Session Identity by Descent (IBD) vs Identity by state (IBS) Why is IBD important? Calculating IBD probabilities Lander-Green Algorithm
More informationThe Lander-Green Algorithm. Biostatistics 666 Lecture 22
The Lander-Green Algorithm Biostatistics 666 Lecture Last Lecture Relationship Inferrence Likelihood of genotype data Adapt calculation to different relationships Siblings Half-Siblings Unrelated individuals
More informationGenotype Imputation. Biostatistics 666
Genotype Imputation Biostatistics 666 Previously Hidden Markov Models for Relative Pairs Linkage analysis using affected sibling pairs Estimation of pairwise relationships Identity-by-Descent Relatives
More informationAffected Sibling Pairs. Biostatistics 666
Affected Sibling airs Biostatistics 666 Today Discussion of linkage analysis using affected sibling pairs Our exploration will include several components we have seen before: A simple disease model IBD
More informationTutorial Session 2. MCMC for the analysis of genetic data on pedigrees:
MCMC for the analysis of genetic data on pedigrees: Tutorial Session 2 Elizabeth Thompson University of Washington Genetic mapping and linkage lod scores Monte Carlo likelihood and likelihood ratio estimation
More informationMCMC IN THE ANALYSIS OF GENETIC DATA ON PEDIGREES
MCMC IN THE ANALYSIS OF GENETIC DATA ON PEDIGREES Elizabeth A. Thompson Department of Statistics, University of Washington Box 354322, Seattle, WA 98195-4322, USA Email: thompson@stat.washington.edu This
More informationLecture 9. QTL Mapping 2: Outbred Populations
Lecture 9 QTL Mapping 2: Outbred Populations Bruce Walsh. Aug 2004. Royal Veterinary and Agricultural University, Denmark The major difference between QTL analysis using inbred-line crosses vs. outbred
More information2. Map genetic distance between markers
Chapter 5. Linkage Analysis Linkage is an important tool for the mapping of genetic loci and a method for mapping disease loci. With the availability of numerous DNA markers throughout the human genome,
More informationThe Quantitative TDT
The Quantitative TDT (Quantitative Transmission Disequilibrium Test) Warren J. Ewens NUS, Singapore 10 June, 2009 The initial aim of the (QUALITATIVE) TDT was to test for linkage between a marker locus
More informationThe genomes of recombinant inbred lines
The genomes of recombinant inbred lines Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman C57BL/6 2 1 Recombinant inbred lines (by sibling mating)
More informationGene mapping, linkage analysis and computational challenges. Konstantin Strauch
Gene mapping, linkage analysis an computational challenges Konstantin Strauch Institute for Meical Biometry, Informatics, an Epiemiology (IMBIE) University of Bonn E-mail: strauch@uni-bonn.e Genetics an
More informationPrediction of the Confidence Interval of Quantitative Trait Loci Location
Behavior Genetics, Vol. 34, No. 4, July 2004 ( 2004) Prediction of the Confidence Interval of Quantitative Trait Loci Location Peter M. Visscher 1,3 and Mike E. Goddard 2 Received 4 Sept. 2003 Final 28
More informationAssociation Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5
Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative
More informationLecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017
Lecture 2: Genetic Association Testing with Quantitative Traits Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 29 Introduction to Quantitative Trait Mapping
More informationThe universal validity of the possible triangle constraint for Affected-Sib-Pairs
The Canadian Journal of Statistics Vol. 31, No.?, 2003, Pages???-??? La revue canadienne de statistique The universal validity of the possible triangle constraint for Affected-Sib-Pairs Zeny Z. Feng, Jiahua
More information1 Springer. Nan M. Laird Christoph Lange. The Fundamentals of Modern Statistical Genetics
1 Springer Nan M. Laird Christoph Lange The Fundamentals of Modern Statistical Genetics 1 Introduction to Statistical Genetics and Background in Molecular Genetics 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
More informationFor 5% confidence χ 2 with 1 degree of freedom should exceed 3.841, so there is clear evidence for disequilibrium between S and M.
STAT 550 Howework 6 Anton Amirov 1. This question relates to the same study you saw in Homework-4, by Dr. Arno Motulsky and coworkers, and published in Thompson et al. (1988; Am.J.Hum.Genet, 42, 113-124).
More informationp(d g A,g B )p(g B ), g B
Supplementary Note Marginal effects for two-locus models Here we derive the marginal effect size of the three models given in Figure 1 of the main text. For each model we assume the two loci (A and B)
More informationUNIT 8 BIOLOGY: Meiosis and Heredity Page 148
UNIT 8 BIOLOGY: Meiosis and Heredity Page 148 CP: CHAPTER 6, Sections 1-6; CHAPTER 7, Sections 1-4; HN: CHAPTER 11, Section 1-5 Standard B-4: The student will demonstrate an understanding of the molecular
More informationIntroduction to population genetics & evolution
Introduction to population genetics & evolution Course Organization Exam dates: Feb 19 March 1st Has everybody registered? Did you get the email with the exam schedule Summer seminar: Hot topics in Bioinformatics
More informationAnalytic power calculation for QTL linkage analysis of small pedigrees
(2001) 9, 335 ± 340 ã 2001 Nature Publishing Group All rights reserved 1018-4813/01 $15.00 www.nature.com/ejhg ARTICLE for QTL linkage analysis of small pedigrees FruÈhling V Rijsdijk*,1, John K Hewitt
More informationInference on pedigree structure from genome screen data. Running title: Inference on pedigree structure. Mary Sara McPeek. The University of Chicago
1 Inference on pedigree structure from genome screen data Running title: Inference on pedigree structure Mary Sara McPeek The University of Chicago Address for correspondence: Department of Statistics,
More informationBreeding Values and Inbreeding. Breeding Values and Inbreeding
Breeding Values and Inbreeding Genotypic Values For the bi-allelic single locus case, we previously defined the mean genotypic (or equivalently the mean phenotypic values) to be a if genotype is A 2 A
More informationLecture WS Evolutionary Genetics Part I 1
Quantitative genetics Quantitative genetics is the study of the inheritance of quantitative/continuous phenotypic traits, like human height and body size, grain colour in winter wheat or beak depth in
More informationVariance Component Models for Quantitative Traits. Biostatistics 666
Variance Component Models for Quantitative Traits Biostatistics 666 Today Analysis of quantitative traits Modeling covariance for pairs of individuals estimating heritability Extending the model beyond
More informationAFFECTED RELATIVE PAIR LINKAGE STATISTICS THAT MODEL RELATIONSHIP UNCERTAINTY
AFFECTED RELATIVE PAIR LINKAGE STATISTICS THAT MODEL RELATIONSHIP UNCERTAINTY by Amrita Ray BSc, Presidency College, Calcutta, India, 2001 MStat, Indian Statistical Institute, Calcutta, India, 2003 Submitted
More informationI Have the Power in QTL linkage: single and multilocus analysis
I Have the Power in QTL linkage: single and multilocus analysis Benjamin Neale 1, Sir Shaun Purcell 2 & Pak Sham 13 1 SGDP, IoP, London, UK 2 Harvard School of Public Health, Cambridge, MA, USA 3 Department
More informationGene mapping in model organisms
Gene mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman Goal Identify genes that contribute to common human diseases. 2
More information(Genome-wide) association analysis
(Genome-wide) association analysis 1 Key concepts Mapping QTL by association relies on linkage disequilibrium in the population; LD can be caused by close linkage between a QTL and marker (= good) or by
More informationChapter 6 Linkage Disequilibrium & Gene Mapping (Recombination)
12/5/14 Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination) Linkage Disequilibrium Genealogical Interpretation of LD Association Mapping 1 Linkage and Recombination v linkage equilibrium ²
More informationStatistical issues in QTL mapping in mice
Statistical issues in QTL mapping in mice Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman Outline Overview of QTL mapping The X chromosome Mapping
More informationComplexity and Approximation of the Minimum Recombination Haplotype Configuration Problem
Complexity and Approximation of the Minimum Recombination Haplotype Configuration Problem Lan Liu 1, Xi Chen 3, Jing Xiao 3, and Tao Jiang 1,2 1 Department of Computer Science and Engineering, University
More informationLecture 6. QTL Mapping
Lecture 6 QTL Mapping Bruce Walsh. Aug 2003. Nordic Summer Course MAPPING USING INBRED LINE CROSSES We start by considering crosses between inbred lines. The analysis of such crosses illustrates many of
More informationLinkage and Linkage Disequilibrium
Linkage and Linkage Disequilibrium Summer Institute in Statistical Genetics 2014 Module 10 Topic 3 Linkage in a simple genetic cross Linkage In the early 1900 s Bateson and Punnet conducted genetic studies
More informationNatural Selection. Population Dynamics. The Origins of Genetic Variation. The Origins of Genetic Variation. Intergenerational Mutation Rate
Natural Selection Population Dynamics Humans, Sickle-cell Disease, and Malaria How does a population of humans become resistant to malaria? Overproduction Environmental pressure/competition Pre-existing
More informationProportional Variance Explained by QLT and Statistical Power. Proportional Variance Explained by QTL and Statistical Power
Proportional Variance Explained by QTL and Statistical Power Partitioning the Genetic Variance We previously focused on obtaining variance components of a quantitative trait to determine the proportion
More informationMODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES
MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES Saurabh Ghosh Human Genetics Unit Indian Statistical Institute, Kolkata Most common diseases are caused by
More informationResemblance between relatives
Resemblance between relatives 1 Key concepts Model phenotypes by fixed effects and random effects including genetic value (additive, dominance, epistatic) Model covariance of genetic effects by relationship
More informationResultants in genetic linkage analysis
Journal of Symbolic Computation 41 (2006) 125 137 www.elsevier.com/locate/jsc Resultants in genetic linkage analysis Ingileif B. Hallgrímsdóttir a,, Bernd Sturmfels b a Department of Statistics, University
More informationPopulation Genetics. with implications for Linkage Disequilibrium. Chiara Sabatti, Human Genetics 6357a Gonda
1 Population Genetics with implications for Linkage Disequilibrium Chiara Sabatti, Human Genetics 6357a Gonda csabatti@mednet.ucla.edu 2 Hardy-Weinberg Hypotheses: infinite populations; no inbreeding;
More informationPowerful Regression-Based Quantitative-Trait Linkage Analysis of General Pedigrees
Am. J. Hum. Genet. 71:38 53, 00 Powerful Regression-Based Quantitative-Trait Linkage Analysis of General Pedigrees Pak C. Sham, 1 Shaun Purcell, 1 Stacey S. Cherny, 1, and Gonçalo R. Abecasis 3 1 Institute
More informationGBLUP and G matrices 1
GBLUP and G matrices 1 GBLUP from SNP-BLUP We have defined breeding values as sum of SNP effects:! = #$ To refer breeding values to an average value of 0, we adopt the centered coding for genotypes described
More informationNotes on Population Genetics
Notes on Population Genetics Graham Coop 1 1 Department of Evolution and Ecology & Center for Population Biology, University of California, Davis. To whom correspondence should be addressed: gmcoop@ucdavis.edu
More informationTOPICS IN STATISTICAL METHODS FOR HUMAN GENE MAPPING
TOPICS IN STATISTICAL METHODS FOR HUMAN GENE MAPPING by Chia-Ling Kuo MS, Biostatstics, National Taiwan University, Taipei, Taiwan, 003 BBA, Statistics, National Chengchi University, Taipei, Taiwan, 001
More informationQTL mapping under ascertainment
QTL mapping under ascertainment J. PENG Department of Statistics, University of California, Davis, CA 95616 D. SIEGMUND Department of Statistics, Stanford University, Stanford, CA 94305 February 15, 2006
More informationgenome a specific characteristic that varies from one individual to another gene the passing of traits from one generation to the next
genetics the study of heredity heredity sequence of DNA that codes for a protein and thus determines a trait genome a specific characteristic that varies from one individual to another gene trait the passing
More informationOptimal Allele-Sharing Statistics for Genetic Mapping Using Affected Relatives
Genetic Epidemiology 16:225 249 (1999) Optimal Allele-Sharing Statistics for Genetic Mapping Using Affected Relatives Mary Sara McPeek* Department of Statistics, University of Chicago, Chicago, Illinois
More informationBayesian Inference of Interactions and Associations
Bayesian Inference of Interactions and Associations Jun Liu Department of Statistics Harvard University http://www.fas.harvard.edu/~junliu Based on collaborations with Yu Zhang, Jing Zhang, Yuan Yuan,
More informationComputational Approaches to Statistical Genetics
Computational Approaches to Statistical Genetics GWAS I: Concepts and Probability Theory Christoph Lippert Dr. Oliver Stegle Prof. Dr. Karsten Borgwardt Max-Planck-Institutes Tübingen, Germany Tübingen
More informationSOLUTIONS TO EXERCISES FOR CHAPTER 9
SOLUTIONS TO EXERCISES FOR CHPTER 9 gronomy 65 Statistical Genetics W. E. Nyquist March 00 Exercise 9.. a. lgebraic method for the grandparent-grandoffspring covariance (see parent-offspring covariance,
More informationMIXED MODELS THE GENERAL MIXED MODEL
MIXED MODELS This chapter introduces best linear unbiased prediction (BLUP), a general method for predicting random effects, while Chapter 27 is concerned with the estimation of variances by restricted
More informationLinear Regression (1/1/17)
STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression
More informationMathematical models in population genetics II
Mathematical models in population genetics II Anand Bhaskar Evolutionary Biology and Theory of Computing Bootcamp January 1, 014 Quick recap Large discrete-time randomly mating Wright-Fisher population
More informationExpression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia
Expression QTLs and Mapping of Complex Trait Loci Paul Schliekelman Statistics Department University of Georgia Definitions: Genes, Loci and Alleles A gene codes for a protein. Proteins due everything.
More informationHomework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:
Homework Assignment, Evolutionary Systems Biology, Spring 2009. Homework Part I: Phylogenetics: Introduction. The objective of this assignment is to understand the basics of phylogenetic relationships
More informationUnivariate Linkage in Mx. Boulder, TC 18, March 2005 Posthuma, Maes, Neale
Univariate Linkage in Mx Boulder, TC 18, March 2005 Posthuma, Maes, Neale VC analysis of Linkage Incorporating IBD Coefficients Covariance might differ according to sharing at a particular locus. Sharing
More informationIdentification of Pedigree Relationship from Genome Sharing
INVESTIGATION Identification of Pedigree Relationship from Genome Sharing William G. Hill 1 and Ian M. S. White Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh,
More informationSolutions to Problem Set 4
Question 1 Solutions to 7.014 Problem Set 4 Because you have not read much scientific literature, you decide to study the genetics of garden peas. You have two pure breeding pea strains. One that is tall
More informationThe E-M Algorithm in Genetics. Biostatistics 666 Lecture 8
The E-M Algorithm in Genetics Biostatistics 666 Lecture 8 Maximum Likelihood Estimation of Allele Frequencies Find parameter estimates which make observed data most likely General approach, as long as
More informationMEIOSIS, THE BASIS OF SEXUAL REPRODUCTION
MEIOSIS, THE BASIS OF SEXUAL REPRODUCTION Why do kids look different from the parents? How are they similar to their parents? Why aren t brothers or sisters more alike? Meiosis A process where the number
More informationModeling IBD for Pairs of Relatives. Biostatistics 666 Lecture 17
Modeling IBD for Pairs of Relatives Biostatistics 666 Lecture 7 Previously Linkage Analysis of Relative Pairs IBS Methods Compare observed and expected sharing IBD Methods Account for frequency of shared
More informationLecture 5: BLUP (Best Linear Unbiased Predictors) of genetic values. Bruce Walsh lecture notes Tucson Winter Institute 9-11 Jan 2013
Lecture 5: BLUP (Best Linear Unbiased Predictors) of genetic values Bruce Walsh lecture notes Tucson Winter Institute 9-11 Jan 013 1 Estimation of Var(A) and Breeding Values in General Pedigrees The classic
More informationCSci 8980: Advanced Topics in Graphical Models Analysis of Genetic Variation
CSci 8980: Advanced Topics in Graphical Models Analysis of Genetic Variation Instructor: Arindam Banerjee November 26, 2007 Genetic Polymorphism Single nucleotide polymorphism (SNP) Genetic Polymorphism
More informationThe Admixture Model in Linkage Analysis
The Admixture Model in Linkage Analysis Jie Peng D. Siegmund Department of Statistics, Stanford University, Stanford, CA 94305 SUMMARY We study an appropriate version of the score statistic to test the
More informationQuantitative Genetics
Bruce Walsh, University of Arizona, Tucson, Arizona, USA Almost any trait that can be defined shows variation, both within and between populations. Quantitative genetics is concerned with the analysis
More informationEvaluating the Performance of a Block Updating McMC Sampler in a Simple Genetic Application
Evaluating the Performance of a Block Updating McMC Sampler in a Simple Genetic Application N.A. SHEEHAN B. GULDBRANDTSEN 1 D.A. SORENSEN 1 1 Markov chain Monte Carlo (McMC) methods have provided an enormous
More informationMethods for Cryptic Structure. Methods for Cryptic Structure
Case-Control Association Testing Review Consider testing for association between a disease and a genetic marker Idea is to look for an association by comparing allele/genotype frequencies between the cases
More informationTHEORETICAL ASPECTS OF PEDIGREE ANALYSIS
THEORETICAL ASPECTS OF PEDIGREE ANALYSIS E. Ginsburg, I. Malkin, R.C. Elston THEORETICAL ASPECTS OF PEDIGREE ANALYSIS RAMOT PUBLISHING HOUSE TEL-AVIV UNIVERSITY, ISRAEL E. Ginsburg, I. Malkin, R.C. Elston
More informationIntroduction to QTL mapping in model organisms
Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA
More informationLife Cycles, Meiosis and Genetic Variability24/02/2015 2:26 PM
Life Cycles, Meiosis and Genetic Variability iclicker: 1. A chromosome just before mitosis contains two double stranded DNA molecules. 2. This replicated chromosome contains DNA from only one of your parents
More informationUse of hidden Markov models for QTL mapping
Use of hidden Markov models for QTL mapping Karl W Broman Department of Biostatistics, Johns Hopkins University December 5, 2006 An important aspect of the QTL mapping problem is the treatment of missing
More informationLander-Green Algorithm. Biostatistics 666
Lander-Green Algorith Biostatistics 666 Identity-by-Descent (IBD) A roerty o chroosoe stretches that descend ro the sae ancestor Two alleles are IBD i they descend ro the sae ounder chroosoe For a air
More informationLecture 11: Multiple trait models for QTL analysis
Lecture 11: Multiple trait models for QTL analysis Julius van der Werf Multiple trait mapping of QTL...99 Increased power of QTL detection...99 Testing for linked QTL vs pleiotropic QTL...100 Multiple
More informationLearning ancestral genetic processes using nonparametric Bayesian models
Learning ancestral genetic processes using nonparametric Bayesian models Kyung-Ah Sohn October 31, 2011 Committee Members: Eric P. Xing, Chair Zoubin Ghahramani Russell Schwartz Kathryn Roeder Matthew
More information1. Understand the methods for analyzing population structure in genomes
MSCBIO 2070/02-710: Computational Genomics, Spring 2016 HW3: Population Genetics Due: 24:00 EST, April 4, 2016 by autolab Your goals in this assignment are to 1. Understand the methods for analyzing population
More informationSNP Association Studies with Case-Parent Trios
SNP Association Studies with Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health September 3, 2009 Population-based Association Studies Balding (2006). Nature
More informationLesson 4: Understanding Genetics
Lesson 4: Understanding Genetics 1 Terms Alleles Chromosome Co dominance Crossover Deoxyribonucleic acid DNA Dominant Genetic code Genome Genotype Heredity Heritability Heritability estimate Heterozygous
More informationA mixed model based QTL / AM analysis of interactions (G by G, G by E, G by treatment) for plant breeding
Professur Pflanzenzüchtung Professur Pflanzenzüchtung A mixed model based QTL / AM analysis of interactions (G by G, G by E, G by treatment) for plant breeding Jens Léon 4. November 2014, Oulu Workshop
More informationObjectives. Announcements. Comparison of mitosis and meiosis
Announcements Colloquium sessions for which you can get credit posted on web site: Feb 20, 27 Mar 6, 13, 20 Apr 17, 24 May 15. Review study CD that came with text for lab this week (especially mitosis
More informationGenotype Imputation. Class Discussion for January 19, 2016
Genotype Imputation Class Discussion for January 19, 2016 Intuition Patterns of genetic variation in one individual guide our interpretation of the genomes of other individuals Imputation uses previously
More informationMajor Genes, Polygenes, and
Major Genes, Polygenes, and QTLs Major genes --- genes that have a significant effect on the phenotype Polygenes --- a general term of the genes of small effect that influence a trait QTL, quantitative
More informationEXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series:
Statistical Genetics Agronomy 65 W. E. Nyquist March 004 EXERCISES FOR CHAPTER 7 Exercise 7.. Derive the two scales of relation for each of the two following recurrent series: u: 0, 8, 6, 48, 46,L 36 7
More informationOutline. P o purple % x white & white % x purple& F 1 all purple all purple. F purple, 224 white 781 purple, 263 white
Outline - segregation of alleles in single trait crosses - independent assortment of alleles - using probability to predict outcomes - statistical analysis of hypotheses - conditional probability in multi-generation
More informationQ Expected Coverage Achievement Merit Excellence. Punnett square completed with correct gametes and F2.
NCEA Level 2 Biology (91157) 2018 page 1 of 6 Assessment Schedule 2018 Biology: Demonstrate understanding of genetic variation and change (91157) Evidence Q Expected Coverage Achievement Merit Excellence
More informationMethods for QTL analysis
Methods for QTL analysis Julius van der Werf METHODS FOR QTL ANALYSIS... 44 SINGLE VERSUS MULTIPLE MARKERS... 45 DETERMINING ASSOCIATIONS BETWEEN GENETIC MARKERS AND QTL WITH TWO MARKERS... 45 INTERVAL
More informationSolutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin
Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin CHAPTER 1 1.2 The expected homozygosity, given allele
More informationFriday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo
Friday Harbor 2017 From Genetics to GWAS (Genome-wide Association Study) Sept 7 2017 David Fardo Purpose: prepare for tomorrow s tutorial Genetic Variants Quality Control Imputation Association Visualization
More informationCase-Control Association Testing with Related Individuals: A More Powerful Quasi-Likelihood Score Test
Case-Control Association esting with Related Individuals: A More Powerful Quasi-Likelihood Score est imothy hornton and Mary Sara McPeek ARICLE We consider the problem of genomewide association testing
More informationGenetics (patterns of inheritance)
MENDELIAN GENETICS branch of biology that studies how genetic characteristics are inherited MENDELIAN GENETICS Gregory Mendel, an Augustinian monk (1822-1884), was the first who systematically studied
More informationA Robust Identity-by-Descent Procedure Using Affected Sib Pairs: Multipoint Mapping for Complex Diseases
Original Paper Hum Hered 001;51:64 78 Received: May 1, 1999 Revision received: September 10, 1999 Accepted: October 6, 1999 A Robust Identity-by-Descent Procedure Using Affected Sib Pairs: Multipoint Mapping
More informationIntroduction to QTL mapping in model organisms
Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA
More informationNormal approximation to Binomial
Normal approximation to Binomial 24.10.2007 GE02: day 3 part 3 Yurii Aulchenko Erasmus MC Rotterdam Binomial distribution at different n and p 0.3 N=10 0.2 N=25 0.1 N=100 P=0.5 0.2 0.1 0.1 0 0 0 k k k
More informationEM algorithm. Rather than jumping into the details of the particular EM algorithm, we ll look at a simpler example to get the idea of how it works
EM algorithm The example in the book for doing the EM algorithm is rather difficult, and was not available in software at the time that the authors wrote the book, but they implemented a SAS macro to implement
More informationCovariance between relatives
UNIVERSIDADE DE SÃO PAULO ESCOLA SUPERIOR DE AGRICULTURA LUIZ DE QUEIROZ DEPARTAMENTO DE GENÉTICA LGN5825 Genética e Melhoramento de Espécies Alógamas Covariance between relatives Prof. Roberto Fritsche-Neto
More informationMajor questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.
Evolutionary Genetics (for Encyclopedia of Biodiversity) Sergey Gavrilets Departments of Ecology and Evolutionary Biology and Mathematics, University of Tennessee, Knoxville, TN 37996-6 USA Evolutionary
More informationConstructing a Pedigree
Constructing a Pedigree Use the appropriate symbols: Unaffected Male Unaffected Female Affected Male Affected Female Male carrier of trait Mating of Offspring 2. Label each generation down the left hand
More informationPhasing via the Expectation Maximization (EM) Algorithm
Computing Haplotype Frequencies and Haplotype Phasing via the Expectation Maximization (EM) Algorithm Department of Computer Science Brown University, Providence sorin@cs.brown.edu September 14, 2010 Outline
More informationComputation of Multilocus Prior Probability of Autozygosity for Complex Inbred Pedigrees
Genetic Epidemiology 14:1 15 (1997) Computation of Multilocus Prior Probability of Autozygosity for Complex Inbred Pedigrees Sun-Wei Guo* Department of Biostatistics, University of Michigan, Ann Arbor
More informationMapping quantitative trait loci in oligogenic models
Biostatistics (2001), 2, 2,pp. 147 162 Printed in Great Britain Mapping quantitative trait loci in oligogenic models HSIU-KHUERN TANG, D. SIEGMUND Department of Statistics, 390 Serra Mall, Sequoia Hall,
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2016 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Raied Aljadaany, Shi Zong, Chenchen Zhu Disclaimer: A large
More information