Calculation of IBD probabilities


1 Calculation of IBD probabilities David Evans and Stacey Cherny University of Oxford Wellcome Trust Centre for Human Genetics

2 This Session: IBD vs IBS. Why is IBD important? Calculating IBD probabilities: Lander-Green Algorithm (MERLIN), single locus probabilities, Hidden Markov Model. Other ways of calculating IBD status: Elston-Stewart Algorithm, MCMC approaches. MERLIN practical example: IBD determination, information content mapping, SNPs vs micro-satellite markers?

3 Aim of Gene Mapping Experiments: Identify variants that control interesting traits (susceptibility to human disease, phenotypic variation in the population). The hypothesis: individuals sharing these variants will be more similar for traits they control. The difficulty: testing ~10 million variants is impractical.

4 Identity-by-Descent (IBD): Two alleles are IBD if they are descended from the same ancestral allele. If a stretch of chromosome is IBD among a set of individuals, ALL variants within that stretch will also be shared IBD (markers, QTLs, disease genes). This allows surveys of large amounts of variation even when only a few polymorphisms are measured.

5 A Segregating Disease Allele [Pedigree figure: individuals labelled +/+ or +/mut] All affected individuals are IBD for the disease-causing mutation.

6 Segregating Chromosomes [Figure labels: marker, disease locus] Affected individuals tend to share adjacent areas of chromosome IBD.

7 Marker Shared Among Affecteds [Pedigree figure with marker genotypes 1/2, 3/4, 3/4, 1/3, 2/4, 1/4, 4/4, 4/4, 1/4] The 4 allele segregates with disease.

8 Why is IBD sharing important? IBD sharing forms the basis of nonparametric linkage statistics. Affected relatives tend to share marker alleles close to the disease locus IBD more often than chance. [Pedigree figure with marker genotypes 1/2, 3/4, 3/4, 1/3, 2/4, 1/4, 4/4, 4/4, 1/4]

9 Linkage between QTL and marker [Figure: QTL and marker, with sib pairs grouped by IBD 0, IBD 1 and IBD 2]

10 NO Linkage between QTL and marker Marker

11 IBD vs IBS Identical by Descent and Identical by State Identical by state only

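12 Example: IBD in Siblings. Consider a mating between mother AB x father CD. Number of alleles shared IBD for each combination of offspring genotypes:

          Sib 2
Sib 1     AC   AD   BC   BD
AC         2    1    1    0
AD         1    2    0    1
BC         1    0    2    1
BD         0    1    1    2

IBD 0 : 1 : 2 = 25% : 50% : 25%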
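The 25% : 50% : 25% split can be checked by brute force. The following is a minimal sketch, assuming each parent transmits either allele with probability 1/2 and that the two sibs are independent draws; the function name is illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sib_pair_ibd_distribution():
    """Enumerate all offspring pairs from the mating AB x CD and count IBD sharing."""
    maternal, paternal = "AB", "CD"
    # Each child receives one maternal and one paternal allele: AC, AD, BC, BD.
    offspring = list(product(maternal, paternal))
    counts = Counter()
    for sib1, sib2 in product(offspring, repeat=2):
        # All four parental alleles are distinct, so identical labels mean IBD.
        shared = int(sib1[0] == sib2[0]) + int(sib1[1] == sib2[1])
        counts[shared] += 1
    total = sum(counts.values())
    return {k: Fraction(counts[k], total) for k in (0, 1, 2)}


print(sib_pair_ibd_distribution())
```

Because the four parental alleles are all different here, IBS and IBD coincide; with repeated alleles the same enumeration would only give IBS counts.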

13 IBD can be trivial [Pedigree figure; marker genotypes not recoverable from the transcription] IBD = 0

14 Two Other Simple Cases [Two pedigree figures; marker genotypes not recoverable from the transcription] IBD = 2

15 A little more complicated [Pedigree figure; marker genotypes not recoverable from the transcription] IBD = 1 (50% chance), IBD = 2 (50% chance)

16 And even more complicated [Figure: sib pair, both with genotype 1/1] IBD = ?

17 Bayes Theorem

$$P(A_i \mid B) = \frac{P(A_i, B)}{P(B)} = \frac{P(A_i)\,P(B \mid A_i)}{P(B)} = \frac{P(A_i)\,P(B \mid A_i)}{\sum_j P(A_j)\,P(B \mid A_j)}$$

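18 Bayes Theorem for IBD Probabilities

$$P(\mathrm{IBD}=i \mid G) = \frac{P(\mathrm{IBD}=i, G)}{P(G)} = \frac{P(\mathrm{IBD}=i)\,P(G \mid \mathrm{IBD}=i)}{P(G)} = \frac{P(\mathrm{IBD}=i)\,P(G \mid \mathrm{IBD}=i)}{\sum_j P(\mathrm{IBD}=j)\,P(G \mid \mathrm{IBD}=j)}$$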

19 P(Marker Genotype | IBD State)

Sib 1    Sib 2    k = 0             k = 1          k = 2
A1A1     A1A1     p_1^4             p_1^3          p_1^2
A1A1     A1A2     2 p_1^3 p_2       p_1^2 p_2      0
A1A1     A2A2     p_1^2 p_2^2       0              0
A1A2     A1A1     2 p_1^3 p_2       p_1^2 p_2      0
A1A2     A1A2     4 p_1^2 p_2^2     p_1 p_2        2 p_1 p_2
A1A2     A2A2     2 p_1 p_2^3       p_1 p_2^2      0
A2A2     A1A1     p_1^2 p_2^2       0              0
A2A2     A1A2     2 p_1 p_2^3       p_1 p_2^2      0
A2A2     A2A2     p_2^4             p_2^3          p_2^2

P(observing genotypes | k alleles IBD)
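A quick sanity check on a table like this is that each IBD column is a probability distribution over the nine ordered genotype pairs. The sketch below encodes the standard sib-pair table for a two-allele marker, assuming p_1 + p_2 = 1 and Hardy-Weinberg proportions; the function names are illustrative.

```python
from fractions import Fraction


def sib_genotype_probs(p1):
    """P(genotype pair | k alleles IBD) for k = 0, 1, 2 at a two-allele marker."""
    p2 = 1 - p1  # assumption: only two alleles, so frequencies sum to one
    return {
        ("A1A1", "A1A1"): (p1**4, p1**3, p1**2),
        ("A1A1", "A1A2"): (2 * p1**3 * p2, p1**2 * p2, 0),
        ("A1A1", "A2A2"): (p1**2 * p2**2, 0, 0),
        ("A1A2", "A1A1"): (2 * p1**3 * p2, p1**2 * p2, 0),
        ("A1A2", "A1A2"): (4 * p1**2 * p2**2, p1 * p2, 2 * p1 * p2),
        ("A1A2", "A2A2"): (2 * p1 * p2**3, p1 * p2**2, 0),
        ("A2A2", "A1A1"): (p1**2 * p2**2, 0, 0),
        ("A2A2", "A1A2"): (2 * p1 * p2**3, p1 * p2**2, 0),
        ("A2A2", "A2A2"): (p2**4, p2**3, p2**2),
    }


def column_sums(p1):
    """Sum each IBD column over all nine ordered genotype pairs."""
    table = sib_genotype_probs(p1)
    return tuple(sum(row[k] for row in table.values()) for k in range(3))


# Exact arithmetic avoids rounding noise in the check.
print(column_sums(Fraction(3, 10)))
```

Each of the three sums should equal one for any allele frequency between 0 and 1.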

20 Worked Example: p_1 = 0.5, both sibs have genotype 1/1.

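21 Worked Example (both sibs 1/1, p_1 = 0.5)

$$P(G \mid \mathrm{IBD}=0) = p_1^4 = \tfrac{1}{16}, \qquad P(G \mid \mathrm{IBD}=1) = p_1^3 = \tfrac{1}{8}, \qquad P(G \mid \mathrm{IBD}=2) = p_1^2 = \tfrac{1}{4}$$

$$P(G) = \tfrac{1}{4} p_1^4 + \tfrac{1}{2} p_1^3 + \tfrac{1}{4} p_1^2 = \tfrac{9}{64}$$

$$P(\mathrm{IBD}=0 \mid G) = \frac{\tfrac{1}{4} p_1^4}{P(G)} = \tfrac{1}{9}, \qquad P(\mathrm{IBD}=1 \mid G) = \frac{\tfrac{1}{2} p_1^3}{P(G)} = \tfrac{4}{9}, \qquad P(\mathrm{IBD}=2 \mid G) = \frac{\tfrac{1}{4} p_1^2}{P(G)} = \tfrac{4}{9}$$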
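The worked example is a direct application of Bayes theorem with the sib-pair prior 1/4 : 1/2 : 1/4. A minimal sketch, with illustrative names and exact fractions so that the result can be compared with the slide values:

```python
from fractions import Fraction

# Prior IBD probabilities for a full sib pair.
PRIOR = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def likelihoods_both_a1a1(p1):
    """P(G | IBD = k) when both sibs have genotype A1A1."""
    return {0: p1**4, 1: p1**3, 2: p1**2}


def ibd_posterior(likelihoods, prior=PRIOR):
    """Return (posterior over IBD states, marginal probability of the genotypes)."""
    joint = {k: prior[k] * likelihoods[k] for k in prior}
    p_g = sum(joint.values())
    return {k: joint[k] / p_g for k in joint}, p_g


posterior, p_g = ibd_posterior(likelihoods_both_a1a1(Fraction(1, 2)))
print(p_g)        # 9/64
print(posterior)  # IBD 0: 1/9, IBD 1: 4/9, IBD 2: 4/9
```

Sharing a common allele shifts little weight between IBD 1 and IBD 2 here; a rarer allele (smaller p_1) would push the posterior further toward IBD 2.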

22 For ANY PEDIGREE the inheritance pattern at every point in the genome can be completely described by a binary inheritance vector: $v(x) = (p_1, m_1, p_2, m_2, \ldots, p_n, m_n)$, whose coordinates describe the outcome of the 2n paternal and maternal meioses giving rise to the n non-founders in the pedigree. $p_i$ ($m_i$) is 0 if the grandpaternal allele is transmitted; $p_i$ ($m_i$) is 1 if the grandmaternal allele is transmitted. [Pedigree figure: parents a/b and c/d, children a/c and b/d] $v(x) = [0, 0, 1, 1]$
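For a fully informative nuclear family the inheritance vector can be read off directly. The sketch below reproduces the pictured example under two assumptions that are not stated on the slide: the a/b parent is the father, and each parental genotype is written as (grandpaternal allele, grandmaternal allele).

```python
def inheritance_vector(father, mother, children):
    """Build (p_1, m_1, ..., p_n, m_n) for phase-known, fully informative data.

    father, mother: (grandpaternal_allele, grandmaternal_allele), alleles distinct.
    children: list of (paternal_allele, maternal_allele) pairs.
    """
    vector = []
    for paternal_allele, maternal_allele in children:
        # Index 0 means the grandpaternal allele was transmitted, 1 the grandmaternal.
        vector.append(father.index(paternal_allele))
        vector.append(mother.index(maternal_allele))
    return vector


print(inheritance_vector(("a", "b"), ("c", "d"), [("a", "c"), ("b", "d")]))
```

With real data the parental phase is usually unknown, which is why the following slides work with a distribution over vectors rather than a single vector.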

23 Inheritance Vector: In practice, it is not possible to determine the true inheritance vector at every point in the genome; rather, we represent partial information as a probability distribution over the $2^{2n}$ possible inheritance vectors. [Pedigree figure and table listing the 16 inheritance vectors ($p_1 m_1 p_2 m_2$) for a sib pair: each has prior 1/16 and posterior either 1/8 or 0; the individual rows are not recoverable from the transcription]

24 Computer Representation: Define inheritance vector $v_l$ at each marker location l. Each inheritance vector is indexed by a different memory location, which stores the likelihood for that gene flow pattern, conditional on observed genotypes at location l. $2^{2n}$ elements!!! [Figure: array of likelihood values L]

25 [Figure: three representations of the likelihood vector: (a) bit-indexed array, (b) packed tree, (c) sparse tree. Legend: node with zero likelihood; node identical to sibling; likelihood for this branch] Abecasis et al (2002) Nat Genet 30:97-101

26 Multipoint IBD: IBD status often cannot be determined with certainty, e.g. because the mating is not informative or parental information is not available. IBD information at uninformative loci can be made more precise by examining nearby linked loci.

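27 Multipoint IBD [Pedigree figure with two linked markers. First marker: parents a/b and c/d, sibs a/c and b/d, so IBD = 0. Second marker (genotypes not fully recoverable from the transcription): IBD = 0 or IBD = 1?]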

28 Complexity of the Problem in Larger Pedigrees: 2n meioses in a pedigree with n nonfounders. Each meiosis has 2 possible outcomes. Therefore $2^{2n}$ possibilities for each locus. For each genetic locus: one location for each of m genetic markers; distinct, non-independent meiotic outcomes. Up to $4^{nm}$ distinct outcomes!!!

29 Example: Sib-pair Genotyped at 10 Markers [Figure: grid of inheritance vectors 0000 to 1111 against markers 1 to m = 10] $(2^{2n})^m = (2^{2 \times 2})^{10} \approx 10^{12}$ possible paths!!!
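The counts on these two slides are plain arithmetic and can be reproduced in a few lines. This is only a sketch with illustrative function names:

```python
def n_inheritance_vectors(n_nonfounders):
    """Number of inheritance vectors at one locus: 2 outcomes for each of 2n meioses."""
    return 2 ** (2 * n_nonfounders)


def n_paths(n_nonfounders, n_markers):
    """Number of distinct sequences of inheritance vectors across all markers."""
    return n_inheritance_vectors(n_nonfounders) ** n_markers


print(n_inheritance_vectors(2))  # sib pair: 16 vectors per locus
print(n_paths(2, 10))            # sib pair at 10 markers: 1099511627776, about 10**12
```

Note that `n_paths(n, m)` equals `4 ** (n * m)`, the upper bound quoted for larger pedigrees.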

30 Lander-Green Algorithm: The inheritance vector at a locus is conditionally independent of the inheritance vectors at all preceding loci, given the inheritance vector at the immediately preceding locus (Hidden Markov chain). The conditional probability of an inheritance vector $v_{i+1}$ at locus i+1, given the inheritance vector $v_i$ at locus i, is $\theta_i^{j} (1-\theta_i)^{2n-j}$, where $\theta_i$ is the recombination fraction and j is the number of changes in elements of the inheritance vector (transition probabilities). Example: Locus 1 [0000], Locus 2 [0001]; conditional probability $(1-\theta)^3 \theta$.
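The transition probability depends only on how many meioses change state between the two loci. A minimal sketch, assuming the same recombination fraction applies to every meiosis in the interval (no sex-specific maps); the function name is illustrative:

```python
from itertools import product


def transition_probability(v_from, v_to, theta):
    """theta**j * (1 - theta)**(2n - j), where j counts changed vector elements."""
    if len(v_from) != len(v_to):
        raise ValueError("inheritance vectors must have the same length")
    j = sum(a != b for a, b in zip(v_from, v_to))
    return theta**j * (1 - theta) ** (len(v_from) - j)


# Slide example: one changed element out of four.
print(transition_probability((0, 0, 0, 0), (0, 0, 0, 1), 0.1))

# Each row of the transition matrix sums to one.
row_total = sum(
    transition_probability((0, 0, 0, 0), v, 0.1) for v in product((0, 1), repeat=4)
)
print(row_total)
```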

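31 Total Likelihood

$$L = \mathbf{1}^{\top} Q_1 T_1 Q_2 T_2 \cdots T_{m-1} Q_m \mathbf{1}$$

$$Q_i = \operatorname{diag}\big(P[0000],\, P[0001],\, \ldots,\, P[1111]\big)$$

$Q_i$ is the $2^{2n} \times 2^{2n}$ diagonal matrix of single locus probabilities at locus i.

$$T_i = \begin{pmatrix} (1-\theta)^4 & (1-\theta)^3\theta & \cdots & \theta^4 \\ (1-\theta)^3\theta & (1-\theta)^4 & \cdots & (1-\theta)\theta^3 \\ \vdots & \vdots & \ddots & \vdots \\ \theta^4 & (1-\theta)\theta^3 & \cdots & (1-\theta)^4 \end{pmatrix}$$

$T_i$ is the $2^{2n} \times 2^{2n}$ matrix of transition probabilities between locus i and locus i+1.

~$10 \times (2^{2 \times 2})^2$ operations: 2560 for this case!!!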
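The matrix product can be evaluated left to right as a forward recursion, so the cost grows linearly in the number of markers instead of summing over every path. The following pure-Python sketch is not MERLIN's implementation: it uses dense matrices, and it assumes a uniform prior over inheritance vectors and that `q[i][s]` already holds the single locus probability of the genotypes at marker i given vector s.

```python
from itertools import product


def hamming(u, v):
    """Number of positions at which two inheritance vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def transition_matrix(vectors, theta):
    """Dense matrix with entries theta**j * (1 - theta)**(bits - j)."""
    bits = len(vectors[0])
    return [
        [theta ** hamming(u, v) * (1 - theta) ** (bits - hamming(u, v)) for v in vectors]
        for u in vectors
    ]


def total_likelihood(q, thetas, n_bits):
    """Forward recursion over markers.

    q[i][s]: P(genotypes at marker i | inheritance vector s).
    thetas[i]: recombination fraction between marker i and marker i + 1.
    """
    vectors = list(product((0, 1), repeat=n_bits))
    size = len(vectors)
    # Assumption: every inheritance vector is equally likely a priori.
    forward = [q[0][s] / size for s in range(size)]
    for i, theta in enumerate(thetas):
        t = transition_matrix(vectors, theta)
        forward = [
            q[i + 1][s] * sum(forward[r] * t[r][s] for r in range(size))
            for s in range(size)
        ]
    return sum(forward)


# With uninformative markers (all single locus probabilities equal to one)
# the likelihood is one, whatever the map.
print(total_likelihood([[1.0] * 16] * 10, [0.05] * 9, 4))
```

Each marker interval costs one dense matrix-by-vector product of size $(2^{2n})^2$, which is where the figure of roughly 10 x 256 = 2560 operations for a sib pair at 10 markers comes from.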

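32 P(IBD = 2) at Marker Three [Figure: grid of inheritance vectors 0000 to 1111 against markers 1 to m = 10]

$$P(\mathrm{IBD}=2 \text{ at marker 3}) = \frac{L[\mathrm{IBD}=2 \text{ at marker 3}]}{L[\mathrm{ALL}]} = \frac{L[0000] + L[0101] + L[1010] + L[1111]}{L[\mathrm{ALL}]}$$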

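33 P(IBD = 2) at arbitrary position on the chromosome [Figure: grid of inheritance vectors 0000 to 1111 against markers 1 to m = 10, with an extra position between markers]

$$P(\mathrm{IBD}=2) = \frac{L[0000] + L[0101] + L[1010] + L[1111]}{L[\mathrm{ALL}]}$$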
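Given a likelihood for each inheritance vector at a position, the IBD probabilities follow by grouping vectors by the sharing state they imply. A minimal sketch for a sib pair, assuming vectors are ordered (p_1, m_1, p_2, m_2) and that the per-vector likelihoods have already been computed elsewhere:

```python
from itertools import product


def ibd_probabilities(vector_likelihood):
    """Collapse per-vector likelihoods into [P(IBD=0), P(IBD=1), P(IBD=2)]."""
    totals = [0.0, 0.0, 0.0]
    for (p1, m1, p2, m2), likelihood in vector_likelihood.items():
        # Sibs share the paternal allele IBD when p1 == p2, the maternal when m1 == m2.
        shared = int(p1 == p2) + int(m1 == m2)
        totals[shared] += likelihood
    l_all = sum(totals)
    return [t / l_all for t in totals]


# With no genotype information every vector is equally likely.
uniform = {v: 1.0 for v in product((0, 1), repeat=4)}
print(ibd_probabilities(uniform))  # [0.25, 0.5, 0.25]
```

The four vectors that contribute to IBD = 2 are exactly 0000, 0101, 1010 and 1111, matching the numerator on the slide.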

34 Further speedups: Trees summarize redundant information (portions of inheritance vector that are repeated; portions of inheritance vector that are constant or zero). Use sparse-matrix by vector multiplication (regularities in transition matrices). Use symmetries in divide and conquer algorithm (Idury & Elston, 1997).

35 Lander-Green Algorithm Summary: Factorize likelihood by marker. Complexity $\propto m \cdot e^{n}$. Large number of markers (e.g. dense SNP data). Relatively small pedigrees. MERLIN, GENEHUNTER, ALLEGRO etc.

36 Elston-Stewart Algorithm: Factorize likelihood by individual. Complexity $\propto n \cdot e^{m}$. Small number of markers. Large pedigrees with little inbreeding. VITESSE etc.

37 Other methods: A number of MCMC methods have been proposed. ~Linear in # markers, ~linear in # people. Hard to guarantee convergence on very large datasets: many widely separated local minima. E.g. SIMWALK, LOKI.

38 MERLIN: Multipoint Engine for Rapid Likelihood Inference

39 Capabilities: Linkage analysis (NPL and K&C LOD; variance components). Haplotypes (most likely; sampling; all). IBD and info content. Error detection (most SNP typing errors are Mendelian consistent). Recombination (no. of recombinants per family per interval can be controlled). Simulation.

40 MERLIN Website: Reference, FAQ, Source, Binaries. Tutorial: linkage, haplotyping, simulation, error detection, IBD calculation.

41 Test Case Pedigrees

42 Timings: Marker Locations (values as transcribed; some digits were lost in extraction)

Top Generation Genotyped
              A (x1000)   B       C          D
Genehunter    38s         37s     8m6s       *
Allegro       8s          2m7s    3h54m3s    *
Merlin        s           8s      3m55s      *

Top Generation Not Genotyped
              A (x1000)   B       C          D
Genehunter    45s         m54s    *          *
Allegro       8s          m08s    h2m38s     *
Merlin        3s          25s     5m50s      *

43 Intuition: Approximate Sparse T. With dense maps of closely spaced markers, the recombination fractions θ are small, so it is reasonable to replace $\theta^k$ with zero. This produces a very sparse transition matrix: consider only elements of v separated by <k recombination events at consecutive locations.
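The effect of dropping terms of order $\theta^k$ can be illustrated by zeroing every transition that needs k or more recombinants and counting what is left. This is a sketch of the idea only; it does not renormalise the rows and is not a description of how MERLIN implements the approximation.

```python
from math import comb


def truncated_transition(v_from, v_to, theta, k):
    """Transition probability with every term of order theta**k or higher set to zero."""
    j = sum(a != b for a, b in zip(v_from, v_to))
    if j >= k:
        return 0.0
    return theta**j * (1 - theta) ** (len(v_from) - j)


def nonzero_fraction(n_bits, k):
    """Fraction of transition matrix entries that remain nonzero per row."""
    kept = sum(comb(n_bits, j) for j in range(min(k, n_bits + 1)))
    return kept / 2**n_bits


print(nonzero_fraction(4, 2))   # sib pair, at most one recombinant: 5 of 16 entries
print(nonzero_fraction(20, 2))  # 10 nonfounders: 21 of 1048576 entries
```

The sparsity gain grows quickly with pedigree size, which is consistent with the large time and memory savings reported for the approximation.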

44 Additional Speedup (values as transcribed; some digits were lost in extraction)

                    Time     Memory
Exact               40s      100 MB
No recombination    <1s      4 MB
1 recombinant       2s       7 MB
2 recombinants      5s       54 MB
Genehunter 2.1      6min     1024 MB

Keavney et al (1998) ACE data, 10 SNPs within gene, 4-8 individuals per family

45 Input Files: Pedigree file (relationships, genotype data, phenotype data). Data file (describes contents of pedigree file). Map file (records location of genetic markers).

46 Example Pedigree File (row breaks and some digits were lost in transcription)
<contents of example.ped>
0 0 x 3 3 x x x 4 4 x x x 2 x x x 4 3 x x
<end of example.ped>
Encodes family relationships, marker and phenotype information.

47 Example Data File
<contents of example.dat>
T some_trait_of_interest
M some_marker
M another_marker
<end of example.dat>
Provides information necessary to decode pedigree file.

48 Data File Field Codes

Code   Description
M      Marker Genotype
A      Affection Status
T      Quantitative Trait
C      Covariate
Z      Zygosity
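The data file acts as a column legend for the pedigree file. The sketch below parses a data file using the field codes above and works out how many columns each pedigree row should have. It assumes the usual linkage-style layout, which the slides do not spell out: five leading columns (family, individual, father, mother, sex), two allele columns per marker, and one column for every other item. The helper names are hypothetical and are not part of MERLIN.

```python
FIELD_CODES = {
    "M": "marker genotype",
    "A": "affection status",
    "T": "quantitative trait",
    "C": "covariate",
    "Z": "zygosity",
}


def parse_data_file(text):
    """Return a list of (code, name) pairs, one per non-empty data file line."""
    items = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        code, name = parts[0].upper(), parts[1]
        if code not in FIELD_CODES:
            raise ValueError(f"unknown field code: {code}")
        items.append((code, name))
    return items


def expected_pedigree_columns(items):
    """Assumed layout: 5 fixed columns, 2 per marker, 1 per other item."""
    return 5 + sum(2 if code == "M" else 1 for code, _ in items)


example_dat = """T some_trait_of_interest
M some_marker
M another_marker
"""
items = parse_data_file(example_dat)
print(items)
print(expected_pedigree_columns(items))  # 10 under the assumed layout
```

A check like this is a cheap way to catch a data file that has drifted out of step with its pedigree file before running an analysis.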

49 Example Map File (marker names and positions were truncated in transcription)
<contents of example.map>
CHROMOSOME MARKER POSITION
2 D2S D2S
<end of example.map>
Indicates location of individual markers, necessary to derive recombination fractions between them.

50 Worked Example: p_1 = 0.5, both sibs genotyped 1/1. P(IBD = 0 | G) = 1/9, P(IBD = 1 | G) = 4/9, P(IBD = 2 | G) = 4/9. merlin -d example.dat -p example.ped -m example.map --ibd

51 Application: Information Content Mapping. Information content provides a measure of how well a marker set approaches the goal of completely determining the inheritance outcome. It is based on the concept of entropy: $E = -\sum_i P_i \log_2 P_i$, where $P_i$ is the probability of the ith outcome. $I_E(x) = 1 - E(x)/E_0$. Always lies between 0 and 1. Does not depend on test for linkage. Scales linearly with power.
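The entropy-based measure is easy to compute once a posterior over inheritance outcomes is available. In this sketch $E_0$ is taken to be the entropy of a uniform distribution over the listed outcomes, which is an assumption about the baseline; the function names are illustrative.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits, treating 0 * log(0) as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_content(posterior):
    """I_E = 1 - E / E_0, with E_0 the entropy of a uniform prior (assumption)."""
    e0 = math.log2(len(posterior))
    return 1 - entropy_bits(posterior) / e0


# No information: posterior equals the uniform prior over 16 vectors.
print(information_content([1 / 16] * 16))            # 0.0
# Data rule out half of the vectors, leaving eight at 1/8 each.
print(information_content([1 / 8] * 8 + [0.0] * 8))  # 0.25
# Inheritance fully determined.
print(information_content([1.0] + [0.0] * 15))       # 1.0
```

Halving the set of possible vectors removes one of the four bits of uncertainty for a sib pair, hence an information content of 0.25.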

52 Application: Information Content Mapping. Simulations (sib-pairs with/out parental genotypes): 1 micro-satellite per 10 cM (ABI); 1 microsatellite per 3 cM (deCODE); 1 SNP per 0.5 cM (Illumina); 1 SNP per 0.2 cM (Affymetrix). Which panel performs best in terms of extracting marker information? Do the results depend upon the presence of parental genotypes? merlin -d file.dat -p file.ped -m file.map --information --step 1 --markernames

53 SNPs vs Microsatellites with parents [Plot: information content against position (cM) for SNP densities of 0.2 cM and 0.5 cM and microsatellite densities of 3 cM and 10 cM; series: SNPs + parents, microsat + parents]

54 SNPs vs Microsatellites without parents [Plot: information content against position (cM) for SNP densities of 0.2 cM and 0.5 cM and microsatellite densities of 3 cM and 10 cM; series: SNPs - parents, microsat - parents]


More information

THEORETICAL ASPECTS OF PEDIGREE ANALYSIS

THEORETICAL ASPECTS OF PEDIGREE ANALYSIS THEORETICAL ASPECTS OF PEDIGREE ANALYSIS E. Ginsburg, I. Malkin, R.C. Elston THEORETICAL ASPECTS OF PEDIGREE ANALYSIS RAMOT PUBLISHING HOUSE TEL-AVIV UNIVERSITY, ISRAEL E. Ginsburg, I. Malkin, R.C. Elston

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA

More information

Life Cycles, Meiosis and Genetic Variability24/02/2015 2:26 PM

Life Cycles, Meiosis and Genetic Variability24/02/2015 2:26 PM Life Cycles, Meiosis and Genetic Variability iclicker: 1. A chromosome just before mitosis contains two double stranded DNA molecules. 2. This replicated chromosome contains DNA from only one of your parents

More information

Use of hidden Markov models for QTL mapping

Use of hidden Markov models for QTL mapping Use of hidden Markov models for QTL mapping Karl W Broman Department of Biostatistics, Johns Hopkins University December 5, 2006 An important aspect of the QTL mapping problem is the treatment of missing

More information

Lander-Green Algorithm. Biostatistics 666

Lander-Green Algorithm. Biostatistics 666 Lander-Green Algorith Biostatistics 666 Identity-by-Descent (IBD) A roerty o chroosoe stretches that descend ro the sae ancestor Two alleles are IBD i they descend ro the sae ounder chroosoe For a air

More information

Lecture 11: Multiple trait models for QTL analysis

Lecture 11: Multiple trait models for QTL analysis Lecture 11: Multiple trait models for QTL analysis Julius van der Werf Multiple trait mapping of QTL...99 Increased power of QTL detection...99 Testing for linked QTL vs pleiotropic QTL...100 Multiple

More information

Learning ancestral genetic processes using nonparametric Bayesian models

Learning ancestral genetic processes using nonparametric Bayesian models Learning ancestral genetic processes using nonparametric Bayesian models Kyung-Ah Sohn October 31, 2011 Committee Members: Eric P. Xing, Chair Zoubin Ghahramani Russell Schwartz Kathryn Roeder Matthew

More information

1. Understand the methods for analyzing population structure in genomes

1. Understand the methods for analyzing population structure in genomes MSCBIO 2070/02-710: Computational Genomics, Spring 2016 HW3: Population Genetics Due: 24:00 EST, April 4, 2016 by autolab Your goals in this assignment are to 1. Understand the methods for analyzing population

More information

SNP Association Studies with Case-Parent Trios

SNP Association Studies with Case-Parent Trios SNP Association Studies with Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health September 3, 2009 Population-based Association Studies Balding (2006). Nature

More information

Lesson 4: Understanding Genetics

Lesson 4: Understanding Genetics Lesson 4: Understanding Genetics 1 Terms Alleles Chromosome Co dominance Crossover Deoxyribonucleic acid DNA Dominant Genetic code Genome Genotype Heredity Heritability Heritability estimate Heterozygous

More information

A mixed model based QTL / AM analysis of interactions (G by G, G by E, G by treatment) for plant breeding

A mixed model based QTL / AM analysis of interactions (G by G, G by E, G by treatment) for plant breeding Professur Pflanzenzüchtung Professur Pflanzenzüchtung A mixed model based QTL / AM analysis of interactions (G by G, G by E, G by treatment) for plant breeding Jens Léon 4. November 2014, Oulu Workshop

More information

Objectives. Announcements. Comparison of mitosis and meiosis

Objectives. Announcements. Comparison of mitosis and meiosis Announcements Colloquium sessions for which you can get credit posted on web site: Feb 20, 27 Mar 6, 13, 20 Apr 17, 24 May 15. Review study CD that came with text for lab this week (especially mitosis

More information

Genotype Imputation. Class Discussion for January 19, 2016

Genotype Imputation. Class Discussion for January 19, 2016 Genotype Imputation Class Discussion for January 19, 2016 Intuition Patterns of genetic variation in one individual guide our interpretation of the genomes of other individuals Imputation uses previously

More information

Major Genes, Polygenes, and

Major Genes, Polygenes, and Major Genes, Polygenes, and QTLs Major genes --- genes that have a significant effect on the phenotype Polygenes --- a general term of the genes of small effect that influence a trait QTL, quantitative

More information

EXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series:

EXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series: Statistical Genetics Agronomy 65 W. E. Nyquist March 004 EXERCISES FOR CHAPTER 7 Exercise 7.. Derive the two scales of relation for each of the two following recurrent series: u: 0, 8, 6, 48, 46,L 36 7

More information

Outline. P o purple % x white & white % x purple& F 1 all purple all purple. F purple, 224 white 781 purple, 263 white

Outline. P o purple % x white & white % x purple& F 1 all purple all purple. F purple, 224 white 781 purple, 263 white Outline - segregation of alleles in single trait crosses - independent assortment of alleles - using probability to predict outcomes - statistical analysis of hypotheses - conditional probability in multi-generation

More information

Q Expected Coverage Achievement Merit Excellence. Punnett square completed with correct gametes and F2.

Q Expected Coverage Achievement Merit Excellence. Punnett square completed with correct gametes and F2. NCEA Level 2 Biology (91157) 2018 page 1 of 6 Assessment Schedule 2018 Biology: Demonstrate understanding of genetic variation and change (91157) Evidence Q Expected Coverage Achievement Merit Excellence

More information

Methods for QTL analysis

Methods for QTL analysis Methods for QTL analysis Julius van der Werf METHODS FOR QTL ANALYSIS... 44 SINGLE VERSUS MULTIPLE MARKERS... 45 DETERMINING ASSOCIATIONS BETWEEN GENETIC MARKERS AND QTL WITH TWO MARKERS... 45 INTERVAL

More information

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin CHAPTER 1 1.2 The expected homozygosity, given allele

More information

Friday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo

Friday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo Friday Harbor 2017 From Genetics to GWAS (Genome-wide Association Study) Sept 7 2017 David Fardo Purpose: prepare for tomorrow s tutorial Genetic Variants Quality Control Imputation Association Visualization

More information

Case-Control Association Testing with Related Individuals: A More Powerful Quasi-Likelihood Score Test

Case-Control Association Testing with Related Individuals: A More Powerful Quasi-Likelihood Score Test Case-Control Association esting with Related Individuals: A More Powerful Quasi-Likelihood Score est imothy hornton and Mary Sara McPeek ARICLE We consider the problem of genomewide association testing

More information

Genetics (patterns of inheritance)

Genetics (patterns of inheritance) MENDELIAN GENETICS branch of biology that studies how genetic characteristics are inherited MENDELIAN GENETICS Gregory Mendel, an Augustinian monk (1822-1884), was the first who systematically studied

More information

A Robust Identity-by-Descent Procedure Using Affected Sib Pairs: Multipoint Mapping for Complex Diseases

A Robust Identity-by-Descent Procedure Using Affected Sib Pairs: Multipoint Mapping for Complex Diseases Original Paper Hum Hered 001;51:64 78 Received: May 1, 1999 Revision received: September 10, 1999 Accepted: October 6, 1999 A Robust Identity-by-Descent Procedure Using Affected Sib Pairs: Multipoint Mapping

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA

More information

Normal approximation to Binomial

Normal approximation to Binomial Normal approximation to Binomial 24.10.2007 GE02: day 3 part 3 Yurii Aulchenko Erasmus MC Rotterdam Binomial distribution at different n and p 0.3 N=10 0.2 N=25 0.1 N=100 P=0.5 0.2 0.1 0.1 0 0 0 k k k

More information

EM algorithm. Rather than jumping into the details of the particular EM algorithm, we ll look at a simpler example to get the idea of how it works

EM algorithm. Rather than jumping into the details of the particular EM algorithm, we ll look at a simpler example to get the idea of how it works EM algorithm The example in the book for doing the EM algorithm is rather difficult, and was not available in software at the time that the authors wrote the book, but they implemented a SAS macro to implement

More information

Covariance between relatives

Covariance between relatives UNIVERSIDADE DE SÃO PAULO ESCOLA SUPERIOR DE AGRICULTURA LUIZ DE QUEIROZ DEPARTAMENTO DE GENÉTICA LGN5825 Genética e Melhoramento de Espécies Alógamas Covariance between relatives Prof. Roberto Fritsche-Neto

More information

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics.

Major questions of evolutionary genetics. Experimental tools of evolutionary genetics. Theoretical population genetics. Evolutionary Genetics (for Encyclopedia of Biodiversity) Sergey Gavrilets Departments of Ecology and Evolutionary Biology and Mathematics, University of Tennessee, Knoxville, TN 37996-6 USA Evolutionary

More information

Constructing a Pedigree

Constructing a Pedigree Constructing a Pedigree Use the appropriate symbols: Unaffected Male Unaffected Female Affected Male Affected Female Male carrier of trait Mating of Offspring 2. Label each generation down the left hand

More information

Phasing via the Expectation Maximization (EM) Algorithm

Phasing via the Expectation Maximization (EM) Algorithm Computing Haplotype Frequencies and Haplotype Phasing via the Expectation Maximization (EM) Algorithm Department of Computer Science Brown University, Providence sorin@cs.brown.edu September 14, 2010 Outline

More information

Computation of Multilocus Prior Probability of Autozygosity for Complex Inbred Pedigrees

Computation of Multilocus Prior Probability of Autozygosity for Complex Inbred Pedigrees Genetic Epidemiology 14:1 15 (1997) Computation of Multilocus Prior Probability of Autozygosity for Complex Inbred Pedigrees Sun-Wei Guo* Department of Biostatistics, University of Michigan, Ann Arbor

More information

Mapping quantitative trait loci in oligogenic models

Mapping quantitative trait loci in oligogenic models Biostatistics (2001), 2, 2,pp. 147 162 Printed in Great Britain Mapping quantitative trait loci in oligogenic models HSIU-KHUERN TANG, D. SIEGMUND Department of Statistics, 390 Serra Mall, Sequoia Hall,

More information

25 : Graphical induced structured input/output models

25 : Graphical induced structured input/output models 10-708: Probabilistic Graphical Models 10-708, Spring 2016 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Raied Aljadaany, Shi Zong, Chenchen Zhu Disclaimer: A large

More information