Linkage analysis and QTL mapping in autotetraploid species. Christine Hackett, Biomathematics and Statistics Scotland, Dundee DD2 5DA

Collaborators: John Bradshaw, Zewei Luo, Iain Milne, Jim McNicol. Data and useful biological discussions from Barnaly Pande, Robbie Waugh, Dan Milbourne, Glenn Bryan, Karen McLean and Rhonda Meyer.

Outline. Part 1: segregation analysis, cluster analysis, linkage analysis. Part 2: QTL analysis.

Part 1: Segregation analysis and linkage analysis. Segregation analysis: identify parental genotypes from parent and offspring phenotypes. Cluster analysis: partition markers into linkage groups. Linkage analysis: (1) estimate the most likely phase between each pair of markers and calculate recombination frequencies and lod scores; (2) order markers to form linkage maps; (3) establish marker phases for the completed linkage group.

Inheritance in tetraploids: random chromosomal segregation. [Diagram: the four homologous chromosomes 1, 2, 3, 4 of a parent pair at random, as 1-2 and 3-4, or 1-3 and 2-4, or 1-4 and 2-3; recombination occurs within pairs; the resulting gametes are shown.]

Segregation analysis: gamete formation. A parent with 4 alleles abcd can produce gametes ab, ac, ad, bc, bd, cd with equal probability. There is also a small probability of producing gametes aa, bb, cc, dd by double reduction: the probability of aa etc. is α/4 and the probability of ab etc. is (1 - α)/6, where α is the coefficient of double reduction. When crossed with a second parent efgh, there are 36 offspring genotypes if there is no double reduction, and 100 if double reduction occurs. In practice it is very unusual to have 8 different alleles. Null alleles can also occur.
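The gamete probabilities above can be checked with a short sketch. The function names and the value of α used here are illustrative assumptions, not part of TetraploidMap.

```python
from fractions import Fraction
from itertools import combinations


def gamete_probabilities(alleles, alpha):
    """Return {gamete: probability} for a parent with four distinct alleles.

    alpha is the coefficient of double reduction (a Fraction).
    """
    probs = {}
    for a in alleles:                      # double-reduction gametes aa, bb, cc, dd
        probs[a + a] = alpha / 4
    for a, b in combinations(alleles, 2):  # ordinary gametes ab, ac, ad, bc, bd, cd
        probs[a + b] = (1 - alpha) / 6
    return probs


def n_offspring_genotypes(alpha):
    """Count offspring genotypes from abcd x efgh that have non-zero probability."""
    g1 = gamete_probabilities("abcd", alpha)
    g2 = gamete_probabilities("efgh", alpha)
    n1 = sum(1 for p in g1.values() if p > 0)
    n2 = sum(1 for p in g2.values() if p > 0)
    return n1 * n2


alpha = Fraction(1, 10)  # hypothetical value, for illustration only
probs = gamete_probabilities("abcd", alpha)
total = sum(probs.values())                      # probabilities sum to 1
without_dr = n_offspring_genotypes(Fraction(0))  # 6 x 6 genotypes
with_dr = n_offspring_genotypes(alpha)           # 10 x 10 genotypes
```

Exact fractions are used so that the total probability can be compared with 1 without rounding error.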

Theoretical segregation ratios (no double reduction): simplex, duplex, double-simplex.
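As a sketch of where such ratios come from, the following enumerates all gamete pairs for a dominant marker allele A under random chromosomal segregation with no double reduction. The genotype strings (with O for absence of the band) and the function name are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product


def presence_ratio(parent1, parent2):
    """Return (P(band present), P(band absent)) among offspring of two tetraploids.

    Each parent is a string of four alleles; 'A' is the scored band, 'O' its absence.
    """
    gametes1 = list(combinations(parent1, 2))  # 6 equally likely gametes
    gametes2 = list(combinations(parent2, 2))
    total = len(gametes1) * len(gametes2)
    present = sum(1 for g1, g2 in product(gametes1, gametes2) if "A" in g1 + g2)
    return Fraction(present, total), Fraction(total - present, total)


simplex = presence_ratio("AOOO", "OOOO")         # 1:1 present:absent
duplex = presence_ratio("AAOO", "OOOO")          # 5:1 present:absent
double_simplex = presence_ratio("AOOO", "AOOO")  # 3:1 present:absent
```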

Outline. Part 1: segregation analysis, cluster analysis, linkage analysis.

Use of cluster analysis. (a) Cluster analysis of simplex markers to determine homologous chromosomes: the distance between markers is related to the recombination frequency, and markers are labelled to show their cluster. (b) Cluster analysis of all markers: calculate a χ² test for independent segregation; the distance between markers is related to the significance. Dendrograms are based on single linkage and average linkage cluster analysis.

Cluster analysis. Groups of simplex markers on homologous chromosomes are pulled together by other markers.

Outline. Part 1: segregation analysis, cluster analysis, linkage analysis.

Linkage analysis. Assume no double reduction. Calculate the recombination frequency and lod score between each pair of markers in a linkage group, for each possible phase. [Diagram: chromosomes 1, 2, 3, 4 of one parent; A, B and C are simplex markers, each present on one chromosome of that parent.]

            B present   B absent
A present   n_1         n_2
A absent    n_3         n_4

If the markers are in coupling phase (e.g. A, B), $r = (n_2 + n_3)/n$. In repulsion phase this formula gives a recombination frequency > 0.5; here $r = 3(n_1 + n_4)/n - 1$. NB: for diploid repulsion, $r = (n_1 + n_4)/n$.
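A minimal sketch of these two estimators for a pair of simplex markers follows. The counts are hypothetical, n is taken to be the total n_1 + n_2 + n_3 + n_4, and the lod score shown is a plain binomial likelihood ratio against r = 0.5. In the repulsion case the class probability (1 + r)/3 is obtained by inverting the estimator quoted on the slide; this may differ in detail from what TetraploidMap computes.

```python
import math


def rf_coupling(n1, n2, n3, n4):
    """Coupling phase: recombinants are the classes with only one marker present."""
    return (n2 + n3) / (n1 + n2 + n3 + n4)


def rf_repulsion_tetraploid(n1, n2, n3, n4):
    """Repulsion phase in a tetraploid: r = 3(n1 + n4)/n - 1.

    The raw estimate can fall below 0 through sampling error; it is not truncated here.
    """
    return 3 * (n1 + n4) / (n1 + n2 + n3 + n4) - 1


def binomial_lod(k, n, p_null=0.5):
    """log10 likelihood ratio for k of n offspring in a class, at p = k/n versus p_null."""
    p_hat = k / n
    lod = 0.0
    if k > 0:
        lod += k * math.log10(p_hat / p_null)
    if n - k > 0:
        lod += (n - k) * math.log10((1 - p_hat) / (1 - p_null))
    return lod


# Hypothetical counts (n1, n2, n3, n4), for illustration only.
coupling_counts = (45, 5, 5, 45)
repulsion_counts = (18, 32, 32, 18)

r_coupling = rf_coupling(*coupling_counts)                  # (5 + 5) / 100
r_repulsion = rf_repulsion_tetraploid(*repulsion_counts)    # 3 * 36 / 100 - 1
lod_coupling = binomial_lod(5 + 5, 100)                     # recombinant class n2 + n3
lod_repulsion = binomial_lod(18 + 18, 100)                  # class n1 + n4, p = (1 + r)/3
```

With r = 0.5 both class probabilities equal 0.5, so the same null value applies in either phase.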

Linkage analysis. The most informative situation would be 8 different alleles among the parents at each locus, i.e. aa/bb/cc/dd x ee/ff/gg/hh. This gives 36 x 36 = 1296 types of offspring! For each phase, we can classify the number of recombinants. Parent 1 gametes: 6 of type aa/bb (0 recombinants), 24 of type ac/bb (1 recombinant), 6 of type ac/bd (2 recombinants). The same holds for parent 2.
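The 6 / 24 / 6 split of parent 1 gamete types can be reproduced by enumeration. This sketch assumes that a gamete type is defined by the pair of alleles it carries at each of the two loci, and that the number of recombinants is counted as 2 minus the number of alleles shared between the two loci (the minimum number needed).

```python
from collections import Counter
from itertools import combinations, product

chromosomes = "abcd"                        # parent 1 in phase aa/bb/cc/dd
pairs = list(combinations(chromosomes, 2))  # 6 possible allele pairs at one locus

recombinant_counts = Counter()
for locus1, locus2 in product(pairs, pairs):
    shared = len(set(locus1) & set(locus2))  # chromosomes transmitted intact
    recombinant_counts[2 - shared] += 1

n_gamete_types = sum(recombinant_counts.values())
```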

Linkage analysis. Offspring from parents aa/bb/cc/dd x ee/ff/gg/hh: the recombination frequency is [equation not preserved in this transcript]. In general, not all alleles will be different. However, we can estimate the frequencies associated with 0-4 recombinants, and hence the recombination frequency, via an EM algorithm.

Linkage phase. The linkage phase must also be taken into consideration, e.g. aa/bb/cc/dd x ee/ff/gg/hh and ab/ba/cc/dd x ee/ff/gg/hh are different phases and will give different probabilities for the offspring classes. There are up to 24 phases for each parent (24 x 24 = 576 maximum); we have to estimate the recombination frequency for each and compare the likelihoods to see which is the most likely phase. The information associated with r also depends on the phase.
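The count of 24 phases per parent corresponds to the 4! ways of assigning the second-locus alleles to the four chromosomes when all alleles are distinct. A small sketch, with variable names chosen for illustration:

```python
from itertools import permutations

locus1 = "abcd"  # first-locus alleles on chromosomes 1-4, held fixed

# Each permutation of the second-locus alleles gives one phase, e.g. "ab/ba/cc/dd".
phases_one_parent = [
    "/".join(x + y for x, y in zip(locus1, perm)) for perm in permutations("abcd")
]

n_phases_one_parent = len(phases_one_parent)
n_phase_combinations = n_phases_one_parent ** 2  # both parents considered together
```

Fewer distinct phases remain when some alleles are identical, which is why 24 is an upper bound.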

Marker ordering. After estimation of the recombination frequency between all pairs of markers, the markers are ordered by optimising a weighted least squares criterion (the same as JoinMap 3). There are three ordering methods: an initial run (based on seriation), a ripple search, and a simulated annealing search (slow). We recommend a ripple search initially, to see whether there are any spurious markers, and finally a simulated annealing search to obtain the final order.
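To illustrate the idea of a ripple search, the sketch below permutes each window of three adjacent markers and keeps any permutation that improves the criterion. For simplicity it uses the sum of adjacent recombination frequencies as a stand-in criterion, not the weighted least squares criterion named above, and the marker positions are hypothetical.

```python
import math
from itertools import permutations


def sarf(order, rf):
    """Stand-in criterion: sum of adjacent recombination frequencies (lower is better)."""
    return sum(rf[a][b] for a, b in zip(order, order[1:]))


def ripple(order, rf, window=3):
    """Try every permutation within each sliding window until nothing improves."""
    best = list(order)
    best_score = sarf(best, rf)
    improved = True
    while improved:
        improved = False
        for start in range(len(best) - window + 1):
            current = list(best)
            for perm in permutations(current[start:start + window]):
                candidate = current[:start] + list(perm) + current[start + window:]
                score = sarf(candidate, rf)
                if score < best_score - 1e-12:
                    best, best_score = candidate, score
                    improved = True
    return best


def haldane_rf(distance_cm):
    """Haldane map function: recombination frequency from distance in centimorgans."""
    return 0.5 * (1 - math.exp(-2 * distance_cm / 100))


positions_cm = [0, 10, 20, 30, 40]  # hypothetical true positions of markers 0-4
rf = [[haldane_rf(abs(p - q)) for q in positions_cm] for p in positions_cm]

start_order = [0, 2, 1, 3, 4]       # two adjacent markers swapped
final_order = ripple(start_order, rf)
```

A local search of this kind can stall in a local optimum, which is consistent with the recommendation to finish with simulated annealing.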

Reconstructed map of parents

Locus  Position  P1 (chromosomes 1 2 3 4)  P2 (chromosomes 5 6 7 8)
L1      0.0      C B O D                   E A D C
L2     11.9      C D D A                   C D A O
L3     20.0      E E D D                   C O D A
L4     24.1      D C D O                   E O C E
L5     30.3      C D C O                   A E D D
L6     35.7      D A D C                   D B C E
L7     46.9      A O C B                   E O B B
L8     48.8      B E E A                   A O A O
L9     67.1      D E E O                   B C B B
L10    73.3      B E B E                   E A C O

Slides of data and software

Part of potato.loc file

228 73
PAGMAGT_205.0 1 1 0 0 0 1 0 1 0 0 1 9 0 1 0 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 1 1 1 1 1 0
PAGMAGT_174.0 1 1 0 1 1 0 0 0 1 1 1 9 0 1 0 0 1 0 0 0 1 1 1 1 1 0 0 0 0 1 0 1 1 1 1 1
PCAMAGG_114.5 1 1 0 1 1 1 1 0 1 1 1 1 0 0 0 0 0 1 0 0 1 1 0 1 1 1 0 0 0 1 0 1 1 1 0 1
s148_v 4 1 0 9 1 1 1 1 1 1 0 0 1 1 9 1 1 1 1 1 1 1 9 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 9 1 1 1 1 1 1 1 1 1 1 9 1 1 1 1 1 1 1 9 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 9 1 0 0 0 0 0 0 1 0 0 9 0 0 0 0 1 0 0 9 0 1 1 0 0 0 1 1 0 0 1 1 1 1 0 9 0 1 0 1 0 0 1 1 1 1 9 1 0 0 1 1 0 0 9 1 0 0 0 0 0 0 0 0 1 1 1 0
gp179_vb 1 1 1 0 9 9 9 1 1 0 0 0 9 1 1 9 0 1 1 1 1 1 9 0 1 0 1 1 1 1 1 1 0 0 1 0

Segregation analysis

Summary information

Parent 1 linkages

Select markers dialogue box

Dendrograms

Marker ordering dialogue box

Details of marker ordering

Outline. Part 1: segregation analysis, cluster analysis, linkage analysis. Part 2: QTL analysis.

Locating a QTL: preliminary ANOVA. For each marker, test for different trait means in the different marker classes. TetraploidMap has two options: the usual ANOVA, and a Kruskal-Wallis test for differences in trait medians. This is a useful first scan but is not fully informative, e.g. for a cross P1 x P2: AOOO x OOOO gives 2 classes (presence/absence of A), whereas ABCD x EFGH gives 36 classes. If different QTL alleles occur in P2, they will be detected only by ANOVA for a more distant marker. Solution: use all the marker information on each chromosome to infer QTL genotypes at each position.
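The per-marker ANOVA scan amounts to grouping trait values by marker class and computing a one-way F statistic. The sketch below does this in plain Python for a presence/absence marker; the scores and trait values are hypothetical, and no claim is made about how TetraploidMap implements the test.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA; groups is a list of lists of trait values."""
    all_values = [y for group in groups for y in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2 for group in groups
    )
    ss_within = sum(
        (y - sum(group) / len(group)) ** 2 for group in groups for y in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical data: marker band present (1) or absent (0), and a trait value.
marker_scores = [1, 1, 1, 0, 0, 0]
trait_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

classes = {}
for score, value in zip(marker_scores, trait_values):
    classes.setdefault(score, []).append(value)

f_statistic = one_way_anova_f(list(classes.values()))
```

The same function applies unchanged to a marker with more than two classes, since it only needs the list of groups.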

QTL genotypes. The QTL genotype cannot be observed like a marker. We assume 8 different QTL alleles: Q_1-Q_4 from parent 1 and Q_5-Q_8 from parent 2. There are 36 offspring genotypes, such as Q_1 Q_4 Q_5 Q_6. Each genotype may have a different effect on the trait. The most general model is main effects of alleles, two-way interactions etc.

Linear model for trait. Let X_i be an indicator of whether allele Q_i is present or absent in a genotype. The full model for offspring trait values is: [equation not preserved in this transcript]. Each offspring receives 2 alleles from each parent, which constrains the model. Possible models are effects of each allele, or of each QTL genotype.
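The 36 genotypes and their allele indicators can be listed directly, which also makes the constraint visible: exactly two of X_1 to X_4 and two of X_5 to X_8 equal 1 in every genotype. The representation of a genotype as a tuple of allele numbers is an assumption made for this sketch.

```python
from itertools import combinations, product

parent1_alleles = (1, 2, 3, 4)  # Q_1-Q_4
parent2_alleles = (5, 6, 7, 8)  # Q_5-Q_8

# One genotype = two alleles from parent 1 plus two alleles from parent 2.
genotypes = [
    g1 + g2
    for g1, g2 in product(combinations(parent1_alleles, 2),
                          combinations(parent2_alleles, 2))
]


def indicators(genotype):
    """Return [X_1, ..., X_8], where X_i = 1 if allele Q_i is in the genotype."""
    return [1 if i in genotype else 0 for i in range(1, 9)]


design_rows = [indicators(g) for g in genotypes]
example_row = indicators((1, 4, 5, 6))  # genotype Q_1 Q_4 Q_5 Q_6
```

Because of the two sum constraints, not all eight allele main effects can be estimated independently.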

QTL model. There are too few shared markers to be confident of the alignment of the parental maps, so the trait data are analysed for each parent separately. Fit the effects of the 6 possible QTL genotypes and compare with reduced models, e.g. a simplex allele or a dominant duplex allele. Assess significance by a permutation test.
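A permutation test shuffles the trait values relative to the genotypes and asks how often the shuffled data give a statistic at least as large as the observed one. The sketch below uses the absolute difference in means between carriers and non-carriers of a single allele as a simple stand-in statistic; the statistic actually permuted in TetraploidMap is not stated here, and the data are hypothetical.

```python
import random


def mean_difference(trait, present):
    """Absolute difference in mean trait value between allele carriers and non-carriers."""
    carriers = [y for y, x in zip(trait, present) if x]
    others = [y for y, x in zip(trait, present) if not x]
    return abs(sum(carriers) / len(carriers) - sum(others) / len(others))


def permutation_p_value(trait, present, n_perm=1000, seed=0):
    """Estimate a p-value by shuffling trait values against fixed genotypes."""
    rng = random.Random(seed)
    observed = mean_difference(trait, present)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_difference(shuffled, present) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps the estimate above 0


# Hypothetical data: allele present in the first six offspring only.
present = [1] * 6 + [0] * 6
trait = [10.1, 10.4, 9.8, 10.9, 10.2, 10.6, 5.1, 4.8, 5.5, 5.0, 4.7, 5.3]

p_value = permutation_p_value(trait, present)
```

For a genome scan, the statistic permuted would normally be the maximum over all tested positions, so that the threshold accounts for multiple testing.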

Reconstruction of offspring: 1

Locus  P1 (1 2 3 4)  P2 (5 6 7 8)  Offspring phenotype  Compatible configurations
L1     C B O D       E A D C       ACD                  {1367, 1467, 1468, 3468}
L2     C D D A       C D A O       ACD
L3     E E D D       C O D A       CDE
L4     D C D O       E O C E       CDE
L5     C D C O       A E D D       CDE
L6     D A D C       D B C E       ABDE                 {1268, 2368}
L7     A O C B       E O B B       AE                   {1256}
L8     B E E A       A O A O       ABE
L9     D E E O       B C B B       BE
L10    B E B E       E A C O       BCE

Reconstruction of offspring: 2. For each offspring: (1) identify the configurations for each marker locus; (2) search for complete chromosome configurations that have the minimum number of crossovers (branch and bound algorithm) and are compatible with bivalent pairing. This is biologically realistic in potato, where there are few crossovers. We can represent the result as a graphical genotype.
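The first step, identifying the configurations compatible with one marker locus, can be sketched as an enumeration over the 36 ways of inheriting two chromosomes from each parent. The sketch assumes that O denotes a null allele (no band) and that the offspring phenotype is the set of bands observed; the function name is hypothetical. The allele strings are those of loci L1, L6 and L7 in the reconstructed parental map.

```python
from itertools import combinations, product


def compatible_configurations(parent_alleles, offspring_phenotype):
    """List chromosome sets (two of 1-4, two of 5-8) that reproduce the phenotype.

    parent_alleles: 8 letters, the allele on chromosomes 1-8 ('O' = null allele).
    offspring_phenotype: string of the bands observed in the offspring.
    """
    observed = set(offspring_phenotype)
    configurations = []
    for pair1, pair2 in product(combinations(range(1, 5), 2),
                                combinations(range(5, 9), 2)):
        chromosomes = pair1 + pair2
        bands = {parent_alleles[c - 1] for c in chromosomes} - {"O"}
        if bands == observed:
            configurations.append(chromosomes)
    return configurations


configs_l1 = compatible_configurations("CBODEADC", "ACD")
configs_l6 = compatible_configurations("DADCDBCE", "ABDE")
configs_l7 = compatible_configurations("AOCBEOBB", "AE")
```

The second step, joining per-locus configurations into whole-chromosome configurations with minimum crossovers, is not attempted in this sketch.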

Reconstruction of offspring: 3. These configurations have 6 crossovers. There are 3 areas of uncertainty, giving 8 configurations. We can trace chromosome sections from parent to offspring.

Locus  Offspring phenotype  Inherited alleles
(parental chromosomes at the top of the map: 1 4 6 8)
L1     ACD                  C D A C
L2     ACD                  C A D C/O
L3     CDE                  E D O C
L4     CDE                  D C O E/E
L5     CDE                  C D E D
L6     ABDE                 D A B E
L7     AE                   A O O E
L8     ABE                  B E O/A A
L9     BE                   E E B B
L10    BCE                  B E C E
(parental chromosomes at the bottom of the map: 3 2 7 5)

Inference of QTL genotype for each configuration. [Figure: graphical genotype of the offspring across loci L1-L10 (phenotypes ACD, ACD, CDE, CDE, CDE, ABDE, AE, ABE, BE, BCE), with parental chromosomes 1 4 6 8 at the top and 3 2 7 5 at the bottom.] Annotations on the figure: "4 genotypes are possible here"; "QTL genotype is 1268 or 1265, with probability 0.5 halfway between loci"; "QTL genotype is 3275, probability 1".

Model fitting (1). In practice, we consider each position along the chromosome in turn and assess the likelihood of a QTL. For individual i, write y_i = trait value, o_i = marker data, G_i = set of compatible configurations, Q_i = set of QTL genotypes at that location. The likelihood equation is: [equation not preserved in this transcript]. This is a regression of trait value on QTL genotype, weighted by QTL genotype probability.
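The phrase "regression of trait value on QTL genotype, weighted by QTL genotype probability" can be illustrated with a weighted least squares sketch at a single map position. This is an approximation under stated assumptions, not the likelihood from the slide: each individual contributes to every genotype class in proportion to its genotype probability, and the lod score is approximated from the ratio of residual sums of squares. All names and data are hypothetical, and only two genotype classes are used to keep the example short.

```python
import math


def weighted_genotype_means(y, weights):
    """Weighted mean trait value per genotype; weights[i][q] = P(genotype q | markers)."""
    n_genotypes = len(weights[0])
    means = []
    for q in range(n_genotypes):
        total_weight = sum(w[q] for w in weights)
        means.append(sum(w[q] * yi for w, yi in zip(weights, y)) / total_weight)
    return means


def weighted_rss(y, weights, means):
    """Residual sum of squares, each residual weighted by its genotype probability."""
    return sum(
        w[q] * (yi - means[q]) ** 2
        for w, yi in zip(weights, y)
        for q in range(len(means))
    )


def approximate_lod(y, weights):
    """Regression-style lod approximation: (n/2) log10(RSS_null / RSS_qtl)."""
    n = len(y)
    grand_mean = sum(y) / n
    rss_null = sum((yi - grand_mean) ** 2 for yi in y)
    rss_qtl = weighted_rss(y, weights, weighted_genotype_means(y, weights))
    return (n / 2) * math.log10(rss_null / rss_qtl)


# Hypothetical trait values and genotype probabilities at one map position.
y = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 5.0]
weights = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [0.5, 0.5]]

means = weighted_genotype_means(y, weights)
lod = approximate_lod(y, weights)
```

Repeating the calculation at each position along the chromosome, with the weights recomputed from the marker data, would give a profile of lod against position.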

Lod profile for QTL location

Slides of data and software

Trait data

2
     maturity  tub_blight%
1    4.977     82.50
2    3.442     85.00
3    5.976     50.71
4    6.842     19.17
5    5.334     28.36
6    2.705     97.50
7    4.025     68.66
8    2.657     90.00
9    4.337     -99.0
10   6.667     60.00
11   2.966     70.00
12   6.217     18.66
13   6.211     -99.0
14   4.620     92.86
15   4.763     61.11
16   5.763     -99.0
17   4.930     21.54
18   5.465     63.33

ANOVA of trait data

QTL analysis of chromosome V

Comparison with simpler model

For further details and references consult the TetraploidMap documentation.