Checkng Parwse Relatonshps Lecture 19 Bostatstcs 666
Last Lecture: Markov Model for Multpont Analyss X X X 1 3 X M P X 1 I P X I P X 3 I P X M I 1 3 M I 1 I I 3 I M P I I P I 3 I P... 1 IBD states along the chromosome are modeled usng a Markov Chan
The Lkelhood of Marker Data L M M P I 1 P I I 1 P X I I1 I IM 1... General, but slow unless there are only a few markers. Combned wth Bayes Theorem allows us to estmate probablty of IBD states at any marker.
Worked Example Consder two loc separated by θ.1 Each loc has two alleles, each wth frequency.5 If two sblngs have the followng genotypes: Sb1 Sb Marker A: 1/1 / Marker B: 1/1 1/1 What s the probablty of IBD at marker B when You consder marker B alone? You consder both markers smultaneously?
Soluton I 1 I PI 1 PI I 1 PX 1 I 1 PX I Prob 1 1 1 1 1 1
Soluton I 1 I PI 1 PI I 1 PX 1 I 1 PX I Prob.5.67.65.65.66 1.5.3.65.15.58.5.3.65.5.13 1.5.15.65. 1 1.5.7.15. 1.5.15.5..5.3.65. 1.5.3.15..5.67.5.
Soluton Takng nto account all avalable genotype data PI 1.9 PI 1 1.4 PI 1.49 Consderng only one marker, the correspondng probabltes would be.44,.44 and.11. Qute a dfference!, but whch value do you expect to be more accurate?
The Lkelhood of Marker Data L M M P I 1 P I I 1 P X I I1 I IM 1... General, but slow unless there are only a few markers. How do we speed thngs up?
Extendng the MLS Method We ust change the defnton for the weghts gven to each confguraton!...... 1 1 1 M I R I L I X P I X X P I X X P I X P w +
Today Checkng accuracy of reported relatonshps Why s ths an mportant problem? Markov Chan for Dfferent Relatve Pars Lkelhood approaches to relatonshp nference
Verfyng relatonshps s crucal Genetc analyses requre relatonshps to be specfed Msspecfyng relatonshps can lead to tests of napproprate sze Inflate Type I error Decrease power
IBS Based Approach Relatve pars wll dffer n terms of ther genetc smlarty One way to contrast dfferent types of relatves s to compare ther overall smlarty, for example, by: Calculatng the mean IBS sharng Calculatng the varance of IBS sharng
Example ~8 marker genome scan Calculated IBS for each set of putatve relatonshps Unrelated pars Sblng pars Parent-offsprng pars
Putatve Unrelated Pars Mean.87 St. Dev..7
Parent-Offsprng Pars Mean 1.7 St. Dev..5
Putatve Sblng Pars Mean 1.3 St. Dev..9
Problem Indvduals Are Outlers Crcled pars are lkely msclassfed
Addtonal Informaton n Standard Devaton of IBS Sharng Standard Devaton of IBS Sharng Mean IBS Sharng
Addtonal Informaton n Standard Devaton of IBS Sharng Standard Devaton of IBS Sharng Mean IBS Sharng
Problems wth IBS Scores Ineffcent Ignore nformaton on allele frequences Ignore correlatons between neghborng markers not too bad f large amounts of data avalable Cannot dstngush some types of relatves
Strategy: Informaton we have: X observed genotypes at each marker p allele frequences at each marker θ - recombnaton fracton between consecutve markers PXR for each possble relatonshp R unrelated, half-sb, sb-pars, MZ twns
Lkelhood Sum over IBD states at each locaton Set of possble I changes wth R 1 1 1 1... I m I m I X P I I P I P L m
Notaton R Hypotheszed Relatonshp Ik Ikm, Ikf Allele sharng at locus k X Genotype par at locus k k α R P X1, X,..., X, I R k k1 k Jont probablty of data at frst k-1 markers and IBD vector I k at marker k
Detals on I Possble nhertance patterns, no sharng 1, share maternal allele,1 share paternal allele 1,1 share both alleles For convenence, separate IBD1 nto maternal and paternal sharng states
Algorthm for Lkelhood Calculaton + M M M k k k k k I X P R L t I X P R R R I P R, 1 1 1 α α α α
Relatonshp between I and R Probablty of I 1,, 1,,,1 and 1,1: MZ Twns,,, 1 Unrelated? Parent-Offsprng? Full sbs ¼, ¼, ¼, ¼ Maternal half sbs ½, ½,, Paternal half sbs?
PXI for pars of ndvduals GENOTYPE PX 1,X I for I X 1 X,,1 or 1, 1,1 4 p 3 p p p 3 p p p p p k p p p k 4p p p p p +p p p k 4p p p k p p p k kl 4p p p k p l
Transton Matrx Full Sbs 1 1 1 1 1 1 1 1 1 1 1 1 1 1,1,1 1,, 1,1,1 1,, θ θ,, 1 1 1,, r r t r +
Transton Matrx Maternal Half Sbs 1 1 1 1,1,1 1,, 1,1,1 1,, θ θ, 1, 1 1 1,, r r t r
Transton Matrx Paternal Half Sbs 1 1 1 1,1,1 1,, 1,1,1 1,, θ θ, 1, 1,, r r t r
Transton Matrx Unrelated, 1,,1 1,1, 1 1,,1 1,1
Transton Matrx MZ twns, 1,,1 1,1, 1,,1 1,1 1
Example I Consder genotypes for one marker X 1 1/1, 1/1 Assume p 1.,.5 or.8 Calculate PXR for each relatonshp MZ twn, Full Sbs, Half-Sbs, Unrelated
Example II Consder genotypes for markers X 1 1/1, / X 1/1, / Assume p 1 p ½ Assume θ.58,.1 θ.5,.5 Calculate PXR for each relatonshp
Smulatons θ.1, M5 Inferred R True R Full Sbs Half Sbs Unrelated Full Sbs.914.85.1 Half Sbs.44.87.81 Unrelated <.1.59.941
Smulatons θ., M5 Inferred R True R Full Sbs Half Sbs Unrelated Full Sbs.948.5 <.1 Half Sbs.38.899.64 Unrelated <.1.6.938
Smulatons θ.1, M4 Inferred R True R Full Sbs Half Sbs Unrelated Full Sbs 1. <.1 <.1 Half Sbs <.1 1. <.1 Unrelated <.1 <.1 1.
Bayesan Approach Alternatve to smply maxmzng PXRr Incorporates pror nformaton on the expected frequency of each relatve par R R X P R Pror r R X P R Pror X r R P
More dstant relatonshps
Problem Consder some genome scan data 38 mcrosatellte markers Consder some par of ndvduals Putatve sblngs Observed Sharng Identcal for 379/38 genotype pars LGRMZ Twns LGRAny other >
Soluton: Allow for Genotypng Errors A small proporton of errors could lead to msclassfcaton Allow for possbly erroneous genotypes ε error rate parameter [ ] ² 1 1 ² 1, 1 1 G X G P X G P I X G P I G P G X P I X P + ε ε ε
Conclusons Lkelhood approach provdes relable manner to nfer relatonshps Can ncorporate multple lnked markers Some dstant relatonshps can only be dscerned by lkelhood approach
Today Checkng of Relatonshps for Pars of Indvduals Multpont algorthm for calculatng lkelhoods for genotype data
Recommended Readng Boehnke and Cox 1997, Am J Hum Genet 61:43-49 Optonal Broman and Weber 1998, AJHG 63:1563-4 McPeek and Sun, AJHG 66:176-94 Epsten et al., AJHG 67:119-31