Multipoint Analysis for Sibling Pairs. Biostatistics 666 Lecture 18

Multpont Analyss for Sblng ars Bostatstcs 666 Lecture 8

revously Lnkage analyss wth pars of ndvduals Non-paraetrc BS Methods Maxu Lkelhood BD Based Method ossble Trangle Constrant

AS Methods Covered So Far ncreasng degrees of sophstcaton and coplexty n each case, only a sngle arker s evaluated

BS Based Lnkage Test χ df LOD [ N E N ] χ ln0 BS E N BS BS Expect counts calculated usng: Allele frequences for arker Relatonshp for affected ndvduals

Lkelhood for a Sngle AS L BD AS Genotypes BD 0 0 z w Rsch 990 defnes w Genotypes BD We only need proportonate w

MLS Lnkage Test 0 4 0 4 0 0 0 0,z,z z the LOD evaluated at the MLEs of The MLS statstc s log,, + + + + w w w w z w z w z LOD w z z z z L

ossble Trangle Constrant For any genetc odel, we expect ASs to be ore slar than unselected pars of sblngs. More precsely, Holans 993 showed that for any genetc odel z ¼ z ½ and z z 0 z 0 ¼

Further proveents All these ethods lose nforaton when a arker s unnforatve n a partcular faly Today, we wll see how to use neghborng arkers to extract ore nforaton about BD.

ntuton For Multpont Analyss BD changes nfrequently along the chroosoe Neghborng arkers can help resolve abgutes about BD sharng n the Rsch approach, they ght ensure that, effectvely, only one w s non-zero

Today Fraework for ultpont calculatons Frst, lkelhood of genotypes for seres of arkers Dscuss applcaton to the MLS lnkage test Later, we wll use t for useful applcatons such as error detecton and relatonshp nference Refresher on BD probabltes Usng a Markov Chan to speed analyses

ngredents 3 M One ngredent wll be the observed genotypes at each arker

ngredents 3 M 3 M 3 M 3 M Another ngredent wll be the possble BD states at each arker

ngredents 3 M 3 M 3 M 3 M 3... The fnal ngredent connects BD states along the chroosoe

The Lkelhood of Marker Data L M M M... General, but slow unless there are only a few arkers. Cobned wth Bayes Theore can estate probablty of each BD state at any arker.

The ngredents robablty of observed genotypes at each arker condtonal on BD state robablty of changes n BD state along chroosoe Hdden Markov Model

ror robablty of BD States

BD robabltes Nuber of alleles dentcal by descent For sblng pars, ust be: 0 Not always deterned by arker data

robablty of Observed Genotypes, Gven BD State

BD Sb CoSb 0 a,b c,d 4p a p b p c p d 0 0 a,a b,c p a p b p c 0 0 a,a b,b p a p b 0 0 a,b a,c 4p a p b p c p a p b p c 0 a,a a,b p 3 a p b p a p b 0 a,b a,b 4p a p b p a p b +p a p b p a p b 4 3 a,a a,a p a p a p a ror robablty ¼ ½ ¼ Note: Assung unordered genotypes

Queston: What to do about ssng data? What happens when soe genotype data s unavalable?

+ Model for Transtons n BD Along Chroosoe

+ Depends on recobnaton fracton θ Ths s a easure of dstance between two loc robablty grand-parental orgn of alleles changes between loc Naturally, leads to probablty of change n BD: ψ θ θ

+ BD state at arker BD State at + 0 0 -ψ ψ-ψ ψ ψ-ψ -ψ +ψ ψ-ψ ψ ψ-ψ -ψ ψ θ θ

+ All the ngredents!

Exaple Consder two loc separated by θ 0. Each loc has two alleles, each wth frequency.50 f two sblngs have the followng genotypes: Sb Sb Marker A: / / Marker B: / / What s the probablty of BD at arker B when You consder arker B alone? You consder both arkers sultaneously?

The Lkelhood of Marker Data L M M M... General, but slow unless there are only a few arkers. How do we speed thngs up?

A Markov Model Re-organze the coputaton slghtly, to avod evaluatng nested su drectly Three coponents: robablty consderng a sngle locaton robablty ncludng left flankng arkers robablty ncludng rght flankng arkers Scale of coputaton ncreases lnearly wth nuber of arkers

A Markov Rearrangeent + + + + 0,, 0,, k last k k LEFT L k k LEFT LEFT BD LEFT Usng ths arrangeent, we calculate the lkelhood by: Evaluatng LEFT functon at the frst poston Evaluatng LEFT functon along chroosoe Each te, re-usng results fro the prevous poston only Requred effort ncreases lnearly wth nuber of arkers Fnal suaton gves overall lkelhood

proveents The prevous arrangeent, quckly gves the lkelhood for any nuber of arkers A ore flexble arrangeent would allow us to quckly calculate condtonal BD probabltes along chroosoe

A More Flexble Arrangeent Sngle Marker Left Condtonal Rght Condtonal Full Lkelhood

The Lkelhood of Marker Data + M R L L...... A dfferent arrangeent of the sae lkelhood The nested suatons are now hdden nsde the L and R functons

Left-Chan robabltes,..., L L L roceed one arker at a te. Coputaton cost ncreases lnearly wth nuber of arkers.

Rght-Chan robabltes,..., + + + + + + + M M M R R R roceed one arker at a te. Coputaton cost ncreases lnearly wth nuber of arkers.

Extendng the MLS Method We ust change the defnton for the weghts gven to each confguraton!...... M R L w +

Soe Extensons We ll Dscuss Modelng error What coponents ght have to change? Modelng other types of relatves What coponents ght have to change? Modelng larger pedgrees

Today Effcent coputatonal fraework for ultpont analyss of sblng pars