Multipoint Analysis for Sibling Pairs. Biostatistics 666 Lecture 18

Similar documents
Checking Pairwise Relationships. Lecture 19 Biostatistics 666

Kinship Coefficients. Biostatistics 666

Excess Error, Approximation Error, and Estimation Error

COS 511: Theoretical Machine Learning

Chapter One Mixture of Ideal Gases

Chapter 12 Lyes KADEM [Thermodynamics II] 2007

System in Weibull Distribution

1 Review From Last Time

Applied Mathematics Letters

BAYESIAN CURVE FITTING USING PIECEWISE POLYNOMIALS. Dariusz Biskup

Computational and Statistical Learning theory Assignment 4

Economics 101. Lecture 4 - Equilibrium and Efficiency

XII.3 The EM (Expectation-Maximization) Algorithm

Chapter 13. Gas Mixtures. Study Guide in PowerPoint. Thermodynamics: An Engineering Approach, 5th edition by Yunus A. Çengel and Michael A.

Departure Process from a M/M/m/ Queue

Designing Fuzzy Time Series Model Using Generalized Wang s Method and Its application to Forecasting Interest Rate of Bank Indonesia Certificate

Fermi-Dirac statistics

Description of the Force Method Procedure. Indeterminate Analysis Force Method 1. Force Method con t. Force Method con t

total If no external forces act, the total linear momentum of the system is conserved. This occurs in collisions and explosions.

Least Squares Fitting of Data

1.3 Hence, calculate a formula for the force required to break the bond (i.e. the maximum value of F)

arxiv: v2 [math.co] 3 Sep 2017

Our focus will be on linear systems. A system is linear if it obeys the principle of superposition and homogenity, i.e.

Solutions for Homework #9

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Course 395: Machine Learning - Lectures

LECTURE :FACTOR ANALYSIS

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Gadjah Mada University, Indonesia. Yogyakarta State University, Indonesia Karangmalang Yogyakarta 55281

Least Squares Fitting of Data

Homework Notes Week 7

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

,..., k N. , k 2. ,..., k i. The derivative with respect to temperature T is calculated by using the chain rule: & ( (5) dj j dt = "J j. k i.

1 Definition of Rademacher Complexity

Week 5: Neural Networks

Linear Regression Analysis: Terminology and Notation

Denote the function derivatives f(x) in given points. x a b. Using relationships (1.2), polynomials (1.1) are written in the form

Source-Channel-Sink Some questions

Mean Field / Variational Approximations

Final Exam Solutions, 1998

ITERATIVE ESTIMATION PROCEDURE FOR GEOSTATISTICAL REGRESSION AND GEOSTATISTICAL KRIGING

Topic 4. Orthogonal contrasts [ST&D p. 183]

Several generation methods of multinomial distributed random number Tian Lei 1, a,linxihe 1,b,Zhigang Zhang 1,c

EEE 241: Linear Systems

Revision: December 13, E Main Suite D Pullman, WA (509) Voice and Fax

An Optimal Bound for Sum of Square Roots of Special Type of Integers

Discrete Memoryless Channels

The Parity of the Number of Irreducible Factors for Some Pentanomials

Affected Sibling Pairs. Biostatistics 666

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

18.1 Introduction and Recap

Goodness of fit and Wilks theorem

Chapter 1. Probability

Kernel Methods and SVMs Extension

Hidden Markov Models

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Open Systems: Chemical Potential and Partial Molar Quantities Chemical Potential

A NOTE ON CES FUNCTIONS Drago Bergholt, BI Norwegian Business School 2011

Hidden Markov Model Cheat Sheet

An Application of Fuzzy Hypotheses Testing in Radar Detection

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

Convergence of random processes

International Journal of Mathematical Archive-9(3), 2018, Available online through ISSN

Evaluation for sets of classes

Page 1. SPH4U: Lecture 7. New Topic: Friction. Today s Agenda. Surface Friction... Surface Friction...

x = , so that calculated

Hidden Markov Models

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

Limited Dependent Variables

Determination of the Confidence Level of PSD Estimation with Given D.O.F. Based on WELCH Algorithm

THE ARIMOTO-BLAHUT ALGORITHM FOR COMPUTATION OF CHANNEL CAPACITY. William A. Pearlman. References: S. Arimoto - IEEE Trans. Inform. Thy., Jan.

Finite Vector Space Representations Ross Bannister Data Assimilation Research Centre, Reading, UK Last updated: 2nd August 2003

Linear Momentum. Center of Mass.

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

Spatial Statistics and Analysis Methods (for GEOG 104 class).

PHYS 1443 Section 002 Lecture #20

LOGIT ANALYSIS. A.K. VASISHT Indian Agricultural Statistics Research Institute, Library Avenue, New Delhi

Markov Chain Monte Carlo Lecture 6

Motion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong

CHAPTER 10 ROTATIONAL MOTION

Hopfield networks and Boltzmann machines. Geoffrey Hinton et al. Presented by Tambet Matiisen

On the Calderón-Zygmund lemma for Sobolev functions

Chapter 1. Theory of Gravitation

Gradient Descent Learning and Backpropagation

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors

/ n ) are compared. The logic is: if the two

Outline. Prior Information and Subjective Probability. Subjective Probability. The Histogram Approach. Subjective Determination of the Prior Density

BIO Lab 2: TWO-LEVEL NORMAL MODELS with school children popularity data

Physics 231. Topic 8: Rotational Motion. Alex Brown October MSU Physics 231 Fall

First Year Examination Department of Statistics, University of Florida

Chapter 10 Sinusoidal Steady-State Power Calculations

Lecture 12: Discrete Laplacian

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

Module 3: Element Properties Lecture 1: Natural Coordinates

Stat 642, Lecture notes for 01/27/ d i = 1 t. n i t nj. n j

PROBABILITY AND STATISTICS Vol. III - Analysis of Variance and Analysis of Covariance - V. Nollau ANALYSIS OF VARIANCE AND ANALYSIS OF COVARIANCE

AN ANALYSIS OF A FRACTAL KINETICS CURVE OF SAVAGEAU

Lecture 19 of 42. MAP and MLE continued, Minimum Description Length (MDL)

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

Transcription:

Multpont Analyss for Sblng ars Bostatstcs 666 Lecture 8

revously Lnkage analyss wth pars of ndvduals Non-paraetrc BS Methods Maxu Lkelhood BD Based Method ossble Trangle Constrant

AS Methods Covered So Far ncreasng degrees of sophstcaton and coplexty n each case, only a sngle arker s evaluated

BS Based Lnkage Test χ df LOD [ N E N ] χ ln0 BS E N BS BS Expect counts calculated usng: Allele frequences for arker Relatonshp for affected ndvduals

Lkelhood for a Sngle AS L BD AS Genotypes BD 0 0 z w Rsch 990 defnes w Genotypes BD We only need proportonate w

MLS Lnkage Test 0 4 0 4 0 0 0 0,z,z z the LOD evaluated at the MLEs of The MLS statstc s log,, + + + + w w w w z w z w z LOD w z z z z L

ossble Trangle Constrant For any genetc odel, we expect ASs to be ore slar than unselected pars of sblngs. More precsely, Holans 993 showed that for any genetc odel z ¼ z ½ and z z 0 z 0 ¼

Further proveents All these ethods lose nforaton when a arker s unnforatve n a partcular faly Today, we wll see how to use neghborng arkers to extract ore nforaton about BD.

ntuton For Multpont Analyss BD changes nfrequently along the chroosoe Neghborng arkers can help resolve abgutes about BD sharng n the Rsch approach, they ght ensure that, effectvely, only one w s non-zero

Today Fraework for ultpont calculatons Frst, lkelhood of genotypes for seres of arkers Dscuss applcaton to the MLS lnkage test Later, we wll use t for useful applcatons such as error detecton and relatonshp nference Refresher on BD probabltes Usng a Markov Chan to speed analyses

ngredents 3 M One ngredent wll be the observed genotypes at each arker

ngredents 3 M 3 M 3 M 3 M Another ngredent wll be the possble BD states at each arker

ngredents 3 M 3 M 3 M 3 M 3... The fnal ngredent connects BD states along the chroosoe

The Lkelhood of Marker Data L M M M... General, but slow unless there are only a few arkers. Cobned wth Bayes Theore can estate probablty of each BD state at any arker.

The ngredents robablty of observed genotypes at each arker condtonal on BD state robablty of changes n BD state along chroosoe Hdden Markov Model

ror robablty of BD States

BD robabltes Nuber of alleles dentcal by descent For sblng pars, ust be: 0 Not always deterned by arker data

robablty of Observed Genotypes, Gven BD State

BD Sb CoSb 0 a,b c,d 4p a p b p c p d 0 0 a,a b,c p a p b p c 0 0 a,a b,b p a p b 0 0 a,b a,c 4p a p b p c p a p b p c 0 a,a a,b p 3 a p b p a p b 0 a,b a,b 4p a p b p a p b +p a p b p a p b 4 3 a,a a,a p a p a p a ror robablty ¼ ½ ¼ Note: Assung unordered genotypes

Queston: What to do about ssng data? What happens when soe genotype data s unavalable?

+ Model for Transtons n BD Along Chroosoe

+ Depends on recobnaton fracton θ Ths s a easure of dstance between two loc robablty grand-parental orgn of alleles changes between loc Naturally, leads to probablty of change n BD: ψ θ θ

+ BD state at arker BD State at + 0 0 -ψ ψ-ψ ψ ψ-ψ -ψ +ψ ψ-ψ ψ ψ-ψ -ψ ψ θ θ

+ All the ngredents!

Exaple Consder two loc separated by θ 0. Each loc has two alleles, each wth frequency.50 f two sblngs have the followng genotypes: Sb Sb Marker A: / / Marker B: / / What s the probablty of BD at arker B when You consder arker B alone? You consder both arkers sultaneously?

The Lkelhood of Marker Data L M M M... General, but slow unless there are only a few arkers. How do we speed thngs up?

A Markov Model Re-organze the coputaton slghtly, to avod evaluatng nested su drectly Three coponents: robablty consderng a sngle locaton robablty ncludng left flankng arkers robablty ncludng rght flankng arkers Scale of coputaton ncreases lnearly wth nuber of arkers

A Markov Rearrangeent + + + + 0,, 0,, k last k k LEFT L k k LEFT LEFT BD LEFT Usng ths arrangeent, we calculate the lkelhood by: Evaluatng LEFT functon at the frst poston Evaluatng LEFT functon along chroosoe Each te, re-usng results fro the prevous poston only Requred effort ncreases lnearly wth nuber of arkers Fnal suaton gves overall lkelhood

proveents The prevous arrangeent, quckly gves the lkelhood for any nuber of arkers A ore flexble arrangeent would allow us to quckly calculate condtonal BD probabltes along chroosoe

A More Flexble Arrangeent Sngle Marker Left Condtonal Rght Condtonal Full Lkelhood

The Lkelhood of Marker Data + M R L L...... A dfferent arrangeent of the sae lkelhood The nested suatons are now hdden nsde the L and R functons

Left-Chan robabltes,..., L L L roceed one arker at a te. Coputaton cost ncreases lnearly wth nuber of arkers.

Rght-Chan robabltes,..., + + + + + + + M M M R R R roceed one arker at a te. Coputaton cost ncreases lnearly wth nuber of arkers.

Extendng the MLS Method We ust change the defnton for the weghts gven to each confguraton!...... M R L w +

Soe Extensons We ll Dscuss Modelng error What coponents ght have to change? Modelng other types of relatves What coponents ght have to change? Modelng larger pedgrees

Today Effcent coputatonal fraework for ultpont analyss of sblng pars