Hidden Markov Models for the Assessment of Chromosomal Alterations using High-throughput SNP Arrays
1 Hidden Markov Models for the Assessment of Chromosomal Alterations using High-throughput SNP Arrays Department of Biostatistics Johns Hopkins Bloomberg School of Public Health November 18, 2008
2 Acknowledgments: Rob Scharpf; Giovanni Parmigiani; Rafael Irizarry + Benilton Carvalho; Jonathan Pevsner + Nate Miller, Eli Roberson, Jason Ting
3 Trisomy
4 Karyotypes General Cytogenetics Information
5 FISH Courtesy of the Pevsner Laboratory
6 SNP chip data
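7 Deletion [Figure: copy number and genotype calls (AA, AB, BB) by position (Mb); panels include Chr 12]
8 Amplification [Figure: copy number and genotype calls (AA, AB, BB) by position (Mb); Chr 7]
9 Uniparental Isodisomy [Figure: copy number and genotype calls (AA, AB, BB) by position (Mb); Chr 1 and Chr 2]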
10 Cancer samples
11 Mosaicism [Figure: copy number and genotype calls (AA, AB, BB) by position (Mb); Chr 2]
12 Estimation
1. By SNP: Estimate genotype and copy number for each SNP.
2. Within a sample: Borrow strength between SNPs to infer regions of LOH and copy number changes.
3. Between samples: Compare normal and disease populations to find chromosomal alterations associated with disease.
13 SNPchip S4 classes and methods
14 The structure of the data we observe At each SNP, we observe a noisy measure of the true copy number and genotype (and possibly also measures of confidence in those estimates).
15 Another HMM...?
Novel (and we believe, important) HMM features:
1. Model the observation sequence of genotype calls and copy number jointly (Vanilla).
2. Integrate confidence estimates of the genotype calls and copy number estimates (ICE).
QuantiSNP and PennCNV also model genotype and copy number jointly!
Colella et al. (2007), QuantiSNP: an objective Bayes hidden-Markov model..., Nucleic Acids Res 35(6).
Wang et al. (2007), PennCNV: an integrated hidden Markov model designed for..., Genome Research 17.
16 The Vanilla HMM components
- Observations $\widehat{CN}$ and $\widehat{GT}$
- Hidden states
- Initial state probability distribution
- Transition probabilities
- Emission probabilities
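To make the role of these components concrete, here is a minimal sketch of how an initial distribution, transition probabilities, and emission probabilities combine to decode the most likely hidden-state path (the Viterbi algorithm). It is not taken from the slides or from SNPchip: the function name `viterbi`, the two-state toy model, and all numeric values are assumptions chosen only for illustration.

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Most likely hidden-state path, computed in log space.

    log_init[s]        : log initial probability of state s
    log_trans[t][r][s] : log P(state s at SNP t+1 | state r at SNP t)
    log_emit[t][s]     : log emission probability of the observation at SNP t in state s
    """
    n_obs, n_states = len(log_emit), len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []
    for t in range(1, n_obs):
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at SNP t.
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[t - 1][r][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[t - 1][best][s] + log_emit[t][s])
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(range(n_states), key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical toy model: state 0 = normal, state 1 = altered, five SNPs.
stay, switch = math.log(0.8), math.log(0.2)
log_trans = [[[stay, switch], [switch, stay]]] * 4
emit = [(0.9, 0.1), (0.1, 0.9), (0.1, 0.9), (0.1, 0.9), (0.9, 0.1)]
log_emit = [[math.log(p0), math.log(p1)] for p0, p1 in emit]
log_init = [math.log(0.5)] * 2
path = viterbi(log_init, log_trans, log_emit)
print(path)
```

The transition array is indexed by SNP so that it can vary with the distance between neighboring SNPs.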
17 Hidden states
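18 Transition probabilities
Following suggestions in the literature, we model the transition probabilities as a function of the distance $d$ between SNPs. Specifically, let $\theta(d) = 1 - e^{-2d}$ denote the probability that SNP $i$ is not informative ($I^c$) for the SNP at $i+1$. For example, for a transition from state $S_i$ at SNP $i$ to a different state $S_{i+1}$ at SNP $i+1$ (the state symbols used on the slide were lost in transcription):
$$
\begin{aligned}
\tau(d) &= P\{S_{i+1} \mid S_i, d\} \\
&= P\{S_{i+1}, I \mid S_i, d\} + P\{S_{i+1}, I^c \mid S_i, d\} \\
&= P\{S_{i+1} \mid I, S_i, d\}\, P\{I \mid S_i, d\} + P\{S_{i+1} \mid I^c, S_i, d\}\, P\{I^c \mid S_i, d\} \\
&= P\{S_{i+1}\}\, \theta(d).
\end{aligned}
$$
The first term vanishes because an informative SNP $i$ implies that the state does not change.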
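The distance-dependent transition probabilities can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names `theta` and `transition_matrix` are hypothetical, the unit of `d` is whatever unit the model uses (the slide does not state it), and the diagonal entries assume that an informative SNP leaves the state unchanged while an uninformative one draws the next state from the marginal probabilities `state_probs`.

```python
import math

def theta(d):
    """Probability that SNP i is not informative for SNP i+1 at distance d."""
    return 1.0 - math.exp(-2.0 * d)

def transition_matrix(d, state_probs):
    """Transition matrix between hidden states for two SNPs at distance d.

    Off-diagonal entries follow the slide: P{next state} * theta(d).
    Diagonal entries add the mass 1 - theta(d) for the informative case
    (assumption: an informative SNP implies no state change).
    """
    th = theta(d)
    n = len(state_probs)
    return [
        [th * state_probs[j] + ((1.0 - th) if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical example: four equally likely states, distance d = 0.5.
matrix = transition_matrix(0.5, [0.25, 0.25, 0.25, 0.25])
for row in matrix:
    print([round(p, 4) for p in row])
```

As `d` grows, `theta(d)` approaches 1 and every row approaches the marginal state probabilities; at `d = 0` the matrix is the identity.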
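19 Emission probabilities
We assume conditional independence between the copy number estimates and the genotype calls, given the hidden state. For example, for a hidden state $s$:
$$
f(\widehat{CN}, \widehat{GT} \mid s) = f(\widehat{CN} \mid s)\, f(\widehat{GT} \mid s) = \beta_s\{\widehat{CN}\}\, \beta_s\{\widehat{GT}\}.
$$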
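A minimal sketch of this factorization, assuming a Gaussian density for the copy number estimate and a two-category (HOM/HET) distribution for the genotype call. The slide specifies neither family; the state parameters in `STATES`, the standard deviation `CN_SD`, and the function names are hypothetical values chosen for illustration only.

```python
import math

# Hypothetical per-state parameters: mean copy number and P(call is HET | state).
STATES = {
    "deletion": {"cn_mean": 1.0, "p_het": 0.001},
    "normal": {"cn_mean": 2.0, "p_het": 0.3},
    "loh": {"cn_mean": 2.0, "p_het": 0.001},
    "amplification": {"cn_mean": 3.0, "p_het": 0.3},
}
CN_SD = 0.3  # assumed common standard deviation of the copy number estimate

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def emission(cn_hat, gt_hat, state):
    """Joint emission probability as a product of the two marginal terms."""
    params = STATES[state]
    f_cn = normal_pdf(cn_hat, params["cn_mean"], CN_SD)
    f_gt = params["p_het"] if gt_hat == "HET" else 1.0 - params["p_het"]
    return f_cn * f_gt

# A heterozygous call at copy number 2 supports "normal" far more than "loh".
for state in STATES:
    print(state, emission(2.0, "HET", state))
```

Because the two terms multiply, copy number alone cannot separate normal from LOH in this sketch; the genotype term does that work.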
20 More information
The confidence in genotype calls can differ substantially between SNPs!
[Figure: sense vs. antisense intensities with AA, AB, and BB calls; panels "Concordant" and "Discordant", the latter with a SNP called BB (AB)]
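21 Integrating confidence estimates for genotype calls
Let $S_{\widehat{GT}}$ be the confidence score for the genotype estimate. We can estimate from HapMap the following densities:
$$
f\{S_{\widehat{HOM}} \mid \widehat{HOM}, HOM\}, \quad
f\{S_{\widehat{HOM}} \mid \widehat{HOM}, HET\}, \quad
f\{S_{\widehat{HET}} \mid \widehat{HET}, HOM\}, \quad
f\{S_{\widehat{HET}} \mid \widehat{HET}, HET\}.
$$
Note (with $L$ denoting the loss state):
$$
\begin{aligned}
f\{S_{\widehat{HOM}} \mid \widehat{HOM}, L\} &= f\{S_{\widehat{HOM}} \mid \widehat{HOM}, HOM\}, \\
f\{S_{\widehat{HET}} \mid \widehat{HET}, L\} &= f\{S_{\widehat{HET}} \mid \widehat{HET}, HOM\}.
\end{aligned}
$$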
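22 Emission Probabilities - Loss
Recall that, for a hidden state $s$,
$$
f(\widehat{CN}, \widehat{GT} \mid s) = f(\widehat{CN} \mid s)\, f(\widehat{GT} \mid s) = \beta_s\{\widehat{CN}\}\, \beta_s\{\widehat{GT}\}.
$$
If the state for a particular SNP is Loss ($L$), we have
$$
\beta_L\{\widehat{GT}, S_{\widehat{GT}}\} = f\{\widehat{GT} \mid L\}\, f\{S_{\widehat{GT}} \mid \widehat{GT}, L\}.
$$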
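23 Emission Probabilities - Retention
For retention ($R$), the true genotype can be HET or HOM:
$$
\begin{aligned}
\beta_R\{\widehat{GT}, S_{\widehat{GT}}\}
&= f\{\widehat{GT} \mid R\}\, f\{S_{\widehat{GT}} \mid \widehat{GT}, R\} \\
&= f\{\widehat{GT} \mid R\} \left[ f\{S_{\widehat{GT}}, HOM \mid \widehat{GT}, R\} + f\{S_{\widehat{GT}}, HET \mid \widehat{GT}, R\} \right] \\
&= f\{\widehat{GT} \mid R\} \left[ f\{S_{\widehat{GT}} \mid HOM, \widehat{GT}, R\}\, f\{HOM \mid \widehat{GT}, R\} + f\{S_{\widehat{GT}} \mid HET, \widehat{GT}, R\}\, f\{HET \mid \widehat{GT}, R\} \right] \\
&= f\{\widehat{GT} \mid R\} \left[ f\{S_{\widehat{GT}} \mid HOM, \widehat{GT}\}\, f\{HOM \mid \widehat{GT}, R\} + f\{S_{\widehat{GT}} \mid HET, \widehat{GT}\}\, f\{HET \mid \widehat{GT}, R\} \right].
\end{aligned}
$$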
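The loss and retention emission probabilities for a genotype call with a confidence score can be sketched as below. Everything numeric here is an assumption for illustration: the slides estimate the score densities from HapMap, whereas this sketch substitutes simple Beta-shaped densities on [0, 1], and the tables `P_CALL_RETENTION`, `P_CALL_LOSS`, and `P_TRUTH_GIVEN_CALL` as well as all function names are hypothetical.

```python
def score_density(score, call, truth):
    """Assumed density of the confidence score given the call and the true genotype.

    Correct calls get a Beta(5, 1) shape (mass near 1); incorrect calls get
    a Beta(1, 5) shape (mass near 0). Stand-ins for HapMap-based estimates.
    """
    if call == truth:
        return 5.0 * score ** 4
    return 5.0 * (1.0 - score) ** 4

# Hypothetical probabilities of each call given the hidden state.
P_CALL_RETENTION = {"HOM": 0.7, "HET": 0.3}
P_CALL_LOSS = {"HOM": 0.999, "HET": 0.001}
# Hypothetical probabilities of the true genotype given the call, under retention.
P_TRUTH_GIVEN_CALL = {
    "HOM": {"HOM": 0.99, "HET": 0.01},
    "HET": {"HOM": 0.01, "HET": 0.99},
}

def beta_retention(call, score):
    """Retention: average the score density over the two possible true genotypes."""
    mixture = sum(
        score_density(score, call, truth) * P_TRUTH_GIVEN_CALL[call][truth]
        for truth in ("HOM", "HET")
    )
    return P_CALL_RETENTION[call] * mixture

def beta_loss(call, score):
    """Loss: the true genotype is homozygous, so use the HOM score density."""
    return P_CALL_LOSS[call] * score_density(score, call, "HOM")

# A confident HET call argues strongly against loss; an unconfident one much less so.
for score in (0.95, 0.05):
    print(score, beta_retention("HET", score) / beta_loss("HET", score))
```

The ratio printed in the last loop shows the intended effect of integrating confidence: a low-confidence heterozygous call inside a region of loss is down-weighted rather than treated as strong evidence for retention.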
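24 Copy number and genotype calls for chromosome [Figure: regions labeled A, B, C, D, E]
25 Vanilla and ICE HMMs for genotype and copy number [Figure: predicted states (Deletion, Normal, LOH, Amplification) from the Vanilla (Van) and ICE HMMs for regions A to E, by position (Mb)]
26 A HapMap sample [Figure: predicted states (Deletion, Normal, LOH, Amplification) from the Vanilla (Van) and ICE HMMs, by position (Mb)]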
27 Many HapMap samples
28 SNP Trio
29 HMM for SNP Trio [Figure: chromosome 10 and chromosome 22; predicted states BPI, UPI F, UPI M, MI D, MI S, non BPI, by position (Mb)]
30 De novo deletion
31 Homozygous and hemizygous deletions [Figure: panels GF, GM, M, A, B, S]
32 Homozygous and hemizygous deletions
GF = Grandfather, GM = Grandmother, M = Mother, A = Aunt, B = Brother, S = Sister
Genotype calls (NC = no call) with accompanying values, as transcribed (column alignment lost): AB 0.01 BB 0.05 AB AB AB 0.06 BB AA 0.27 NC AA AA AB 0.12 BB AB 0.15 NC AA AA AB BB AB NC AA AA AA 0.30 AA BB 0.20 NC NC BB BB 0.22 BB AB 0.03 NC BB BB AB 0.09 AA AB NC BB BB AB AA BB 0.01 NC BB 0.04 BB BB 0.14 NC AB 0.17 NC AA AA AA AA AB 0.01 NC AA AA AA 0.16 AA AB NC BB BB AB 0.13 AA AB 0.01 NC BB BB AB AA BB 0.16 NC BB BB BB 0.10 BB AB 0.06 NC AA AA AB 0.07 BB AA 0.17 NC AA AA AB BB BB 0.02 NC BB BB AB 0.13 AA AA 0.21 NC AA AA AB 0.20 BB BB 0.05 BB 0.11 BB 0.15 BB 0.29 AB 0.13 AB -0.01
33 References
- Carvalho B, Bengtsson H, Speed TP, Irizarry RA (2007). Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Biostatistics, 8(2).
- Scharpf RB, Ting JC, Pevsner J, Ruczinski I (2007). SNPchip: R classes and methods for SNP array data. Bioinformatics, 23(5).
- Scharpf RB, Parmigiani G, Pevsner J, Ruczinski I (2008). Hidden Markov models for the assessment of chromosomal alterations using high-throughput SNP arrays. Annals of Applied Statistics, 2(2).
- Ting J, Ye Y, Thomas G, Ruczinski I, Pevsner J (2006). Analysis and visualization of chromosomal abnormalities in SNP data with SNPscan. BMC Bioinformatics, 7(1):25.
- Ting J, Roberson E, Miller N, Lysholm A, Stephan D, Capone G, Ruczinski I, Thomas G, Pevsner J (2007). Visualization of uniparental inheritance, Mendelian inconsistencies, deletions and parent of origin effects in single nucleotide polymorphism trio data with SNPtrio. Human Mutation, 28(12).
34 iruczins/