Computational Approaches to Statistical Genetics
Computational Approaches to Statistical Genetics
GWAS I: Concepts and Probability Theory
Christoph Lippert, Dr. Oliver Stegle, Prof. Dr. Karsten Borgwardt
Max Planck Institutes Tübingen, Germany. Summer 2011
Overview
Genome-wide association studies
Given:
Genotypes for multiple individuals, e.g. single nucleotide polymorphisms (SNPs), microsatellite markers, ...
Phenotypes for the same individuals, e.g. disease, height, gene expression, ...
Goal:
Find genetic markers that explain the variance in the phenotype.
[The slide shows a page from a review article on genome-wide association studies. The legible figure caption: GWA mapping is ineffective if there is strong genetic differentiation between subpopulations, that is, if the population is structured. Background markers that are mostly exclusive to one subpopulation become strongly associated with the trait even though they are not causal; crossing the subpopulations to generate an F2 or recombinant inbred population breaks up the linkage disequilibrium between background and causal markers, so the causal loci can then be mapped, albeit with relatively poor resolution.]
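The stated goal, finding markers that explain phenotypic variance, can be sketched as a naive per-SNP association scan. This is a minimal illustration, not the method used later in the course; the function name and toy data are invented for this example, and the score is simply the squared correlation between each genotype column and the phenotype.

```python
import numpy as np

def association_scan(X, y):
    """Per-SNP association scan: squared correlation between each
    genotype column of X (N individuals x S SNPs, coded 0/1/2)
    and the phenotype vector y (length N)."""
    Xc = X - X.mean(axis=0)          # center each genotype column
    yc = y - y.mean()                # center the phenotype
    num = (Xc * yc[:, None]).sum(axis=0) ** 2
    den = (Xc ** 2).sum(axis=0) * (yc ** 2).sum()
    return num / den                 # r^2 of each SNP with y

# toy data: SNP 0 drives the phenotype, SNP 1 is background noise
X = np.array([[0, 1], [1, 0], [2, 1], [2, 0], [0, 1], [1, 0]], dtype=float)
y = np.array([0.1, 1.0, 2.1, 1.9, 0.0, 1.1])
scores = association_scan(X, y)      # scores[0] is near 1, scores[1] is small
```

As the figure caption above warns, such a scan is confounded when background markers track subpopulation membership rather than the causal locus.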
Overview
Some definitions
Genotype denotes the genetic state of an individual; usually denoted x_n for individual n.
Phenotype denotes the state of a trait of an individual; usually denoted y_n for individual n.
A locus is a position or limited region in the genome; usually denoted x_s for locus (or SNP) s.
An allele is the genetic state of a locus.
Overview
More definitions
An organism/cell is haploid if it has only one chromosome set, or identical chromosome sets; e.g. sperm cells, or inbred lab strains of A. thaliana.
An organism/cell is diploid if it has two separately inherited homologous chromosome sets; e.g. humans.
An organism/cell is polyploid if it has more than two homologous chromosome sets; e.g. sugar cane is hexaploid.
Overview
Even more definitions
Haplotype denotes an individual's state of a single set of chromosomes (paternal or maternal).
A locus is heterozygous if it differs between the paternal and maternal haplotypes; a heterozygous genotype is usually encoded as 1.
A locus is homozygous if it matches between the paternal and maternal haplotypes; the homozygous major allele is usually encoded as 0, the homozygous minor allele as 2.
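The 0/1/2 encoding above can be computed directly by counting how many of the two haplotypes carry the minor allele. A minimal sketch (the function name is invented for this example):

```python
def encode_genotype(paternal, maternal, minor_allele):
    """Encode a diploid locus from its two haplotype alleles:
    0 = homozygous major, 1 = heterozygous, 2 = homozygous minor."""
    # booleans are summed as integers, giving the minor-allele count
    return (paternal == minor_allele) + (maternal == minor_allele)

assert encode_genotype('A', 'A', minor_allele='T') == 0  # homozygous major
assert encode_genotype('A', 'T', minor_allele='T') == 1  # heterozygous
assert encode_genotype('T', 'T', minor_allele='T') == 2  # homozygous minor
```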
Overview
Association
In statistics, association is any relationship between two measured quantities that renders them statistically dependent (Oxford Dictionary of Statistics). Related terms: correlation, statistical dependence.
Direct association can be beneficial, e.g. linkage.
Indirect association can be harmful, e.g. population structure.
Overview
Linkage Disequilibrium (Gametic Phase Disequilibrium)
Association between two loci: a deviation from random co-inheritance of the alleles at the two loci.
LD can be caused by physical linkage (limited recombination), population structure, or epistasis.
Measures of LD between two loci x_1 and x_2 are D and r^2, computed from the haplotype frequency table

            x_2 = A_2   x_2 = B_2
x_1 = A_1   f_AA        f_AB        f_A.
x_1 = B_1   f_BA        f_BB        f_B.
            f_.A        f_.B

D = f_AA - f_A. f_.A

r^2 = D^2 / (f_A. f_B. f_.A f_.B)

D != 0 and r^2 != 0 are indicators of LD.
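The definitions of D and r^2 above translate directly into code. A minimal sketch (the function name is invented; the marginals follow the contingency table, with f_A. = f_AA + f_AB and f_.A = f_AA + f_BA):

```python
def ld_measures(f_AA, f_AB, f_BA, f_BB):
    """D and r^2 from the four haplotype frequencies of loci x1, x2
    (the four frequencies must sum to 1)."""
    fA_ = f_AA + f_AB   # marginal P(x1 = A1)
    fB_ = f_BA + f_BB   # marginal P(x1 = B1)
    f_A = f_AA + f_BA   # marginal P(x2 = A2)
    f_B = f_AB + f_BB   # marginal P(x2 = B2)
    D = f_AA - fA_ * f_A
    r2 = D ** 2 / (fA_ * fB_ * f_A * f_B)
    return D, r2

# linkage equilibrium: every cell is the product of its marginals
D, r2 = ld_measures(0.25, 0.25, 0.25, 0.25)   # D = 0, r^2 = 0
# complete LD: only the AA and BB haplotypes occur
D, r2 = ld_measures(0.5, 0.0, 0.0, 0.5)       # D = 0.25, r^2 = 1
```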
Overview
Linkage Disequilibrium (LD): Physical LD
Physical linkage (limited recombination between nearby loci) causes LD; recombination breaks it down.
LD is therefore not uniform along the chromosome.
Recombination hotspots on the chromosome lead to conserved haplotype blocks in strong LD.
Physical LD can be used to choose tag SNPs that cover all linked regions.
Tradeoff between resolution and genotyping cost.
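Choosing tag SNPs from LD structure can be sketched as a greedy filter: keep a SNP only if it is not already in strong LD with a previously chosen tag. This is a simplified illustration, not the algorithm used by actual tagging tools; the function name, threshold, and toy data are invented for this example.

```python
import numpy as np

def greedy_tag_snps(X, r2_threshold=0.8):
    """Greedy tag-SNP selection sketch.
    X: N x S genotype matrix, coded 0/1/2.
    Keeps SNP s unless r^2 >= threshold with an already chosen tag."""
    r2 = np.corrcoef(X.T) ** 2        # pairwise squared correlations
    tags = []
    for s in range(X.shape[1]):
        if all(r2[s, t] < r2_threshold for t in tags):
            tags.append(s)
    return tags

# toy data: SNP 1 is a perfect copy of SNP 0, SNP 2 is only weakly linked
X = np.array([[0, 0, 0],
              [1, 1, 0],
              [2, 2, 1],
              [0, 0, 1],
              [1, 1, 2],
              [2, 2, 2]], dtype=float)
tags = greedy_tag_snps(X)   # SNP 1 is tagged by SNP 0, so only 0 and 2 remain
```

The threshold makes the resolution/cost tradeoff explicit: lowering it keeps fewer tags (cheaper genotyping, coarser coverage), raising it keeps more.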
Outline
Overview
Motivation
Prerequisites
Probability Theory
Parameter Inference for the Gaussian
Summary
Motivation
Why probabilistic modeling?
Inferences from data are intrinsically uncertain.
Probability theory: model uncertainty instead of ignoring it!
Applications are not limited to statistical genetics: machine learning, data mining, pattern recognition, etc.
Goal of this part of the course: an overview of probabilistic modeling and its key concepts, with a focus on applications in statistical genetics.
Motivation
Further reading, useful material
Christopher M. Bishop: Pattern Recognition and Machine Learning. Good background; covers most of the machine learning used in this course, and much more. Substantial parts of this tutorial borrow figures and ideas from this book.
David J. C. MacKay: Information Theory, Inference, and Learning Algorithms. Very worthwhile reading, though it overlaps less closely with the lecture synopsis. Freely available online.
Motivation
Lecture overview
1. An introduction to probabilistic modeling
2. Linear models
3. Hypothesis testing
4. Principal Components Analysis
5. Linear Mixed Models
Outline
Overview
Motivation
Prerequisites
Probability Theory
Parameter Inference for the Gaussian
Summary
Prerequisites
Key concepts: Data
Let D denote a dataset consisting of N datapoints, D = {x_n, y_n}_{n=1}^N, with inputs x_n and outputs y_n.
Typical in this course:
x = {x_1, ..., x_S} is multivariate, spanning S features for each observation (SNPs, markers, etc.).
y is univariate (phenotype, disease status, expression level, etc.).
Notation: scalars are printed as y; vectors in bold: x; matrices in capital bold: Σ.
Prerequisites: Key concepts. Predictions

Observed dataset $\mathcal{D} = \{\underbrace{\mathbf{x}_n}_{\text{inputs}},\ \underbrace{y_n}_{\text{outputs}}\}_{n=1}^{N}$.
Given $\mathcal{D}$, what can we say about $y^{*}$ at an unseen test input $\mathbf{x}^{*}$?
Prerequisites: Key concepts. Model

Given $\mathcal{D}$, what can we say about $y^{*}$ at an unseen test input $\mathbf{x}^{*}$?
To make predictions we need to make assumptions. A model $\mathcal{H}$ encodes these assumptions and often depends on some parameters $\boldsymbol{\theta}$.
Curve fitting: the model relates $x$ to $y$, for example the linear model
$$y = f(x \mid \boldsymbol{\theta}) = \theta_0 + \theta_1 x.$$
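As a minimal sketch of this curve-fitting setup (the toy data and function name are illustrative, not from the lecture), the linear model above can be fit by least squares and then used to predict at an unseen input $x^{*}$:

```python
# Least-squares fit of the linear model y = theta0 + theta1 * x
# on a small synthetic dataset (toy numbers for illustration only).

def fit_linear(xs, ys):
    """Return (theta0, theta1) minimizing the squared prediction error."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    theta1 = sxy / sxx
    theta0 = my - theta1 * mx
    return theta0, theta1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # exactly y = 1 + 2x

theta0, theta1 = fit_linear(xs, ys)

# Predict at an unseen test input x*
x_star = 4.0
y_star = theta0 + theta1 * x_star
```

On this noiseless toy data the fit recovers the generating parameters exactly; with noisy data the same formulas give the maximum-likelihood estimates under a Gaussian noise model, which the later slides make precise.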
Prerequisites: Key concepts. Uncertainty

Virtually all steps involve uncertainty:
measurement uncertainty ($\mathcal{D}$),
parameter uncertainty ($\boldsymbol{\theta}$),
uncertainty regarding the correct model ($\mathcal{H}$).
Uncertainty can occur in both inputs and outputs. How can we represent uncertainty?
Probability Theory: Outline

Overview, Motivation, Prerequisites, Probability Theory, Parameter Inference for the Gaussian, Summary
Probability Theory: Probabilities

Let $X$ be a random variable, defined over a set $\mathcal{X}$ or measurable space.
$P(X = x)$ denotes the probability that $X$ takes the value $x$, in shorthand $p(x)$.
Probabilities are non-negative: $P(X = x) \geq 0$.
Probabilities sum (integrate) to one: $\sum_{x \in \mathcal{X}} p(x) = 1$ for discrete $X$, and $\int_{\mathcal{X}} p(x)\,dx = 1$ for continuous $X$.
Special case, no uncertainty: $p(x) = \delta(x - \hat{x})$.
Probability Theory

Consider $N$ draws of two variables, where $c_i$ counts the draws with $X = x_i$ and $n_{i,j}$ counts the draws with $X = x_i$ and $Y = y_j$.

Marginal probability: $P(X = x_i) = \dfrac{c_i}{N}$
Joint probability: $P(X = x_i, Y = y_j) = \dfrac{n_{i,j}}{N}$
Conditional probability: $P(Y = y_j \mid X = x_i) = \dfrac{n_{i,j}}{c_i}$

Sum rule:
$$P(X = x_i) = \frac{c_i}{N} = \frac{1}{N}\sum_{j=1}^{L} n_{i,j} = \sum_{j} P(X = x_i, Y = y_j)$$

Product rule:
$$P(X = x_i, Y = y_j) = \frac{n_{i,j}}{N} = \frac{n_{i,j}}{c_i}\,\frac{c_i}{N} = P(Y = y_j \mid X = x_i)\,P(X = x_i)$$

(C.M. Bishop, Pattern Recognition and Machine Learning)
The Rules of Probability: Sum & Product Rule

Sum rule: $p(x) = \sum_{y} p(x, y)$
Product rule: $p(x, y) = p(y \mid x)\,p(x)$
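The sum and product rules can be checked directly on a small count table like the one used above (the numbers in `n` are made up for illustration):

```python
# Empirical check of the sum and product rules on a small count table.
# n[i][j] = number of observations with X = x_i and Y = y_j (toy numbers).

n = [[3, 1],
     [2, 4]]
N = sum(sum(row) for row in n)               # total number of draws

def joint(i, j):
    """p(x_i, y_j) = n_ij / N"""
    return n[i][j] / N

def marginal_x(i):
    """Sum rule: p(x_i) = sum_j p(x_i, y_j)"""
    return sum(joint(i, j) for j in range(2))

def cond_y_given_x(j, i):
    """p(y_j | x_i) = n_ij / c_i"""
    return n[i][j] / sum(n[i])

# Product rule: p(x, y) = p(y | x) p(x), for every cell of the table
for i in range(2):
    for j in range(2):
        assert abs(joint(i, j) - cond_y_given_x(j, i) * marginal_x(i)) < 1e-12
```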
The Rules of Probability: Bayes' Theorem

Using the product rule we obtain
$$p(y \mid x) = \frac{p(x \mid y)\,p(y)}{p(x)}, \qquad p(x) = \sum_{y} p(x \mid y)\,p(y).$$
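A classic worked instance of Bayes' theorem is diagnostic testing (the probabilities below are invented for illustration): even a fairly accurate test yields a modest posterior when the prior is small.

```python
# Bayes' theorem on a diagnostic-test example (numbers are made up):
# y in {disease, healthy}, observation x = a positive test result.

p_y = {"disease": 0.01, "healthy": 0.99}          # prior p(y)
p_x_given_y = {"disease": 0.95, "healthy": 0.05}  # likelihood p(x = positive | y)

# Evidence via the sum rule: p(x) = sum_y p(x | y) p(y)
p_x = sum(p_x_given_y[y] * p_y[y] for y in p_y)

# Posterior: p(y | x) = p(x | y) p(y) / p(x)
posterior = {y: p_x_given_y[y] * p_y[y] / p_x for y in p_y}
# posterior["disease"] is only about 0.16 despite the 95% test sensitivity
```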
Probability Theory: Bayesian probability calculus

Bayes' rule is the basis for inference and learning. Assume we have a model with parameters $\boldsymbol{\theta}$, e.g. $y = \theta_0 + \theta_1 x$.
Goal: learn the parameters $\boldsymbol{\theta}$ given the data $\mathcal{D}$.
$$\underbrace{p(\boldsymbol{\theta} \mid \mathcal{D})}_{\text{posterior}} = \frac{\overbrace{p(\mathcal{D} \mid \boldsymbol{\theta})}^{\text{likelihood}}\ \overbrace{p(\boldsymbol{\theta})}^{\text{prior}}}{p(\mathcal{D})}$$
Probability Theory: Information and Entropy

Information is the reduction of uncertainty. The entropy $H(X)$ is a quantitative description of uncertainty:
$H(X) = 0$: certainty about $X$.
$H(X)$ is maximal if all outcomes are equally probable.
Uncertainty and information are additive.
These conditions are fulfilled by the entropy function
$$H(X) = -\sum_{x \in \mathcal{X}} P(X = x) \log P(X = x).$$
Probability Theory: Definitions related to entropy and information

Entropy is the average surprise:
$$H(X) = \sum_{x \in \mathcal{X}} P(X = x)\,\underbrace{(-\log P(X = x))}_{\text{surprise}}$$

Conditional entropy:
$$H(X \mid Y) = -\sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} P(X = x, Y = y) \log P(X = x \mid Y = y)$$

Mutual information:
$$I(X : Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X) = H(X) + H(Y) - H(X, Y)$$

$I(X : Y) = 0$ under independence of $X$ and $Y$: $p(x, y) = p(x)\,p(y)$.
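These definitions translate directly into code. The sketch below (toy joint distribution, my own numbers) computes entropies in bits and the mutual information via $I(X:Y) = H(X) + H(Y) - H(X,Y)$, which is zero for the independent case chosen here:

```python
import math

# Entropy and mutual information for a discrete joint distribution
# p[i][j] = P(X = x_i, Y = y_j). Toy numbers: X and Y independent, uniform.

p = [[0.25, 0.25],
     [0.25, 0.25]]

def H(dist):
    """Entropy in bits of a list of probabilities (0 log 0 := 0)."""
    return -sum(q * math.log2(q) for q in dist if q > 0)

px = [sum(row) for row in p]                            # marginal of X
py = [sum(p[i][j] for i in range(2)) for j in range(2)] # marginal of Y
pxy = [q for row in p for q in row]                     # flattened joint

# I(X;Y) = H(X) + H(Y) - H(X,Y); zero iff p(x,y) = p(x) p(y)
mi = H(px) + H(py) - H(pxy)
```

Replacing `p` with a correlated table (e.g. `[[0.4, 0.1], [0.1, 0.4]]`) makes `mi` strictly positive.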
Probability Theory: Entropy in action. The optimal weighing problem

Given 12 balls, all equal except for one that is lighter or heavier. What is the ideal weighing strategy, and how many weighings are needed to identify the odd ball?
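The entropy view gives the lower bound for this puzzle (the reasoning below is the standard information-theoretic argument, not spelled out on the slide): there are $12 \times 2 = 24$ hypotheses, and each weighing has at most 3 outcomes, so $k$ weighings can distinguish at most $3^k$ hypotheses.

```python
import math

# Information-theoretic lower bound for the 12-ball problem:
# 12 * 2 = 24 hypotheses (which ball, and whether it is light or heavy);
# each weighing has 3 outcomes (left heavier, right heavier, balance).

hypotheses = 24
outcomes_per_weighing = 3

# k weighings distinguish at most 3**k hypotheses, so we need
# k >= log3(24) ~ 2.89, i.e. at least 3 weighings.
k = math.ceil(math.log(hypotheses) / math.log(outcomes_per_weighing))
```

Three weighings also suffice in practice, with a strategy whose first weighing puts 4 balls on each pan so that all three outcomes are roughly equally probable, i.e. each weighing extracts close to the maximal $\log_2 3$ bits.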
Probability Theory: Probability distributions

Gaussian:
$$p(x \mid \mu, \sigma^2) = \mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}$$

Multivariate Gaussian:
$$p(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{\sqrt{|2\pi\boldsymbol{\Sigma}|}} \exp\left[-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})\right]$$
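The univariate density is one line of code, and its normalization can be sanity-checked with a crude grid integration (a rough sketch; a proper check would use quadrature):

```python
import math

def gauss_pdf(x, mu, sigma2):
    """Univariate Gaussian density N(x | mu, sigma^2)."""
    return math.exp(-0.5 * (x - mu) ** 2 / sigma2) / math.sqrt(2 * math.pi * sigma2)

# Crude numerical check of normalization: Riemann sum on [-8, 8].
mu, sigma2 = 0.0, 1.0
dx = 0.001
total = sum(gauss_pdf(-8.0 + i * dx, mu, sigma2) * dx for i in range(16000))
# total is ~1 up to discretization and truncation error
```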
Probability Theory: Probability distributions continued

Bernoulli ($x \in \{0, 1\}$):
$$p(x \mid \theta) = \theta^{x}(1-\theta)^{1-x}$$

Gamma ($x > 0$):
$$p(x \mid a, b) = \frac{b^{a}}{\Gamma(a)}\, x^{a-1} e^{-bx}$$
Probability Theory: Probability distributions. The Gaussian revisited

Gaussian PDF:
$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}$$

Positive: $\mathcal{N}(x \mid \mu, \sigma^2) > 0$.
Normalized: $\int_{-\infty}^{+\infty} \mathcal{N}(x \mid \mu, \sigma^2)\,dx = 1$ (check).
Expectation: $\langle x \rangle = \int_{-\infty}^{+\infty} \mathcal{N}(x \mid \mu, \sigma^2)\, x\, dx = \mu$.
Variance: $\mathrm{Var}[x] = \langle x^2 \rangle - \langle x \rangle^2 = \mu^2 + \sigma^2 - \mu^2 = \sigma^2$.
Parameter Inference for the Gaussian: Outline

Overview, Motivation, Prerequisites, Probability Theory, Parameter Inference for the Gaussian, Summary
Parameter Inference for the Gaussian: Ingredients

Data: $\mathcal{D} = \{x_1, \dots, x_N\}$.
Model $\mathcal{H}_{\text{Gauss}}$ with Gaussian PDF
$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}, \qquad \boldsymbol{\theta} = \{\mu, \sigma^2\}.$$
Likelihood:
$$p(\mathcal{D} \mid \boldsymbol{\theta}) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \sigma^2)$$
Inference for the Gaussian: Maximum likelihood

Likelihood:
$$p(\mathcal{D} \mid \boldsymbol{\theta}) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \sigma^2)$$
Maximum likelihood:
$$\hat{\boldsymbol{\theta}} = \operatorname*{argmax}_{\boldsymbol{\theta}}\, p(\mathcal{D} \mid \boldsymbol{\theta})$$
(C.M. Bishop, Pattern Recognition and Machine Learning)
Inference for the Gaussian: Maximum likelihood derivation

$$\hat{\boldsymbol{\theta}} = \operatorname*{argmax}_{\boldsymbol{\theta}} \prod_{n=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x_n-\mu)^2} = \operatorname*{argmax}_{\boldsymbol{\theta}}\, \ln p(\mathcal{D} \mid \boldsymbol{\theta})$$
since the logarithm is monotonic. Hence
$$\hat{\boldsymbol{\theta}} = \operatorname*{argmax}_{\boldsymbol{\theta}} \left[ -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{n=1}^{N}(x_n - \mu)^2 \right]$$
Setting the derivatives to zero:
$$\hat{\mu}:\ \frac{d}{d\mu}\ln p(\mathcal{D} \mid \mu) = 0, \qquad \hat{\sigma}^2:\ \frac{d}{d\sigma^2}\ln p(\mathcal{D} \mid \sigma^2) = 0$$
Inference for the Gaussian: Maximum likelihood solutions

$$\mu_{\text{ML}} = \frac{1}{N}\sum_{n=1}^{N} x_n, \qquad \sigma^2_{\text{ML}} = \frac{1}{N}\sum_{n=1}^{N}(x_n - \mu_{\text{ML}})^2$$

Equivalent to the common mean and variance estimators (almost: the ML variance divides by $N$ rather than the unbiased $N-1$).
Maximum likelihood ignores parameter uncertainty. Think of the ML solution for a single observed datapoint $x_1$:
$$\mu_{\text{ML}} = x_1, \qquad \sigma^2_{\text{ML}} = (x_1 - \mu_{\text{ML}})^2 = 0$$
How about Bayesian inference?
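A quick sketch of the closed-form ML solutions on simulated data (the sample is drawn here purely for illustration), including the degenerate single-datapoint case the slide warns about:

```python
import random

# Maximum-likelihood estimates for a Gaussian sample, matching the
# closed-form solutions mu_ML and sigma^2_ML above (toy simulation).

random.seed(0)
xs = [random.gauss(2.0, 1.5) for _ in range(10000)]   # true mu=2, sigma=1.5

N = len(xs)
mu_ml = sum(xs) / N                                   # sample mean
sigma2_ml = sum((x - mu_ml) ** 2 for x in xs) / N     # 1/N ("biased") variance

# With a single observed datapoint the variance estimate collapses to zero,
# illustrating that ML ignores parameter uncertainty:
x1 = 3.7
mu_1 = x1
sigma2_1 = (x1 - mu_1) ** 2                           # exactly 0.0
```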
Bayesian Inference for the Gaussian: Ingredients

Data: $\mathcal{D} = \{x_1, \dots, x_N\}$.
Model $\mathcal{H}_{\text{Gauss}}$ with Gaussian PDF
$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}, \qquad \boldsymbol{\theta} = \{\mu\}.$$
For simplicity, assume the variance $\sigma^2$ is known.
Likelihood:
$$p(\mathcal{D} \mid \mu) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \sigma^2)$$
Bayesian Inference for the Gaussian: Bayes' rule

Combine the likelihood with a Gaussian prior over $\mu$:
$$p(\mu) = \mathcal{N}(\mu \mid m_0, s_0^2)$$
The posterior is proportional to
$$p(\mu \mid \mathcal{D}, \sigma^2) \propto p(\mathcal{D} \mid \mu, \sigma^2)\, p(\mu)$$
Bayesian Inference for the Gaussian

$$p(\mu \mid \mathcal{D}, \sigma^2) \propto p(\mathcal{D} \mid \mu)\, p(\mu) = \left[\prod_{n=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x_n-\mu)^2}\right] \frac{1}{\sqrt{2\pi s_0^2}}\, e^{-\frac{1}{2s_0^2}(\mu - m_0)^2}$$

Expanding $-\frac{1}{2s_0^2}(\mu^2 - 2\mu m_0 + m_0^2) - \frac{1}{2\sigma^2}\sum_{n=1}^{N}(\mu^2 - 2\mu x_n + x_n^2)$ and collecting the terms in $\mu$:
$$\propto \exp\left[-\frac{1}{2}\underbrace{\left(\frac{1}{s_0^2} + \frac{N}{\sigma^2}\right)}_{1/\hat{\sigma}^2}\left(\mu^2 - 2\mu\,\underbrace{\hat{\sigma}^2\left(\frac{m_0}{s_0^2} + \frac{1}{\sigma^2}\sum_{n=1}^{N} x_n\right)}_{\hat{\mu}}\right) + C\right]$$

The posterior parameters $\hat{\mu}$ and $\hat{\sigma}^2$ follow as the new coefficients.
Note: all the constants dropped on the way yield the model evidence,
$$p(\mu \mid \mathcal{D}, \sigma^2) = \frac{p(\mathcal{D} \mid \mu)\, p(\mu)}{Z}.$$
Bayesian Inference for the Gaussian: Posterior of the mean

$p(\mu \mid \mathcal{D}, \sigma^2) = \mathcal{N}(\mu \mid \hat{\mu}, \hat{\sigma}^2)$, where after some rewriting
$$\hat{\mu} = \frac{\sigma^2}{N s_0^2 + \sigma^2}\, m_0 + \frac{N s_0^2}{N s_0^2 + \sigma^2}\, \mu_{\text{ML}}, \qquad \frac{1}{\hat{\sigma}^2} = \frac{1}{s_0^2} + \frac{N}{\sigma^2}, \qquad \mu_{\text{ML}} = \frac{1}{N}\sum_{n=1}^{N} x_n$$

Limiting cases for no and infinite amounts of data:
$N = 0$: $\hat{\mu} \to m_0$, $\hat{\sigma}^2 \to s_0^2$.
$N \to \infty$: $\hat{\mu} \to \mu_{\text{ML}}$, $\hat{\sigma}^2 \to 0$.

Christoph Lippert, GWAS I: Concepts and Probability Theory, Summer 2011
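The posterior update and both limiting cases can be verified in a few lines (a sketch; the function name and toy values are my own):

```python
# Posterior over the Gaussian mean mu with known variance sigma^2 and a
# Gaussian prior N(mu | m0, s0^2), following the closed-form update above.

def posterior_mu(xs, sigma2, m0, s02):
    """Return (mu_hat, sigma_hat^2) of the Gaussian posterior over mu."""
    N = len(xs)
    mu_ml = sum(xs) / N if N > 0 else 0.0
    prec_hat = 1.0 / s02 + N / sigma2                     # posterior precision
    mu_hat = (sigma2 * m0 + N * s02 * mu_ml) / (N * s02 + sigma2)
    return mu_hat, 1.0 / prec_hat

# No data: the posterior equals the prior.
mu0, s20 = posterior_mu([], sigma2=1.0, m0=0.5, s02=2.0)

# Lots of data: the posterior mean approaches mu_ML and the variance -> 0.
muN, s2N = posterior_mu([1.0] * 10000, sigma2=1.0, m0=0.5, s02=2.0)
```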
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationThe Expectation-Maximization Algorithm
1/29 EM & Latent Variable Models Gaussian Mixture Models EM Theory The Expectation-Maximization Algorithm Mihaela van der Schaar Department of Engineering Science University of Oxford MLE for Latent Variable
More informationStatistical and Learning Techniques in Computer Vision Lecture 2: Maximum Likelihood and Bayesian Estimation Jens Rittscher and Chuck Stewart
Statistical and Learning Techniques in Computer Vision Lecture 2: Maximum Likelihood and Bayesian Estimation Jens Rittscher and Chuck Stewart 1 Motivation and Problem In Lecture 1 we briefly saw how histograms
More informationLecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017
Lecture 2: Genetic Association Testing with Quantitative Traits Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 29 Introduction to Quantitative Trait Mapping
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 20: Epistasis and Alternative Tests in GWAS Jason Mezey jgm45@cornell.edu April 16, 2016 (Th) 8:40-9:55 None Announcements Summary
More informationLecture 5. Gaussian Models - Part 1. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. November 29, 2016
Lecture 5 Gaussian Models - Part 1 Luigi Freda ALCOR Lab DIAG University of Rome La Sapienza November 29, 2016 Luigi Freda ( La Sapienza University) Lecture 5 November 29, 2016 1 / 42 Outline 1 Basics
More informationTheoretical and computational aspects of association tests: application in case-control genome-wide association studies.
Theoretical and computational aspects of association tests: application in case-control genome-wide association studies Mathieu Emily November 18, 2014 Caen mathieu.emily@agrocampus-ouest.fr - Agrocampus
More informationComputer Vision Group Prof. Daniel Cremers. 2. Regression (cont.)
Prof. Daniel Cremers 2. Regression (cont.) Regression with MLE (Rep.) Assume that y is affected by Gaussian noise : t = f(x, w)+ where Thus, we have p(t x, w, )=N (t; f(x, w), 2 ) 2 Maximum A-Posteriori
More informationMathematical statistics
October 1 st, 2018 Lecture 11: Sufficient statistic Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation
More informationDD Advanced Machine Learning
Modelling Carl Henrik {chek}@csc.kth.se Royal Institute of Technology November 4, 2015 Who do I think you are? Mathematically competent linear algebra multivariate calculus Ok programmers Able to extend
More informationChapter 13 Meiosis and Sexual Reproduction
Biology 110 Sec. 11 J. Greg Doheny Chapter 13 Meiosis and Sexual Reproduction Quiz Questions: 1. What word do you use to describe a chromosome or gene allele that we inherit from our Mother? From our Father?
More information2. Map genetic distance between markers
Chapter 5. Linkage Analysis Linkage is an important tool for the mapping of genetic loci and a method for mapping disease loci. With the availability of numerous DNA markers throughout the human genome,
More informationProbabilistic and Bayesian Machine Learning
Probabilistic and Bayesian Machine Learning Lecture 1: Introduction to Probabilistic Modelling Yee Whye Teh ywteh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London Why a
More informationLecture 1: Case-Control Association Testing. Summer Institute in Statistical Genetics 2015
Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2015 1 / 1 Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits.
More informationLearning gene regulatory networks Statistical methods for haplotype inference Part I
Learning gene regulatory networks Statistical methods for haplotype inference Part I Input: Measurement of mrn levels of all genes from microarray or rna sequencing Samples (e.g. 200 patients with lung
More informationLecture 1 October 9, 2013
Probabilistic Graphical Models Fall 2013 Lecture 1 October 9, 2013 Lecturer: Guillaume Obozinski Scribe: Huu Dien Khue Le, Robin Bénesse The web page of the course: http://www.di.ens.fr/~fbach/courses/fall2013/
More informationNPFL108 Bayesian inference. Introduction. Filip Jurčíček. Institute of Formal and Applied Linguistics Charles University in Prague Czech Republic
NPFL108 Bayesian inference Introduction Filip Jurčíček Institute of Formal and Applied Linguistics Charles University in Prague Czech Republic Home page: http://ufal.mff.cuni.cz/~jurcicek Version: 21/02/2014
More informationIntroduction to Machine Learning
Introduction to Machine Learning Bayesian Classification Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574
More informationLinkage and Linkage Disequilibrium
Linkage and Linkage Disequilibrium Summer Institute in Statistical Genetics 2014 Module 10 Topic 3 Linkage in a simple genetic cross Linkage In the early 1900 s Bateson and Punnet conducted genetic studies
More informationTutorial on Gaussian Processes and the Gaussian Process Latent Variable Model
Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model (& discussion on the GPLVM tech. report by Prof. N. Lawrence, 06) Andreas Damianou Department of Neuro- and Computer Science,
More informationLecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency
Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency Bruce Walsh lecture notes Introduction to Quantitative Genetics SISG, Seattle 16 18 July 2018 1 Outline Genetics of complex
More informationCase-Control Association Testing. Case-Control Association Testing
Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits. Technological advances have made it feasible to perform case-control association studies
More informationIntroduction to Probability and Statistics (Continued)
Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:
More informationLecture 2: Priors and Conjugacy
Lecture 2: Priors and Conjugacy Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de May 6, 2014 Some nice courses Fred A. Hamprecht (Heidelberg U.) https://www.youtube.com/watch?v=j66rrnzzkow Michael I.
More informationCausal Graphical Models in Systems Genetics
1 Causal Graphical Models in Systems Genetics 2013 Network Analysis Short Course - UCLA Human Genetics Elias Chaibub Neto and Brian S Yandell July 17, 2013 Motivation and basic concepts 2 3 Motivation
More informationIntroduction to Genetics
Introduction to Genetics We ve all heard of it, but What is genetics? Genetics: the study of gene structure and action and the patterns of inheritance of traits from parent to offspring. Ancient ideas
More informationCS-E3210 Machine Learning: Basic Principles
CS-E3210 Machine Learning: Basic Principles Lecture 4: Regression II slides by Markus Heinonen Department of Computer Science Aalto University, School of Science Autumn (Period I) 2017 1 / 61 Today s introduction
More informationObjectives. Announcements. Comparison of mitosis and meiosis
Announcements Colloquium sessions for which you can get credit posted on web site: Feb 20, 27 Mar 6, 13, 20 Apr 17, 24 May 15. Review study CD that came with text for lab this week (especially mitosis
More informationLecture 5: GPs and Streaming regression
Lecture 5: GPs and Streaming regression Gaussian Processes Information gain Confidence intervals COMP-652 and ECSE-608, Lecture 5 - September 19, 2017 1 Recall: Non-parametric regression Input space X
More informationPopulation Genetics: a tutorial
: a tutorial Institute for Science and Technology Austria ThRaSh 2014 provides the basic mathematical foundation of evolutionary theory allows a better understanding of experiments allows the development
More information1.5.1 ESTIMATION OF HAPLOTYPE FREQUENCIES:
.5. ESTIMATION OF HAPLOTYPE FREQUENCIES: Chapter - 8 For SNPs, alleles A j,b j at locus j there are 4 haplotypes: A A, A B, B A and B B frequencies q,q,q 3,q 4. Assume HWE at haplotype level. Only the
More informationAssociation Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5
Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative
More informationQuantitative Biology II Lecture 4: Variational Methods
10 th March 2015 Quantitative Biology II Lecture 4: Variational Methods Gurinder Singh Mickey Atwal Center for Quantitative Biology Cold Spring Harbor Laboratory Image credit: Mike West Summary Approximate
More informationB4 Estimation and Inference
B4 Estimation and Inference 6 Lectures Hilary Term 27 2 Tutorial Sheets A. Zisserman Overview Lectures 1 & 2: Introduction sensors, and basics of probability density functions for representing sensor error
More informationMultiple regression. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar
Multiple regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Multiple regression 1 / 36 Previous two lectures Linear and logistic
More informationProbability Theory for Machine Learning. Chris Cremer September 2015
Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares
More informationA Derivation of the EM Updates for Finding the Maximum Likelihood Parameter Estimates of the Student s t Distribution
A Derivation of the EM Updates for Finding the Maximum Likelihood Parameter Estimates of the Student s t Distribution Carl Scheffler First draft: September 008 Contents The Student s t Distribution The
More informationBS 50 Genetics and Genomics Week of Oct 3 Additional Practice Problems for Section. A/a ; B/B ; d/d X A/a ; b/b ; D/d
BS 50 Genetics and Genomics Week of Oct 3 Additional Practice Problems for Section 1. In the following cross, all genes are on separate chromosomes. A is dominant to a, B is dominant to b and D is dominant
More information