Variable Selection in Structured High-dimensional Covariate Spaces


1 Variable Selection in Structured High-dimensional Covariate Spaces. Fan Li (Department of Health Care Policy, Harvard University) and Nancy Zhang (Department of Statistics, Stanford University). May

3 Problem Formulation. Consider the standard linear regression problem:
Y = Xβ + ε, (1)
where Y is the n × 1 response, X is the n × p covariate matrix, and ε is an n × 1 vector of independent errors. Many problems in genomics fall into this scenario: 1. Known structure exists in the covariate space. 2. p is large (possibly > n), and we want a sparse estimate of β.

7 Example 1: Chromosome Copy Number Data. X_{i,j}: noisy measure of the copy number at location j in person i. Y_i: quantitative trait for person i. Scale: i in the tens or hundreds, j in the thousands. Example of X_{i,·}: at each location, we observe a noisy measurement of the underlying copy number.

8 Example 1: Chromosome Copy Number Pollack et al. (2002) breast cancer data set:

9 Expression of key oncogenes: ERBB2 has elevated expression in 30% of breast cancers. It is correlated with more aggressive cancer. What controls the expression of ERBB2?

12 Example 2: Biological Motif Analysis. Data: X_{i,w}: count of occurrences of word w upstream of gene i. Y_i: expression of gene i. Scale: i in the thousands, w in the thousands. Examples of w: ACGCGTT, ACGCGTG, TCGCGTA, TCGCGGA. The motifs for a single transcription factor tend to be clustered. More on this later...

14 Review: Structured Variable Selection. Lasso (L_1 penalty) type: 1. Fused lasso (Tibshirani et al., 2005): 1-d smoothing. 2. Group lasso (Yuan and Lin, 2006): variables added and dropped in groups. Bayesian variable selection (BVS) framework (next slide): 1. Gibbs sampler (George and McCulloch, 1993). 2. Many improvements since then... We apply BVS in structured settings to genomic analysis.

19 Review: Latent Variable Model. Define latent variables γ_i ∈ {0, 1}, i = 1, ..., p. Conditioned on γ_i:
β_i | γ_i ~ (1 − γ_i) N(0, τ_i²) + γ_i N(0, σ² c_i² τ_i²). (2)
Special case:
β_i | γ_i ~ (1 − γ_i) I_0 + γ_i N(0, ν_i²). (3)
Conjugate prior for the variance:
σ² | γ ~ IG(ν_γ / 2, ν_γ λ_γ / 2).
Likelihood for the observed data:
Y | β, X ~ N(Xβ, σ² I).
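As a concrete illustration of the mixture prior (2), the sketch below draws β conditioned on γ; the values of τ_i, c_i, and σ² are illustrative, not from the slides.

```python
import numpy as np

def draw_beta(gamma, tau, c, sigma2, rng):
    """Draw beta | gamma from the mixture prior (2): a tight spike
    N(0, tau_i^2) when gamma_i = 0, a diffuse slab N(0, sigma^2 c_i^2 tau_i^2)
    when gamma_i = 1."""
    sd = np.where(gamma == 1, np.sqrt(sigma2) * c * tau, tau)
    return rng.normal(0.0, sd)

rng = np.random.default_rng(0)
p = 10
gamma = rng.integers(0, 2, size=p)                 # latent inclusion indicators
beta = draw_beta(gamma, tau=np.full(p, 0.01),
                 c=np.full(p, 100.0), sigma2=1.0, rng=rng)
```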

21 First, a simple Markov prior. Transition matrix
P = [ p, 1 − p; 1 − q, q ].
Assume that γ_1 ~ π, where π = ( (1 − q) / (2 − p − q), (1 − p) / (2 − p − q) ) is the stationary distribution with respect to P. It is more interpretable to re-parameterize the prior as:
r = π_1 / π_0 = (1 − p) / (1 − q): prior probability ratio of inclusion of a variable;
w = q / (1 − p): fold change in the probability of inclusion of the next variable.
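A quick numerical check of these relations; the values of p and q are illustrative.

```python
import numpy as np

# Illustrative transition probabilities: p = P(stay excluded), q = P(stay included)
p_, q_ = 0.99, 0.5
P = np.array([[p_, 1 - p_],
              [1 - q_, q_]])
pi = np.array([1 - q_, 1 - p_]) / (2 - p_ - q_)
assert np.allclose(pi @ P, pi)     # pi is the stationary distribution of P

r = pi[1] / pi[0]                  # = (1 - p)/(1 - q), prior odds of inclusion
w = q_ / (1 - p_)                  # fold change in inclusion probability
```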

22 Two strategies for the Gibbs sampler. Auxiliary chain: f(γ | Y) is sampled from the auxiliary Markov chain β^0, σ^0, γ^0, β^1, σ^1, γ^1, ... Direct chain: f(γ | Y) is sampled directly from γ^0, γ^1, γ^2, ...

23 Auxiliary Chain. Posterior distribution of β:
β^j ~ N_p(A_{γ^{j−1}} X′Y, A_{γ^{j−1}}), (4)
where A_{γ^{j−1}} = [X′X + D^{−1}_{γ^{j−1}} R^{−1} D^{−1}_{γ^{j−1}}]^{−1}. The hierarchical structure of the model, γ_i → β_i → Y_i, implies
f(γ^j | Y, β^{j−1}, σ^{j−1}) = f(γ^j | β^{j−1}).
The posterior f(γ^j | β^{j−1}) is an inhomogeneous Markov chain. Posterior distribution of σ:
σ^j ~ IG( (n + ν_{γ^{j−1}}) / 2, (‖Y − Xβ^j‖² + ν_{γ^{j−1}} λ_{γ^{j−1}}) / 2 ).
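To make the auxiliary chain concrete, here is a minimal sketch of one sweep, assuming R = I in eq. (4) and the simple Markov prior of slide 21; it follows the slide's formulas rather than the authors' actual implementation.

```python
import numpy as np
from scipy.stats import norm

def auxiliary_sweep(X, Y, gamma, sigma2, tau, c, P, nu, lam, rng):
    """One sweep of the auxiliary chain beta -> gamma -> sigma^2, assuming
    R = I in eq. (4) and the simple Markov prior on gamma (slide 21).
    gamma is an integer 0/1 array; a sketch, not the authors' code."""
    n, p = X.shape
    # beta | gamma, sigma^2, Y, as in eq. (4), with D_gamma from prior (2)
    d = np.where(gamma == 1, np.sqrt(sigma2) * c * tau, tau)
    A = np.linalg.inv(X.T @ X + np.diag(1.0 / d ** 2))
    beta = rng.multivariate_normal(A @ (X.T @ Y), A)
    # gamma_i | beta_i, gamma_(-i): slab/spike density ratio times Markov prior terms
    for i in range(p):
        odds = (norm.pdf(beta[i], 0.0, np.sqrt(sigma2) * c[i] * tau[i])
                / norm.pdf(beta[i], 0.0, tau[i]))
        if i > 0:
            odds *= P[gamma[i - 1], 1] / P[gamma[i - 1], 0]
        if i < p - 1:
            odds *= P[1, gamma[i + 1]] / P[0, gamma[i + 1]]
        gamma[i] = rng.random() < odds / (1.0 + odds)
    # sigma^2 | beta, Y: inverse-gamma draw
    resid = Y - X @ beta
    sigma2 = 1.0 / rng.gamma((n + nu) / 2.0, 2.0 / (resid @ resid + nu * lam))
    return beta, gamma, sigma2
```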

25 Auxiliary Chain: Computational Notes. Two computationally intensive tasks: 1. inverting a p × p matrix to obtain A_γ; 2. computing a square root of A_γ. Traditionally both are done by Cholesky-type decompositions, at O(p³) computation per sweep. Key observation: very few γ_i change between consecutive sweeps. Key idea: 1. low-rank update of the matrix inverse (e.g., the Sherman-Morrison-Woodbury formula); 2. low-rank update of the Cholesky decomposition (e.g., algorithms C1-C4 in Gill et al. (1974)). Combining the two gives a fast update algorithm with O(l p²) computation per sweep, where l is the average number of changed γ_i per sweep.
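The rank-one case of the low-rank inverse update can be shown directly: flipping a single γ_i changes one diagonal entry of the prior precision, and Sherman-Morrison recovers the new inverse in O(p²). The values of i and delta below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 50
X = rng.normal(size=(100, p))
d2 = np.full(p, 1.0)                       # prior precisions on the diagonal
A_inv = np.linalg.inv(X.T @ X + np.diag(d2))

# Flipping gamma_i changes one diagonal entry of the prior precision by delta:
i, delta = 7, 99.0                         # illustrative values

# Sherman-Morrison: (A + delta e_i e_i')^{-1}
#   = A^{-1} - delta A^{-1} e_i e_i' A^{-1} / (1 + delta (A^{-1})_{ii})
col = A_inv[:, i]
A_inv_new = A_inv - delta * np.outer(col, col) / (1.0 + delta * A_inv[i, i])

d2[i] += delta
assert np.allclose(A_inv_new, np.linalg.inv(X.T @ X + np.diag(d2)))
```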

26 Direct Chain. Built on the special Gaussian mixture prior (3). γ_i | γ_(−i) is sampled based on:
P(γ_i = 1 | γ_(−i), Y) = P(γ_i = 1 | γ_(−i)) / [ P(γ_i = 1 | γ_(−i)) + BF · P(γ_i = 0 | γ_(−i)) ],
where BF is the Bayes factor: BF = P(Y | γ_i = 0, γ_(−i)) / P(Y | γ_i = 1, γ_(−i)). Collapsing over σ and β,
BF = ν_i · [Γ((n − n_i + 1 + ν)/2) / Γ((n − n_i + ν)/2)] · [|A_i|^{1/2} / |A_{(−i)}|^{1/2}] · [(Y′Y − Y′X_{I_i} A_i^{−1} X′_{I_i} Y + νλ)^{(n − n_i + ν)/2} / (Y′Y − Y′X_{I_{(−i)}} A_{(−i)}^{−1} X′_{I_{(−i)}} Y + νλ)^{(n − n_i + 1 + ν)/2}],
where I_i is the set of included variables with γ_i set to 1, n_i = |I_i|, and A_i = X′_{I_i} X_{I_i} + D^{−2}_{I_i} for the conjugate setup.

28 Direct Chain: Computational Notes. Two main computational tasks: 1. inverting the n_i × n_i matrix A_i (and the corresponding A_{(−i)}); 2. computing the determinants of A_i and A_{(−i)} (equivalent to a decomposition). Both are required for each γ_i in every sweep: O(n̄³ p) computation per sweep by standard Cholesky, where n̄ is the average model size. Further speed-ups: 1. A_i^{−1} is updated from A_{(−i)}^{−1} using block matrix inversion, at O(n_i²) computation. 2. The Cholesky factor A_i^{1/2} is updated from A_{(−i)}^{1/2} via a low-rank update, also at O(n_i²). 3. One of A_i and A_{(−i)} is always the same as A_{i−1} or A_{(−(i−1))}. Overall, the fast update algorithm is O(n̄² p). For sparse models, this becomes feasible for large p.
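For intuition, here is a naive (no fast updates) version of the direct-chain sweep, using a standard conjugate marginal likelihood as a stand-in for the collapsed form above; prior_incl stands for the prior conditional P(γ_i = 1 | γ_(−i)), e.g. from the Markov prior.

```python
import numpy as np

def log_marginal(Y, X, incl, d2=100.0, nu=1.0, lam=1.0):
    """log P(Y | gamma) up to a gamma-free constant, for Y ~ N(X_I beta_I, sigma^2 I),
    beta_I ~ N(0, sigma^2 d2 I), sigma^2 ~ IG(nu/2, nu*lam/2). A standard
    conjugate marginal used as a stand-in for the slides' collapsed form."""
    n = len(Y)
    if len(incl) == 0:
        q, logdetA, k = Y @ Y, 0.0, 0
    else:
        XI = X[:, incl]
        k = len(incl)
        A = XI.T @ XI + np.eye(k) / d2                 # A_I = X_I'X_I + D_I^{-2}
        q = Y @ Y - Y @ XI @ np.linalg.solve(A, XI.T @ Y)
        _, logdetA = np.linalg.slogdet(A)
    return (-0.5 * logdetA - 0.5 * k * np.log(d2)
            - 0.5 * (n + nu) * np.log(q + nu * lam))

def direct_sweep(Y, X, gamma, prior_incl, rng):
    """One sweep of the direct chain: resample each gamma_i from
    P(gamma_i = 1 | gamma_(-i), Y) via the Bayes factor BF = P(Y|0)/P(Y|1).
    prior_incl[i] stands for P(gamma_i = 1 | gamma_(-i))."""
    for i in range(X.shape[1]):
        gamma[i] = 1
        l1 = log_marginal(Y, X, np.flatnonzero(gamma))
        gamma[i] = 0
        l0 = log_marginal(Y, X, np.flatnonzero(gamma))
        bf = np.exp(l0 - l1)
        pr = prior_incl[i]
        gamma[i] = int(rng.random() < pr / (pr + bf * (1 - pr)))
    return gamma
```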

29 Simulation Study: Design. p = 200 predictors, 2 blocks of γ = 1. X i.i.d. N(0, 1), Y = Xβ_γ + N(0, σ_ε²). Monte Carlo iterations were run at various levels of the smoothing parameter w. Interested in low signal-to-noise situations: σ_ε² = 1, β = 0.8.
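A sketch of this data-generating design; n and the block positions are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 200                  # n is assumed; the slides fix p = 200
gamma = np.zeros(p, dtype=int)
gamma[40:50] = 1                 # two blocks of gamma = 1 (positions assumed)
gamma[120:130] = 1
beta = 0.8 * gamma               # low signal: beta = 0.8, sigma_eps^2 = 1
X = rng.normal(size=(n, p))
Y = X @ beta + rng.normal(size=n)
```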

30 Simulation Results: Structure in γ is used to obtain better estimates

31 Finding regulators of the ERBB2 gene. p = 6000, n = 41; the sparsity parameter r was adjusted to control the average model size. Random restarts, with 100,000 sweeps per round of Monte Carlo. For this data set there is not much difference between w = 1 and w = 10; however, a few low signals did jump out consistently for w = 10.

32 Finding regulators of the ERBB2 gene.
Gene: Notes
PTPRN2: Tumor suppressor gene, target of methylation in human cancers.
NR1H3, STAT1, MINK: Transcription factors that regulate cell growth; a sizeable body of data implicates these factors in oncogenesis of breast cancer.
MLN64: Co-regulated with ERBB2 (Alpy et al., 2003, Oncogene).
ERBB2: Locus for ERBB2.

34 Biological Motif Detection

39 Transcription Regulation. Transcription factors: 1. regulate gene expression by helping (or inhibiting) transcription initiation; 2. bind to DNA in a sequence-specific manner; 3. play an important part in the much larger picture of expression regulation. Ultimate goal: learn the grammar of transcription regulation.

40 Data description. For each gene g: promoter sequence S_g and expression Y_g.

42 Regression Model. Bussemaker, Li, and Siggia (2001):
Y_g = β_0 + Σ_{m=1}^{M} β_m X_g(m) + error,
where X_g(m) is the count of word m in S_g, and Y_g is the log expression of gene g.

[Figures at the 21-minute time point for three example words: MCB (ACGCGT), SCB (TTTCGCG), and an arbitrary word (TGATATC).]
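A minimal version of this word-count regression on toy data; the promoter strings and expression values below are invented for illustration.

```python
import numpy as np

def word_count(seq, word):
    """Count (possibly overlapping) occurrences of word in a promoter sequence."""
    return sum(seq[i:i + len(word)] == word
               for i in range(len(seq) - len(word) + 1))

# Toy promoters and log-expression values, invented for illustration
promoters = ["ACGCGTACGCGTTT", "TTGATATCACGCGT", "GGGTTTCGCGAAAA"]
log_expr = np.array([1.2, 0.3, -0.5])
words = ["ACGCGT", "TTTCGCG", "TGATATC"]

X = np.array([[word_count(s, w) for w in words] for s in promoters], dtype=float)
X1 = np.column_stack([np.ones(len(promoters)), X])    # intercept beta_0
beta_hat, *_ = np.linalg.lstsq(X1, log_expr, rcond=None)
```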

47 Modeling motif degeneracy 1. Inexact matches are allowed. 2. Not all positions created equal.

49 Information Content of Positions. Pattern observed for most transcription factors: information content peaks at a few positions of the binding site. Dimeric binding: two such peaks separated by a short distance. This information has been noted by several studies: 1. Eisen, 2005, Genome Biology. 2. Kechris et al.; Keles et al., 2002.

53 Hypercube model. We consider all words of length L = 6, 7 as the vertices of a graph. There is an edge between words w_1, w_2 if d_Hamming(w_1, w_2) = 1. The weight on the edge depends on the position of the differing letter. Hard to draw; here is a 2-D simplification:
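A sketch of building this weighted word graph; the position weights passed in are illustrative (cf. the B values on slide 61).

```python
import itertools

def hypercube_edges(L, position_weight):
    """Edges of the word graph: vertices are all 4^L DNA words of length L,
    with an edge between words at Hamming distance 1. The edge weight depends
    on the position of the differing letter; position_weight is illustrative."""
    alphabet = "ACGT"
    edges = []
    for word in itertools.product(alphabet, repeat=L):
        for pos in range(L):
            for letter in alphabet:
                if letter > word[pos]:          # emit each unordered pair once
                    neighbor = word[:pos] + (letter,) + word[pos + 1:]
                    edges.append(("".join(word), "".join(neighbor),
                                  position_weight[pos]))
    return edges

# L = 6, down-weighting the flanking positions (cf. the B values on slide 61):
edges = hypercube_edges(6, position_weight=[1, 2, 2, 2, 2, 1])
```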

54 A more general model: an Ising prior for γ.
P(γ) = exp(α′γ + γ′Bγ) / ψ(α, B),
where α = (α_1, ..., α_p) and B = (b_{i,j})_{p×p} are hyperparameters, and ψ(α, B) is the normalizing constant:
ψ(α, B) = Σ_{γ ∈ {0,1}^p} exp(α′γ + γ′Bγ).

57 Ising Prior: Posterior Computation. For each i, the conditional distribution
P(γ_i | γ_(−i)) = exp{γ_i (α_i + Σ_{j ∈ I_(−i)} b_{ij} γ_j)} / [1 + exp{α_i + Σ_{j ∈ I_(−i)} b_{ij} γ_j}]
can be computed efficiently for sparse B. Apply to structured model selection:
P(γ_i = 1 | γ_(−i), Y) = P(γ_i = 1 | γ_(−i)) / [ P(γ_i = 1 | γ_(−i)) + BF · P(γ_i = 0 | γ_(−i)) ].
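A sketch of the conditional above for a sparse B, here the 1-d lattice special case; the field follows the slide's expression α_i + Σ_j b_ij γ_j, and the α and B values are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

def ising_conditional(i, gamma, alpha, B):
    """P(gamma_i = 1 | gamma_(-i)) under the Ising prior, using the field
    alpha_i + sum_j b_ij gamma_j from the slide's conditional. With a
    sparse B, only the neighbors of variable i contribute."""
    field = alpha[i] + (B[i] @ gamma).item()
    return 1.0 / (1.0 + np.exp(-field))

p = 5
alpha = np.full(p, -2.0)                      # negative fields encourage sparsity (illustrative)
# 1-d lattice as a special case: b_ij = 1 for neighboring variables
B = csr_matrix(np.diag(np.ones(p - 1), 1) + np.diag(np.ones(p - 1), -1))
gamma = np.array([0., 1., 0., 0., 1.])
prob = ising_conditional(2, gamma, alpha, B)  # only neighbors j = 1, 3 contribute
```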

58 Example: Regulatory Motifs for Yeast Cell Cycle

59 Periodic time series

60 PCA of cell cycle experiment

61 Motif Analysis Results I. Motif length 6.
B_{i,j} = 1 if the differing position is 1 or 6; 2 if the differing position is 2, 3, 4, or 5.
Distance between the top 100 motifs found by our model:

62 Motif Analysis Results II. Signals that were found (posterior probability > 0.05) with smoothing that were lost without:
Words: Description
ACGCGT, TCGCGT, TCGCGA, GCGCGT, CCGCGT: MCB binding site in CLN1
TGCTGG, GGCTGG: SWI5 binding site
ACGGGT: MCM1 binding site in CLN3
TCGCGG, TCGGGT: REB1 binding sites
16 new motifs total.

69 Notes on Hyperparameter Selection. For the motif model, we have so far selected hyperparameters somewhat arbitrarily, for good computational properties. For the 1-d lattice: (r, w) code for sparsity and smoothness; given w, r can be chosen analytically to set the desired model size. Since the algorithm is O(p n̄²), this is practically very important. For general graphs: model size is no longer as easy to control; there is phase-transition behavior; asymmetry in the graph means α should not be constant; and the interpretation of the hyperparameters is not as straightforward. Favor a priori certain structures, not certain variables.
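One way to make the "choose r analytically" step concrete: fixing w, solve for the transition probabilities that hit a target average model size, using π_1 = r/(1 + r) together with the definitions of r and w from slide 21. The rearrangement below is my own, consistent with those definitions but not taken from the slides.

```python
def markov_prior_params(target_size, p_dim, w):
    """Given a target average model size and smoothing w, back out the
    transition probabilities (p, q) of slide 21. Uses pi_1 = r/(1+r),
    r = (1-p)/(1-q), w = q/(1-p); the rearrangement is my own."""
    pi1 = target_size / p_dim        # desired stationary inclusion probability
    r = pi1 / (1.0 - pi1)            # prior odds of inclusion
    q = w * r / (1.0 + w * r)        # P(next included | current included)
    p = 1.0 - r / (1.0 + w * r)      # P(next excluded | current excluded)
    return p, q

# E.g., the ERBB2 analysis scale (p = 6000, w = 10); the target size is illustrative:
p_, q_ = markov_prior_params(target_size=10, p_dim=6000, w=10.0)
```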

76 Conclusions and Extensions. A general Ising model for variable selection in structured covariate spaces. 1-d lattice: reduces to a simple Markov model applicable to problems in genomic profiling studies. L-d hypercube: a natural model for motif detection. Computationally feasible for p > 1000. Extensions: hyperparameter selection for biological motif discovery; convergence speed-up; nonlinear regression models. Thank you!
