Topics in Graphical Models. Statistical Machine Learning Ryan Haunfelder

1 Topics in Statistical Machine Learning

2 Loglinear Models When all of the measured variables are discrete, we can create a contingency table and model the expected counts using log-linear models. Recall that for three variables X, Y, and Z a (saturated) log-linear model is
$$\log F_{ijk} = \lambda + \lambda^{X}_{i} + \lambda^{Y}_{j} + \lambda^{Z}_{k} + \lambda^{XY}_{ij} + \lambda^{XZ}_{ik} + \lambda^{YZ}_{jk} + \lambda^{XYZ}_{ijk},$$
where $F_{ijk}$ is the expected cell count in cell $ijk$. All log-linear models relate the log expected cell count to model components. A hierarchical log-linear model is one that contains all lower-order effects whenever a higher-order term is included.

3 Hierarchical Log-linear Models as Consider the logistic model. By examining the odds ratio between Y and Z at a fixed level $i$ of X, in a model without the three-factor term $\lambda^{XYZ}$, we see
$$\log\left(\frac{F_{ijk}/F_{ij'k}}{F_{ijk'}/F_{ij'k'}}\right) = \lambda^{YZ}_{jk} - \lambda^{YZ}_{j'k} - \lambda^{YZ}_{jk'} + \lambda^{YZ}_{j'k'},$$
and hence Y and Z are conditionally independent given X if and only if $\lambda^{YZ} = 0$. The hierarchical log-linear model is completely determined by its two-factor terms. This can also be seen by noting that the independence graph of the hierarchical log-linear model is global, local, and pairwise Markov.
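The cancellation behind the odds-ratio identity can be checked numerically. The following is a minimal sketch, assuming binary X, Y, and Z, randomly drawn λ terms, and no three-factor term; all names (`rand_term`, `log_F`, `yz_contrast`) are hypothetical and introduced only for illustration.

```python
import itertools
import random

random.seed(0)
levels = range(2)  # assumption: binary X, Y, Z to keep the table small

def rand_term(n_idx):
    """Draw one random lambda value per index combination."""
    return {idx: random.gauss(0.0, 1.0)
            for idx in itertools.product(levels, repeat=n_idx)}

lam0 = 0.5
lam_x, lam_y, lam_z = rand_term(1), rand_term(1), rand_term(1)
lam_xy, lam_xz, lam_yz = rand_term(2), rand_term(2), rand_term(2)

def log_F(i, j, k):
    """Log expected count under the model with all two-factor terms, no XYZ term."""
    return (lam0 + lam_x[(i,)] + lam_y[(j,)] + lam_z[(k,)]
            + lam_xy[(i, j)] + lam_xz[(i, k)] + lam_yz[(j, k)])

def log_odds_ratio(i, j, j2, k, k2):
    """Log odds ratio between Y and Z at a fixed level i of X."""
    return log_F(i, j, k) - log_F(i, j2, k) - log_F(i, j, k2) + log_F(i, j2, k2)

def yz_contrast(j, j2, k, k2):
    """The same contrast computed from the YZ terms alone."""
    return lam_yz[(j, k)] - lam_yz[(j2, k)] - lam_yz[(j, k2)] + lam_yz[(j2, k2)]

for i in levels:
    # every term except the YZ interaction cancels, whatever the level of X
    assert abs(log_odds_ratio(i, 0, 1, 0, 1) - yz_contrast(0, 1, 0, 1)) < 1e-9
```

The log odds ratio is the same at both levels of X and depends only on the YZ terms, so setting every $\lambda^{YZ}_{jk}$ to zero makes it vanish.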

4 Discrete Graphical Model Estimation To compute maximum likelihood estimates we need to evaluate a normalizing constant whose number of terms grows exponentially in p (on the order of $2^p$ for binary variables). For small p we can either
1. Use Poisson log-linear modeling, as the introduction suggests, by utilizing glm in R.
2. Find the gradient and use gradient descent.
3. Use iterative proportional fitting.
For large p (> 30) we can approximate the gradient using
1. Mean field approximation
2. Gibbs sampling
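As an illustration of iterative proportional fitting for small p, here is a minimal sketch that fits the model with XY and XZ terms (Y independent of Z given X) to a 2x2x2 table. The counts and the function name `ipf_xy_xz` are made up for the example; the deck does not show its own implementation.

```python
import itertools

def ipf_xy_xz(counts, n_iter=20):
    """Fit the log-linear model [XY][XZ] to a 2x2x2 table by IPF.

    counts maps (i, j, k) to an observed count; all counts are assumed positive.
    """
    fitted = {cell: 1.0 for cell in itertools.product(range(2), repeat=3)}
    for _ in range(n_iter):
        # Rescale so the fitted XY margin matches the observed XY margin.
        for i, j in itertools.product(range(2), repeat=2):
            obs = sum(counts[(i, j, k)] for k in range(2))
            fit = sum(fitted[(i, j, k)] for k in range(2))
            for k in range(2):
                fitted[(i, j, k)] *= obs / fit
        # Rescale so the fitted XZ margin matches the observed XZ margin.
        for i, k in itertools.product(range(2), repeat=2):
            obs = sum(counts[(i, j, k)] for j in range(2))
            fit = sum(fitted[(i, j, k)] for j in range(2))
            for j in range(2):
                fitted[(i, j, k)] *= obs / fit
    return fitted

# Hypothetical counts, for illustration only.
counts = {(0, 0, 0): 10, (0, 0, 1): 20, (0, 1, 0): 30, (0, 1, 1): 40,
          (1, 0, 0): 15, (1, 0, 1): 5, (1, 1, 0): 25, (1, 1, 1): 35}
fitted = ipf_xy_xz(counts)
```

For this decomposable model the fitted values agree with the closed form $n_{ij+}\, n_{i+k} / n_{i++}$, which gives a quick check on the output.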

5 Sparse Discrete Graphical Model Estimation Typically a least angle regression (LARS) type algorithm is used to find a sparse solution to a discrete graphical model. The dglars package in R implements this algorithm and allows for cross-validation.
    library(dglars)
    # main effects plus all two-way interactions, Poisson model for the cell counts
    lars = dglars(count ~ . + .*., family = "poisson", data = df)
    lars.cv = cvdglars(count ~ . + .*., family = "poisson", data = df)

6 Biomarker Example Data include 4 homozygous markers from 339 recombinant inbred lines.

7 Biomarker Example [Table: estimates for the intercept, the main effects m1, m2, m3, m4, and the two-way interactions m1:m2, m1:m3, m1:m4, m2:m3, m2:m4, m3:m4.]

8 Gaussian Recall that in a Gaussian graphical model our goal was to estimate the precision matrix (Ω) to infer conditional independence between variables. We had two methods to achieve this:
1. Global: estimate Ω through maximum likelihood.
2. Local: take advantage of the fact that the conditional expectation of each variable can be written in terms of the other variables, and use regression techniques.
When the data are not Gaussian we cannot use the global approach. Furthermore, there is no guarantee that the conditional means of the variables are linear.
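For reference, the standard identity behind the local approach can be written out as follows, assuming a zero-mean Gaussian vector $X \sim N(0, \Omega^{-1})$ with precision entries $\omega_{jk}$ (notation introduced here, not taken from the slides):

```latex
\mathbb{E}\left[ X_j \mid X_{-j} \right]
  = -\sum_{k \neq j} \frac{\omega_{jk}}{\omega_{jj}} \, X_k ,
\qquad\text{so the coefficient on } X_k \text{ is zero exactly when } \omega_{jk} = 0 .
```

This linearity is what fails outside the Gaussian case, which motivates the more flexible conditional-mean models.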

9 SPArse Conditional Estimation using JAMs (SPACE JAM!) Voorman et al. (2014) suggested the use of jointly additive models (JAMs) for the conditional means of the variables. That is,
$$x_j \mid \{x_k, k \neq j\} = \sum_{k \neq j} f_{jk}(x_k) + \epsilon_j,$$
where $f_{jk}(\cdot) \in \mathcal{F}$ for some function space $\mathcal{F}$. After choosing a basis for $\mathcal{F}$, the resulting problem is
$$\min_{f_{jk} \in \mathcal{F},\; 1 \le j,k \le d} \; \frac{1}{2n} \sum_{j=1}^{d} \Big\| x_j - \sum_{k \neq j} f_{jk}(x_k) \Big\|_2^2 + \lambda \sum_{k > j} \Big( \| f_{jk}(x_k) \|_2^2 + \| f_{kj}(x_j) \|_2^2 \Big)^{1/2}.$$
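To make the structure of the penalized objective concrete, the sketch below evaluates it for given fitted function values. It only evaluates the criterion and does not solve the minimization; the data layout (`x[j]` as the n observations of variable j, `f[j][k]` as the n fitted values of $f_{jk}(x_k)$) and the function name are assumptions made for illustration.

```python
import math

def spacejam_objective(x, f, lam):
    """Squared-error loss plus the group penalty pairing f_jk with f_kj.

    x[j]    : list of n observations of variable j
    f[j][k] : list of n fitted values of f_jk(x_k); f[j][j] is never read
    lam     : tuning parameter lambda
    """
    d, n = len(x), len(x[0])
    loss = 0.0
    for j in range(d):
        for i in range(n):
            resid = x[j][i] - sum(f[j][k][i] for k in range(d) if k != j)
            loss += resid ** 2
    loss /= 2 * n
    penalty = 0.0
    for j in range(d):
        for k in range(j + 1, d):
            # f_jk and f_kj share one group, so both leave the model together
            sq = sum(v * v for v in f[j][k]) + sum(v * v for v in f[k][j])
            penalty += math.sqrt(sq)
    return loss + lam * penalty

# Toy input with d = 2 variables and n = 2 observations (made-up numbers).
x = [[1.0, 2.0], [3.0, 4.0]]
f = [[None, [1.0, 1.0]], [[0.0, 0.0], None]]
value = spacejam_objective(x, f, lam=1.0)
```

Because each penalty term groups $f_{jk}$ with $f_{kj}$, an edge between j and k is either present in both conditional models or absent from both, which yields a symmetric graph.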

10 SPACE JAM with Psychology Example In a study of personality traits, 839 twins are given a series of psychological tests and scored on 36 different traits. [Figure: axes labeled BIC, lambda, and Edges.]

11 SPACE JAM with Psychology Example The independence graph chosen by BIC is shown below. [Figure: independence graph whose nodes are personality traits, including Maturity, Conformance, Self-Control, Socialization, Well-being, Good Impression, Social Desirability, Communality, Acquiescence, Tolerance, Neuroticism, Femininity, Control, Flexibility, Psychological-Mindedness, Responsibility, Rigidity, Dominance, Aggression, Sociability, Social Presence, Self-Acceptance, Extraversion, Infrequency, and the Artistic, Social, Realistic, Intellectual, Enterprising, and Conventional Orientation scales.]

12 Quantile Regression Myung Hee Lee has suggested using quantile regression to estimate conditional probabilities. That is,
$$P\Big( X_i \le \beta_{i0}(\tau) + \sum_{j : (i,j) \in E} \beta_{ij}(\tau)\, x_j \;\Big|\; X_{\setminus \{i\}} \Big) = \tau,$$
where $\tau \in (0, 1)$. To obtain a sparse solution she proposes two methods. The first uses composite quantile (CQ) parameter estimates $\beta^{CQ}_{ij}$ shared across several $\tau$ values and solves
$$\min_{\beta^{CQ}_{ij}} \; \sum_{l=1}^{L} \sum_{k=1}^{n} \rho_{\tau_l}\Big( X_{ik} - \beta_{i0}(\tau_l) - \sum_{j \neq i} \beta^{CQ}_{ij} X_{jk} \Big) + \lambda \sum_{j \neq i} f\big( \beta^{CQ}_{ij} \big).$$
The other allows the parameters to vary with $\tau$ and is known as joint quantile (JQ).
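The slides do not define $\rho_\tau$; the sketch below assumes it is the usual quantile regression check loss, $\rho_\tau(u) = u\,(\tau - 1\{u < 0\})$, and shows on made-up data that minimizing its sum over a constant recovers a sample quantile. The function names are hypothetical.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0}), assumed to be the slide's rho."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def quantile_by_check_loss(values, tau):
    """Return the data point q minimizing the summed check loss of (v - q)."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative data only
median = quantile_by_check_loss(data, 0.5)           # 5
lower_quartile = quantile_by_check_loss(data, 0.25)  # 3
```

The composite objective sums this loss over several levels $\tau_l$ while sharing the slope coefficients, so only the intercepts $\beta_{i0}(\tau_l)$ change with the quantile level.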

13 Simulation Results
Simulation 1: Multivariate normal data.
Simulation 2: Gaussian copula data, where the marginals are Gamma with shape parameters Unif(1,5) and rate parameters Unif(0,10).

14 References I
Luigi Augugliaro, Angelo M. Mineo, and Ernst C. Wit. Differential geometric least angle regression: A differential geometric approach to sparse generalized linear models. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 75.
Bradley Efron, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. Least angle regression (with discussion). Annals of Statistics, 32.
Arend Voorman, Ali Shojaie, and Daniela Witten. Graph estimation with joint additive models. Biometrika, 101:85-101, 2014.
