Uncertainty quantification and visualization for functional random variables

1 Uncertainty quantification and visualization for functional random variables
MascotNum Workshop 2014
S. Nanty (1,3), C. Helbert (2), A. Marrel (1), N. Pérot (1), C. Prieur (3)
(1) CEA, DEN/DER/SESI/LSMR, F-13108, Saint-Paul-lez-Durance, France
(2) Ecole Centrale de Lyon, France
(3) Université Joseph Fourier and INRIA, France
April 23rd, 2014

2 Introduction
Goal: identify and characterize the statistical properties of functional random variables that are dependent and linked to a scalar (or vector-valued) covariate.
The proposed uncertainty characterization methodology aims to:
- estimate the joint probability density function of the variables,
- simulate new samples according to the estimated distribution,
- adapt visualization tools to identify the uncertainty characteristics of dependent functional variables.

3 Problem description
Let $f_1, \dots, f_m : I \times \Omega \to \mathbb{R}$ be dependent functional random variables.
Let $Y$ be a random variable, called the covariate.
Let $M$ be a computer code (simulator) such that $Y = M(f_1, \dots, f_m)$.
Let $f_j^i$ be the $i$-th realization of the $j$-th functional random variable, for $1 \le i \le n$, $1 \le j \le m$.

4 Proposed methodology
Two main steps:
1. Decomposition on a reduced functional basis, taking the covariate into account.
2. Modeling of the probability density function of the decomposition coefficients.
Pipeline: functional variable realizations and covariate → functional decomposition → selection of $d$ components → density estimation for the $d$ coefficients.

5 Table of contents
- Dimension reduction by functional decomposition
- Estimation of the coefficient probability density function
- Illustration on an analytical example
- Application to a nuclear safety test case
- Associated uncertainty visualization tool
- Conclusion

6 Decomposition on a functional basis
Definition: let $f : I \to \mathbb{R}$, $x \in I$. Then
$$f(x) = \sum_{k=1}^{+\infty} \alpha_k \varphi_k(x)$$
with $\alpha_k$ the coefficients and $\varphi_k$ the basis functions.

7 Decomposition on a functional basis
Truncated decomposition:
$$\hat f(x) = \sum_{k=1}^{d} \alpha_k \varphi_k(x)$$
with $\alpha_k$ the coefficients, $\varphi_k$ the basis functions and $d$ the basis size.

8 Partial Least Squares regression
Let $X$ ($n \times p$) and $Y$ ($n \times q$) be the data matrices of the observable and predicted variables, respectively. $X$ and $Y$ are centered and standardized.
Principle: linear regression between the projections of $X$ and $Y$ onto a new space of so-called latent variables, whose correlation is maximal.
PLS regression algorithm [Wold, 1975]:
- Initialization: $X_0 = X$, $Y_0 = Y$.
- At each step $h$, seek the latent variables $\alpha_h = X_{h-1} u_h$ and $\omega_h = Y_{h-1} v_h$ solving $\max_{\|u_h\| = \|v_h\| = 1} \operatorname{cov}(X_{h-1} u_h, Y_{h-1} v_h)$.
- "Deflation": $X_h = X_{h-1} - \alpha_h \varphi_h^T$, with $\varphi_h = X_{h-1}^T \alpha_h / (\alpha_h^T \alpha_h)$.
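As a concrete illustration, here is a minimal NumPy sketch of the iteration above (an illustrative reconstruction, not the authors' code): for centered data, the unit vectors maximizing the covariance are the leading singular pair of $X_{h-1}^T Y_{h-1}$; $Y$-side deflation is omitted for brevity.

```python
import numpy as np

def pls_decomposition(X, Y, d):
    """Sketch of the PLS iteration: returns A (n x d) and Phi (p x d)."""
    Xh = X.copy()
    A, Phi = [], []
    for _ in range(d):
        # The unit vectors u_h, v_h maximizing cov(X_{h-1} u, Y_{h-1} v)
        # are the leading singular pair of X_{h-1}^T Y_{h-1}.
        U, _, Vt = np.linalg.svd(Xh.T @ Y, full_matrices=False)
        u = U[:, 0]
        alpha = Xh @ u                        # latent variable alpha_h
        phi = Xh.T @ alpha / (alpha @ alpha)  # loading phi_h
        Xh = Xh - np.outer(alpha, phi)        # deflation step
        A.append(alpha)
        Phi.append(phi)
    return np.column_stack(A), np.column_stack(Phi)
```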

10 Partial Least Squares decomposition
From the deflation step, $X$ can be written as
$$X = A \Phi^T + \varepsilon$$
where the column vectors of $A$ and $\Phi$ are the $\alpha_h$ and $\varphi_h$, respectively, and $\varepsilon$ gathers the residuals.
Let the column vectors of $X$ be functions discretized on $p$ points and let $Y$ be the covariate. Then $A$ is the matrix of decomposition coefficients and $\Phi$ the matrix of basis functions. The basis functions are fitted to the data and adjusted to maximize the correlation between the functions and the covariate.

13 Simultaneous PLS decomposition
Objective: extend the PLS decomposition to handle multiple dependent functional variables simultaneously.
We suppose that the functions $f_1, \dots, f_m$ are correlated and share common reduction directions.
Let $t_1 < \dots < t_p \in I$ and let $\mathbf{f}_i = (f_i(t_1), \dots, f_i(t_p))$ be the discretized version of $f_i$, $i = 1, \dots, m$. Each column vector of $X$ is the concatenation $(\mathbf{f}_1, \dots, \mathbf{f}_m) \in \mathbb{R}^{pm}$.
The simultaneous PLS (SPLS) decomposition consists in applying the PLS decomposition to this matrix $X$.
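In practice this stacking can be sketched with an off-the-shelf PLS routine; scikit-learn's `PLSRegression` is our choice of library here (not prescribed by the slides), and the data below are placeholders.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Hypothetical data: n realizations of m = 2 functions discretized on
# p points, plus a scalar covariate y.
n, p = 600, 100
f1 = np.random.rand(n, p)
f2 = np.random.rand(n, p)
y = np.random.rand(n, 1)

# Simultaneous PLS: concatenate the discretized functions so each sample
# is a vector in R^{pm}, then run ordinary PLS against the covariate.
X = np.hstack([f1, f2])                 # shape (n, p*m)
pls = PLSRegression(n_components=8, scale=True)
pls.fit(X, y)
A = pls.transform(X)                    # decomposition coefficients (n x d)
Phi = pls.x_loadings_                   # basis functions (p*m x d)
```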

14 Objectives
Estimate the probability density function (pdf) of the $d$ coefficients from the SPLS decomposition.
High dimension ($d > 10$): kernel density estimation is not suited.
Proposed solution: Gaussian mixture model.

15 Gaussian mixture
Probability density function of a Gaussian mixture:
$$g(\alpha \mid \mu_1, \Sigma_1, \dots, \mu_G, \Sigma_G) = \sum_{k=1}^{G} \tau_k \, \phi(\alpha \mid \mu_k, \Sigma_k), \quad \alpha \in \mathbb{R}^d$$
with $G$ clusters, $n$ sample points, $\phi$ the Gaussian probability density function, and $\tau_k$, $\mu_k$, $\Sigma_k$ the proportion, mean and covariance matrix of cluster $k$.
Advantages / drawbacks:
+ Fast algorithm for parameter estimation.
+ Very fast simulation of new realizations.
+ Usable in dimension $d > 10$.
- Parametric model: modeling hypothesis.
- Number of clusters to be determined.
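A minimal sketch of this modeling step with scikit-learn (an assumed implementation choice; the slides do not prescribe a library): fit a full-covariance Gaussian mixture to a coefficient sample, evaluate its pdf, and simulate new coefficient vectors.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical coefficient sample: n points of the d SPLS coefficients.
alpha = np.random.randn(600, 8)

gmm = GaussianMixture(n_components=3, covariance_type='full')
gmm.fit(alpha)
new_alpha, labels = gmm.sample(200)          # simulate new coefficient vectors
density = np.exp(gmm.score_samples(alpha))   # pdf values g(alpha | ...)
```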

17 EM algorithm
The Expectation-Maximization (EM) algorithm [Dempster et al., 1977] estimates the parameters of the Gaussian mixture model. Let $z_{ik}$ denote the probability that the $i$-th point belongs to the $k$-th cluster.
EM algorithm:
1. Initialize the parameters $\tau_k^{(0)}$, $\mu_k^{(0)}$ and $\Sigma_k^{(0)}$.
2. Expectation step: compute $z_{ik}^{(j)}$.
3. Maximization step: compute $\tau_k^{(j+1)}$, $\mu_k^{(j+1)}$, $\Sigma_k^{(j+1)}$.
4. Repeat steps 2-3 until convergence.
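For reference, a compact NumPy version of these iterations (a sketch under the full-covariance model; the initialization scheme, random points as means, is our own choice):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(alpha, G, n_iter=100):
    """Plain EM for a full-covariance Gaussian mixture (sketch)."""
    n, d = alpha.shape
    rng = np.random.default_rng(0)
    tau = np.full(G, 1.0 / G)
    mu = alpha[rng.choice(n, size=G, replace=False)]
    Sigma = np.stack([np.cov(alpha.T) for _ in range(G)])
    for _ in range(n_iter):
        # E-step: z_ik, probability that point i belongs to cluster k.
        z = np.column_stack(
            [tau[k] * multivariate_normal.pdf(alpha, mu[k], Sigma[k])
             for k in range(G)])
        z /= z.sum(axis=1, keepdims=True)
        # M-step: update proportions, means and covariances.
        Nk = z.sum(axis=0)
        tau = Nk / n
        mu = (z.T @ alpha) / Nk[:, None]
        for k in range(G):
            diff = alpha - mu[k]
            Sigma[k] = (z[:, k, None] * diff).T @ diff / Nk[k]
    return tau, mu, Sigma
```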

18 Reduction of the number of parameters
Total number of Gaussian mixture parameters:
$$N_T = G - 1 + G d + G \frac{d(d+1)}{2}$$
with $G$ the number of clusters in the model. $N_T$ increases quickly with the dimension $d$.
Solution: sparse estimation of the covariance matrices. Two methods:
- sem method: penalizing the inverses of the covariance matrices [Krishnamurthy, 2011];
- sem2 method: penalizing the covariance matrices themselves.
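The count is easy to check numerically; a tiny helper applying the formula above (the $d = 8$, $G = 2$ value matching the 89 parameters quoted later for the analytical example is our reading):

```python
def n_params(G, d):
    """Total number of Gaussian mixture parameters N_T."""
    # proportions (G - 1) + means (G * d) + covariances (G * d(d+1)/2)
    return (G - 1) + G * d + G * d * (d + 1) // 2

assert n_params(2, 8) == 89   # value quoted for the analytical example
print(n_params(2, 30))        # 991: N_T grows quickly with d
```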

19 sem method: penalizing the inverses of the covariance matrices
A lasso penalization on the inverses of the covariance matrices is added in the maximization step: the usual update $\hat\Sigma_k = \operatorname{argmax}_{\Sigma_k}(l)$ becomes
$$\hat\Sigma_k = \operatorname{argmax}_{\Sigma_k}\left(l - \lambda \|\Sigma_k^{-1}\|_1\right), \qquad \|M\|_1 = \sum_{i,j=1}^{p} |M_{i,j}|.$$
The penalization parameter $\lambda$ is chosen by cross-validation. The penalized maximization is solved with the coordinate-descent-based algorithm of [Friedman et al., 2008].

20 sem method: penalizing the inverses of the covariance matrices
sem algorithm [Krishnamurthy, 2011]:
1. Initialize the parameters $\tau_k^{(0)}$, $\mu_k^{(0)}$ and $\Sigma_k^{(0)}$.
2. Expectation step: compute $z_{ik}^{(j)}$.
3. Maximization step: compute $\tau_k^{(j+1)}$, $\mu_k^{(j+1)}$.
4. $\Sigma_k^{(j+1)} = \operatorname{argmax}_{\Sigma} \left( l - \lambda \|\Sigma^{-1}\|_1 \right)$.
5. Repeat steps 2-4 until convergence.
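Step 4 can be sketched with the graphical lasso of [Friedman et al., 2008], available in scikit-learn; this is a plausible implementation of the penalized update, not necessarily the authors' code, and `z` and `mu` are assumed to come from the E-step as above.

```python
import numpy as np
from sklearn.covariance import graphical_lasso

def penalized_sigma_update(alpha, z, mu, k, lam):
    """Penalized covariance update for cluster k (sketch).

    Replaces the usual covariance update by a graphical-lasso estimate,
    which maximizes the Gaussian log-likelihood penalized by
    lam * ||Sigma_k^{-1}||_1.  lam would be chosen by cross-validation.
    """
    diff = alpha - mu[k]
    Nk = z[:, k].sum()
    emp_cov = (z[:, k, None] * diff).T @ diff / Nk  # weighted empirical cov
    cov, precision = graphical_lasso(emp_cov, alpha=lam)
    return cov
```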

21 sem2 method: penalizing the covariance matrices
A lasso penalization on the covariance matrices themselves is added in the maximization step:
$$\hat\Sigma_k = \operatorname{argmax}_{\Sigma_k}\left(l - \lambda \|P \circ \Sigma_k\|_1\right)$$
where $\circ$ stands for the Hadamard product and $P$ is a penalization matrix. The penalization parameter $\lambda$ is chosen by cross-validation. The penalized maximization is solved with the coordinate-descent-based algorithm of [Wang, 2013].

22 sem2 method: penalizing the covariance matrices
Several penalization matrices $P$ are proposed:
- sem2.1: equal weights on all matrix elements; every element is penalized in the same way: $P_{ij} = 1$.
- sem2.2: diagonal elements are not penalized; all others are penalized equally: $P_{ij} = 1$ if $i \ne j$, $P_{ij} = 0$ if $i = j$.
- sem2.3: the smaller the off-diagonal element, the more it is penalized; diagonal elements are not penalized: $P_{ij} = 1/|\Sigma_{ij}|$ if $i \ne j$, $P_{ij} = 0$ if $i = j$.
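A small helper building the three penalization matrices; this is a sketch, reading the sem2.3 formula as $P_{ij} = 1/|\Sigma_{ij}|$ off the diagonal (our reconstruction of the garbled slide formula).

```python
import numpy as np

def penalty_matrix(Sigma, variant):
    """Build the sem2 penalization matrix P (sketch)."""
    d = Sigma.shape[0]
    if variant == 'sem2.1':               # penalize all entries equally
        return np.ones((d, d))
    off = 1.0 - np.eye(d)                 # ones off the diagonal
    if variant == 'sem2.2':               # leave the diagonal unpenalized
        return off
    if variant == 'sem2.3':               # small |Sigma_ij| -> heavy penalty
        P = np.zeros((d, d))
        mask = off.astype(bool)
        # Assumes no exactly-zero off-diagonal entries in Sigma.
        P[mask] = 1.0 / np.abs(Sigma[mask])
        return P
    raise ValueError(variant)
```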

25 sem2 method: penalizing the covariance matrices
sem2 algorithm:
1. Initialize the parameters $\tau_k^{(0)}$, $\mu_k^{(0)}$ and $\Sigma_k^{(0)}$.
2. Expectation step: compute $z_{ik}^{(j)}$.
3. Maximization step: compute $\tau_k^{(j+1)}$, $\mu_k^{(j+1)}$.
4. $\Sigma_k^{(j+1)} = \operatorname{argmax}_{\Sigma} \left( l - \lambda \|P \circ \Sigma\|_1 \right)$.
5. Repeat steps 2-4 until convergence.

26 Illustration on an analytical example
Two temporal functional random variables $f_1$ and $f_2$ depend on three random variables $a_1, a_2, a_3$ with uniform distributions. The covariate is defined as
$$Y(a_1, a_2, a_3) = \int_0^1 \left( f_1(t, a_1, a_2, a_3) + f_2(t, a_1, a_2, a_3) \right) dt.$$
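On a discretization grid, such a covariate would typically be approximated by a quadrature rule; a minimal sketch with SciPy's trapezoidal rule (the curves below are placeholders, and the choice of quadrature is ours):

```python
import numpy as np
from scipy.integrate import trapezoid

# Discretization grid of [0, 1] and placeholder curves standing in for
# one realization of f1 and f2.
t = np.linspace(0.0, 1.0, 100)
f1_vals = np.sin(2 * np.pi * t)
f2_vals = np.cos(2 * np.pi * t)

# Covariate Y = integral over [0, 1] of (f1 + f2).
Y = trapezoid(f1_vals + f2_vals, t)
```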

27 Illustration on an analytical example: setup
Learning dataset: 600 curves. Test dataset: 1000 curves.
Method: SPLS decomposition + Gaussian mixture model; the optimal number of clusters $G$ is chosen with the Bayesian Information Criterion (BIC).
Proposed criteria to select the basis size $d$ and assess the quality of the characterization method:
- Criterion C1: goodness of fit between the estimated coefficient pdf and the true pdf, assessed with the two-sample test of [Fromont et al., 2012].
- Criterion C2: goodness of fit between the estimated covariate pdf and the pdf computed with the known covariates, assessed with the Kolmogorov-Smirnov (KS) test (see the sketch below).
- Criterion C3: relative mean square error between the correlations of the simulated functions and those of the observed realizations.
First step: use of the EM algorithm.
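A sketch of criterion C2 with SciPy's two-sample KS test; the covariate samples below are hypothetical, and the 5% acceptance threshold is an assumption.

```python
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical samples: covariates of the test curves vs. covariates
# computed from curves simulated with the estimated model.
y_test = np.random.randn(1000)
y_sim = np.random.randn(1000)

stat, p_value = ks_2samp(y_test, y_sim)
accepted = p_value > 0.05   # tally acceptances over repeated simulations
```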

28 Illustration on an analytical example: criterion C1, comparison of coefficient densities
[Figure: acceptance rate (%) for the coefficients as a function of the number of components d.]
The median acceptance rate is maximal at d = 8 components; beyond d = 8, the model quality decreases.

29 Illustration on an analytical example: criterion C2, comparison of covariate densities
[Figure: acceptance rate (%) for the covariate as a function of the number of components d.]
The median acceptance rate is maximal at d = 6, 8, 10 and 18 components, with low variance for d = 8. The acceptance rates are very close for all basis sizes.

30 Illustration on an analytical example: criterion C3, comparison of correlations between variables
[Figure: relative error (%) on the correlations, logarithmic scale, as a function of the number of components d.]
The relative error decreases steadily with d and is about 40% for d = 8.

31 Illustration on an analytical example: conclusions
Based on the three criteria, the optimal basis size is d = 8: good acceptance rates are obtained with the EM algorithm, but the relative errors on the correlations remain quite high.
The same criteria have been computed for the other estimation algorithms, with similar results (same d and criteria values). For this analytical example, the EM algorithm seems to be the best choice (efficient, simple and fast): since the number of parameters is quite low here ($N_T = 89$ for d = 8), the sparse algorithms do not improve the estimation.
In practice, if no test sample is available, criteria C1, C2 and C3 are computed by cross-validation.

32 A nuclear safety test case (1)
Chained simulation scheme:
- Scalar parameters (31 random variables) feed a thermal-hydraulic computer code (computation time: several hours), which produces functional random variables.
- These functional variables, together with scalar parameters, feed a thermal-mechanical computer code (computation time: 5 minutes), which returns a safety criterion, the scalar covariate.

33 A nuclear safety test case (2)
Dataset: 3 time-dependent functional random variables; scalar covariate: a safety criterion; learning sample: 400 samples. A logarithmic transformation is applied to the sample (positivity constraint).
Methodology: SPLS decomposition + Gaussian mixture model + EM algorithm. The optimal G is determined by BIC (see the sketch below); criteria C1, C2 and C3 are computed by cross-validation; the optimal d is chosen by analyzing the three criteria.
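The BIC-based choice of G can be sketched as follows (the coefficient sample is hypothetical, and scikit-learn is an assumed implementation choice):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical coefficient sample (400 points, d = 4 components).
alpha = np.random.randn(400, 4)

# Fit mixtures with G = 1..6 clusters and keep the one minimizing BIC.
bics = {G: GaussianMixture(n_components=G).fit(alpha).bic(alpha)
        for G in range(1, 7)}
G_opt = min(bics, key=bics.get)
```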

34 A nuclear safety test case (3)
Criterion C1: optimal d = 4; acceptance rates fall below 80% for d > 8, with a fast decrease for d ≥ 10.
[Figure: acceptance rates as a function of the decomposition basis size.]
Criterion C2: low acceptance rates for all basis sizes.

35 A nuclear safety test case (4)
Criterion C3: quite good approximation of the correlations between the functional variables for d = 4.
[Figure: correlations T-P, T-H and P-H as functions of time, data vs. simulations.]

36 Visualization: High Density Region (HDR) boxplot
Proposed by [Hyndman and Shang, 2010] and based on Principal Component Analysis: the first two basis functions are selected and a kernel density estimate is computed on the scores.
[Figure: HDR plot in the (PC score 1, PC score 2) plane, with the explained variance.]
Application to the analytical example: the black curve is the functional median, the colored curves are outliers, and the dark (resp. light) gray zone is the 50% (resp. 95%) highest density region.
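A compact sketch of this construction (our reconstruction of the [Hyndman and Shang, 2010] recipe, with placeholder data; using the empirical density quantile as the HDR cutoff is an assumption):

```python
import numpy as np
from sklearn.decomposition import PCA
from scipy.stats import gaussian_kde

# Hypothetical sample of 600 curves discretized on 100 points.
curves = np.random.randn(600, 100)

# Project on the first two principal components and estimate the
# density of the scores with a kernel estimator.
scores = PCA(n_components=2).fit_transform(curves)
kde = gaussian_kde(scores.T)
dens = kde(scores.T)

# Curves whose score density falls below the 95% HDR cutoff are
# flagged as outliers; the deepest curve is the functional median.
hdr95 = np.quantile(dens, 0.05)
outliers = np.where(dens < hdr95)[0]
median_curve = curves[np.argmax(dens)]
```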

37 Visualization: modified HDR boxplot
Combination of the HDR boxplot and the proposed characterization methodology (SPLS + Gaussian mixture model):
- simultaneous visualization of multiple functions,
- the covariate is taken into account,
- decomposition on a larger basis.
[Figure: modified HDR boxplots of f_1 and f_2 over time, on the analytical example.]

38 Conclusion and perspectives
A global methodology has been developed to simultaneously characterize dependent functional random variables linked to a covariate: simultaneous PLS decomposition + Gaussian mixture with sparse covariance matrices. It allows the estimation of threshold exceedance probabilities for the variables, simulation according to the estimated pdf, and visualization of the uncertainty of the variables. Several criteria were proposed to assess the efficiency of the methodology.
Applications: good results on the analytical example; on the nuclear safety test case, the functions and their correlations are quite well reproduced, but the covariate pdf is not well fitted.
Perspectives:
- computing probabilities and quantiles of threshold exceedance,
- using this methodology for uncertainty propagation and sensitivity analysis studies.

39 References
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39(1):1-38.
Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432-441.
Fromont, M., Laurent, B., Lerasle, M., and Reynaud-Bouret, P. (2012). Kernels based tests with non-asymptotic bootstrap approaches for two-sample problem. In 25th Annual Conference on Learning Theory (COLT), volume 23.
Hyndman, R. J. and Shang, H. L. (2010). Rainbow plots, bagplots, and boxplots for functional data. Journal of Computational and Graphical Statistics, 19(1):29-45.
Krishnamurthy, A. (2011). High-dimensional clustering with sparse Gaussian mixture models.
Wang, H. (2013). Coordinate descent algorithm for covariance graphical lasso. Statistics and Computing, pages 1-9.
Wold, H. (1975). Path models with latent variables: The NIPALS approach. In Quantitative Sociology: International Perspectives on Mathematical and Statistical Modeling.

40 Appendix: analytical example definition
$$f_1(t, a_1, a_2, a_3) = 0.8\, a_2\, BB(t) + a_1 + c_1(t) + h(t)$$
$$f_2(t, a_1, a_2, a_3) = a_2\, BB(t) + a_1 + c_2(t)$$
with $a_1 \sim U(0, 0.05)$, $a_2 \sim U(0.05, 0.2)$, $a_3 \sim U(2, 3)$,
$$c_1(t) = c_2(t) = \begin{cases} t - 1 & \text{if } t < \dots \\ t & \text{otherwise} \end{cases}$$
$$h(t) = \begin{cases} t & \text{if } t < \dots \\ 0.5\, t & \text{if } 0.5 < t < a_3 \dots \\ \dots & \text{otherwise} \end{cases}$$
