BMI/STAT 768 : Lecture 13 Sparse Models in Images
Moo K. Chung
mkchung@wisc.edu
March 25

1 Why sparse models are needed?

If we are interested in quantifying the measurements at every voxel in an image simultaneously, the standard procedure is to set up a multivariate general linear model (MGLM), which generalizes the widely used univariate GLM by incorporating vector-valued responses and explanatory variables [1, 12, 33, 34, 30, 6]. Hotelling's $T^2$ statistic is a special case of MGLM and has been mainly used for inference on surface shapes and deformations [31, 19, 4, 13, 7].

Let $J_{n \times p} = (J_{ij})$ be the measurement matrix, where $J_{ij}$ is the measurement for subject $i$ at voxel position $j$. The subscripts denote the dimension of the matrix. We can think of $J_{ij}$ as a Jacobian determinant, a fractional anisotropy value or an fMRI activation. Assume there are a total of $n$ subjects and $p$ voxels of interest. The measurement vector at the $j$-th voxel is denoted as $x_j = (J_{1j}, \cdots, J_{nj})^\top$. The measurement vector for the $i$-th subject is denoted as $y_i = (J_{i1}, \cdots, J_{ip})^\top$. The vectors $y_i$ are assumed to be independently and identically distributed over subjects. Note that

$$J = (x_1, \cdots, x_p) = (y_1, \cdots, y_n)^\top.$$

We may assume the covariance matrix of $y_i$ to be

$$V(y_1) = \cdots = V(y_n) = \Sigma_{p \times p} = (\sigma_{kl}).$$

With these notations, we now set up the following MGLM over all subjects and across different voxel positions:

$$J_{n \times p} = X_{n \times k} B_{k \times p} + Z_{n \times q} G_{q \times p} + U_{n \times p} \Sigma^{1/2}_{p \times p}, \tag{1}$$

where $X$ is the matrix of contrasted explanatory variables and $B$ is the matrix of unknown coefficients to be estimated. Nuisance covariates of non-interest are in the matrix $Z$ and the corresponding coefficients are in the matrix $G$. The components of the Gaussian random matrix $U$ are independently distributed with zero mean and unit variance. The symmetric matrix $\Sigma^{1/2}$ is the square root of the covariance matrix and accounts for the spatial dependency across different voxels.

In MGLM (1), we are interested in testing the null hypothesis

$$H_0 : B = 0.$$

The parameter matrices in the model are estimated via the least squares method. The resulting multivariate test statistics are the Lawley-Hotelling trace and Roy's maximum root. When there is only one voxel, i.e. $p = 1$, these multivariate test statistics collapse to Hotelling's $T^2$ statistic [34, 6].

Note that MGLM (1) is equivalent to the assumption that $y_i$ follows a multivariate normal distribution with some mean $\mu$ and covariance $\Sigma$, i.e., $y_i \sim N(\mu, \Sigma)$. Then, neglecting constant terms, the log-likelihood function $L$ of $y_i$ is given by

$$L(\mu, \Sigma) = \log \det \Sigma^{-1} - \frac{1}{n} \sum_{i=1}^{n} (y_i - \mu)^\top \Sigma^{-1} (y_i - \mu).$$

By maximizing the log-likelihood, the MLEs of $\mu$ and $\Sigma$ are given by

$$\widehat{\mu} = \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i, \qquad \widehat{\Sigma} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})(y_i - \bar{y})^\top. \tag{2}$$
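The estimators in (2) can be checked on a toy example. The following is a minimal sketch using only the Python standard library with exact rational arithmetic; the data matrix `J` and the helper names `mle_mean_cov` and `det` are hypothetical and chosen only for illustration. It also evaluates the determinant of the estimate for a case with fewer subjects than voxels.

```python
from fractions import Fraction


def mle_mean_cov(Y):
    """MLE of the mean vector and covariance matrix from rows y_1, ..., y_n."""
    n, p = len(Y), len(Y[0])
    mu = [sum(Fraction(row[j]) for row in Y) / n for j in range(p)]
    centered = [[Fraction(row[j]) - mu[j] for j in range(p)] for row in Y]
    sigma = [
        [sum(c[k] * c[l] for c in centered) / n for l in range(p)]
        for k in range(p)
    ]
    return mu, sigma


def det(M):
    """Determinant by cofactor expansion (adequate for tiny matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


# Hypothetical toy data: n = 2 subjects (rows), p = 3 voxels (columns).
J = [[1, 2, 3],
     [3, 6, 4]]

mu_hat, sigma_hat = mle_mean_cov(J)
print(mu_hat)          # sample mean over subjects
print(sigma_hat)       # 3 x 3 covariance estimate
print(det(sigma_hat))  # determinant of the estimate
```

With two subjects and three voxels the determinant is exactly zero, so this estimate cannot be inverted.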
For notational convenience, we can center the measurements such that $y_i \leftarrow y_i - \bar{y}$. We are basically centering the measurements by subtracting the group mean over subjects. Then MLE (2) can be written in the more compact form

$$\widehat{\Sigma} = \frac{1}{n} J^\top_{p \times n} J_{n \times p}. \tag{3}$$

However, there is a serious defect with MGLM (1) and its MLE (3): the estimated covariance matrix $\widehat{\Sigma}$ can be positive definite only for $n \geq p$ [12, 29]. $J^\top J$ becomes rank deficient for $n < p$. In most imaging studies, there are more voxels than subjects, i.e., $n < p$. When $\widehat{\Sigma}$ is singular, its inverse does not exist, yet the inverse is the precision matrix often needed in partial-correlation-based network analyses [22]. This is the main reason MGLM has rarely been employed over the whole brain region and researchers still mostly use univariate approaches in imaging studies.

1.1 Why sparse network?

The majority of functional and structural connectivity studies in brain imaging follow a standard analysis framework [14, 15, 10, 35]. From 3D whole brain images, $n$ regions of interest (ROI) are identified and serve as the nodes of the brain network. Measurements at the ROIs are then correlated in a pairwise fashion to produce a connectivity matrix of size $n \times n$. The connectivity matrix is then thresholded to produce an adjacency matrix of zeros and ones that defines the links between nodes. The binarized adjacency matrix is then used to construct the brain network. Various graph complexity measures such as degree, clustering coefficient, entropy, path length, hub centrality and modularity are defined on the graph, and the subsequent statistical inference is performed on these complexity measures.

For a large number of nodes, simple thresholding of correlations produces a large number of links, which makes interpretation difficult. For example, with $p$ voxels in an image, the graph can have up to $p(p-1)/2$ links. For this reason we use the sparse data recovery framework to obtain a far smaller number of significant links.
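The standard pipeline described above (pairwise correlation, thresholding, binarized adjacency matrix, degree) can be sketched in a few lines. This is an illustrative sketch only: the ROI measurements, the threshold 0.7, the use of the absolute correlation, and the function names are all assumptions made for the example.

```python
import math


def pearson(u, v):
    """Sample Pearson correlation between two equal-length sequences."""
    mu_u, mu_v = sum(u) / len(u), sum(v) / len(v)
    cu = [a - mu_u for a in u]
    cv = [b - mu_v for b in v]
    num = sum(a * b for a, b in zip(cu, cv))
    den = math.sqrt(sum(a * a for a in cu) * sum(b * b for b in cv))
    return num / den


def connectivity_matrix(nodes):
    """Pairwise correlations between all node measurement vectors."""
    m = len(nodes)
    return [[pearson(nodes[i], nodes[j]) for j in range(m)] for i in range(m)]


def binarize(C, threshold):
    """Link i and j when |correlation| exceeds the threshold; no self-loops."""
    m = len(C)
    return [
        [1 if i != j and abs(C[i][j]) > threshold else 0 for j in range(m)]
        for i in range(m)
    ]


def degrees(A):
    """Node degree is the row sum of the binary adjacency matrix."""
    return [sum(row) for row in A]


# Hypothetical measurements: 4 ROIs (rows), 5 observations each.
rois = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 11],
    [5, 3, 4, 1, 2],
    [1, -1, 1, -1, 1],
]

C = connectivity_matrix(rois)
A = binarize(C, threshold=0.7)
print(degrees(A))
```

Here the first three ROIs are mutually linked and the fourth is isolated. Lowering the threshold adds links, which is the interpretability problem noted above.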
2 Graphical-LASSO

To remedy the small-$n$ large-$p$ problem, the likelihood is regularized with an $L_1$-norm penalty. If we center the measurements $y_i$, then $\mu = 0$, so the log-likelihood can be written as

$$L(\Sigma) = \log \det \Sigma^{-1} - \frac{1}{n} \sum_{i=1}^{n} y_i^\top \Sigma^{-1} y_i = \log \det \Sigma^{-1} - \mathrm{tr}\left( \Sigma^{-1} S \right),$$

where $S = \frac{1}{n} \sum_{i=1}^{n} y_i y_i^\top$ is the sample covariance matrix. We used the fact that the trace of a scalar is the scalar itself and that $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ for matrices $A$ and $B$. We write the likelihood as a function of $\Sigma^{-1}$ simply to emphasize that we are trying to estimate the inverse covariance matrix. To avoid the small-$n$ large-$p$ problem, we penalize the log-likelihood with the $L_1$-norm penalty:

$$L(\Sigma) = \log \det \Sigma^{-1} - \mathrm{tr}\left( \Sigma^{-1} S \right) - \lambda \| \Sigma^{-1} \|_1, \tag{4}$$

where $\| \cdot \|_1$ is the sum of the absolute values of the elements. The penalized log-likelihood is maximized over the space of all symmetric positive definite matrices. (4) is a convex problem and it is usually solved using the graphical-lasso (GLASSO) algorithm [3, 2, 11, 18, 25]. The tuning parameter $\lambda > 0$ controls the sparsity of the off-diagonal elements of the inverse covariance matrix. As $\lambda$ increases, the estimated inverse covariance matrix becomes more sparse.

In summary, we maximize the sparse likelihood

$$L(\Sigma) = \log \det \Sigma^{-1} - \mathrm{tr}\left( \Sigma^{-1} S \right) - \lambda \| \Sigma^{-1} \|_1 \tag{5}$$

over the space of all symmetric positive definite matrices using GLASSO [3, 11, 18]. GLASSO is a fairly time-consuming algorithm [11, 18]. Solving GLASSO for 548 nodes, for instance, takes about 6 minutes on a desktop computer. If $\Sigma_i(\lambda)$ is the estimated sparse covariance for group $i$ at a given sparse parameter $\lambda$, we are usually interested in testing the equivalence of the covariance matrices between two groups at fixed $\lambda$, i.e.,

$$H_0 : \Sigma_1(\lambda) = \Sigma_2(\lambda).$$

2.1 Filtration in graphical-lasso

The solution to graphical-lasso has a peculiar topological structure. Let $\widehat{\Sigma}^{-1}(\lambda) = (\widehat{\sigma}^{ij}(\lambda))$ be the inverse covariance estimated by graphical-lasso. Let $A(\lambda) = (a_{ij})$ be the corresponding adjacency matrix given by

$$a_{ij}(\lambda) = \begin{cases} 1 & \text{if } \widehat{\sigma}^{ij} \neq 0; \\ 0 & \text{otherwise.} \end{cases} \tag{6}$$

The adjacency matrix $A$ induces a graph $G(\lambda)$ consisting of $\kappa(\lambda)$ partitioned subgraphs

$$G(\lambda) = \bigcup_{l=1}^{\kappa(\lambda)} G_l(\lambda) \quad \text{with} \quad G_l = \{ V_l(\lambda), A_l(\lambda) \},$$

where $V_l$ and $A_l$ are the node and edge sets of subgraph $G_l$. Let $S = (s_{ij})$ be the sample covariance matrix. Let $B(\lambda) = (b_{ij})$ be the adjacency matrix defined by

$$b_{ij}(\lambda) = \begin{cases} 1 & \text{if } |s_{ij}| > \lambda; \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$
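The adjacency matrix (7) needs only the sample covariance, so its vertex partition is cheap to compute. The sketch below thresholds the absolute entries of a hypothetical covariance matrix `S` and extracts the connected components by depth-first search. The matrix values and function names are assumptions for illustration, and the diagonal is ignored because self-loops do not affect the partition.

```python
def adjacency_from_covariance(S, lam):
    """b_ij = 1 if |s_ij| > lam, as in (7); the diagonal is set to 0."""
    p = len(S)
    return [
        [1 if i != j and abs(S[i][j]) > lam else 0 for j in range(p)]
        for i in range(p)
    ]


def connected_components(B):
    """Vertex partition induced by a symmetric binary adjacency matrix."""
    p = len(B)
    seen = [False] * p
    parts = []
    for start in range(p):
        if seen[start]:
            continue
        seen[start] = True
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in range(p):
                if B[u][v] and not seen[v]:
                    seen[v] = True
                    stack.append(v)
        parts.append(sorted(comp))
    return parts


# Hypothetical 4 x 4 sample covariance matrix.
S = [
    [1.0, 0.6, 0.1, 0.0],
    [0.6, 1.0, 0.05, 0.1],
    [0.1, 0.05, 1.0, -0.5],
    [0.0, 0.1, -0.5, 1.0],
]

for lam in (0.05, 0.3, 0.7):
    print(lam, connected_components(adjacency_from_covariance(S, lam)))
```

For this toy matrix the partition refines from one component to two pairs to four singletons as $\lambda$ grows.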
Figure 1: Left: Adjacency matrices obtained through graphical-lasso with increasing $\lambda$ values. The persistent homological structure is self-evident. Right: Adjacency matrices are clustered as a block diagonal matrix $D$ by permutation.

The adjacency matrix $B$ similarly induces a graph with $\tau(\lambda)$ disjoint subgraphs:

$$H(\lambda) = \bigcup_{l=1}^{\tau(\lambda)} H_l(\lambda) \quad \text{with} \quad H_l = \{ W_l(\lambda), B_l(\lambda) \},$$

where $W_l$ and $B_l$ are the node and edge sets of subgraph $H_l$. The partitioned graphs are then shown to be partially nested in the sense that the node sets exhibit persistency.
Theorem 1. For any $\lambda > 0$, the adjacency matrices (6) and (7) induce the identical vertex partition so that $\kappa(\lambda) = \tau(\lambda)$ and $V_l(\lambda) = W_l(\lambda)$. Further, the node sets $V_l$ and $W_l$ form filtrations over the sparse parameter:

$$V_l(\lambda_1) \supset V_l(\lambda_2) \supset V_l(\lambda_3) \supset \cdots \tag{8}$$

$$W_l(\lambda_1) \supset W_l(\lambda_2) \supset W_l(\lambda_3) \supset \cdots \tag{9}$$

for $\lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \cdots$.

From (7), it is trivial to see that the filtration holds for $W_l$. The filtration for $V_l$ is proved in [18]. The equivalence of the node sets $V_l = W_l$ is proved in [25]. Note that the edge sets may not form a filtration. The construction of the filtration on the node sets $V_l$ (8) is very time consuming since we have to solve a sequence of graphical-lasso problems. For instance, for 548 nodes and 547 different filtration values, the whole filtration takes more than 54 hours on a desktop [5].

In Figure 1, we randomly simulated the data matrix $X_{5 \times 10}$ from the standard normal distribution. The sample covariance matrix is then fed into graphical-lasso with different filtration values. To identify the structure better, we transformed the adjacency matrix $A$ by a permutation $P$ such that $D = P A P^{-1}$ is a block diagonal matrix. Theoretically only the partitioned node sets are expected to exhibit nestedness, but in this example the edge sets are nested as well.
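The permutation to block diagonal form can be made concrete. The sketch below orders nodes component by component, builds the permutation matrix $P$ from that ordering, and forms $D = P A P^{-1}$ using $P^{-1} = P^\top$. The adjacency matrix and helper names are hypothetical, and this is one possible construction, not necessarily the one used to produce Figure 1.

```python
def component_order(A):
    """Node ordering that lists connected components one after another."""
    p = len(A)
    seen, order = set(), []
    for start in range(p):
        if start in seen:
            continue
        seen.add(start)
        queue = [start]
        while queue:
            u = queue.pop(0)
            order.append(u)
            for v in range(p):
                if A[u][v] and v not in seen:
                    seen.add(v)
                    queue.append(v)
    return order


def permutation_matrix(order):
    """P with P[i][order[i]] = 1, so row i of P A is row order[i] of A."""
    p = len(order)
    return [[1 if order[i] == j else 0 for j in range(p)] for i in range(p)]


def matmul(X, Y):
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def transpose(X):
    return [list(row) for row in zip(*X)]


# Hypothetical adjacency matrix with two interleaved components {0, 2} and {1, 3}.
A = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]

order = component_order(A)
P = permutation_matrix(order)
D = matmul(matmul(P, A), transpose(P))  # P^{-1} = P^T for a permutation matrix
for row in D:
    print(row)
```

After the permutation, each connected component occupies one contiguous diagonal block of $D$.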
3 Sparse correlation network

The problem with graphical-lasso, or any similar $L_1$-norm optimization, is that it becomes computationally expensive as the number of nodes $p$ increases, so it is not really practical for large-scale brain networks. For large-scale brain networks, we simply recommend thresholding correlations. Here is the mathematical justification.

3.1 Correlations

Consider the measurement vector $x_j$ on node $j$. If we center and rescale the measurement $x_j$ such that

$$\| x_j \|^2 = x_j^\top x_j = 1,$$

the sample correlation between nodes $i$ and $j$ is given by $x_i^\top x_j$. Since the data is normalized, the sample covariance matrix reduces to the sample correlation matrix.

Consider the following linear regression between nodes $j$ and $k$ ($k \neq j$):

$$x_j = \gamma_{jk} x_k + \epsilon_j. \tag{10}$$

We are basically correlating data at node $j$ to data at node $k$. In this particular case, $\gamma_{jk}$ is the usual Pearson correlation. The least squares estimation (LSE) of $\gamma_{jk}$ is then given by

$$\widehat{\gamma}_{jk} = x_j^\top x_k, \tag{11}$$

which is the sample correlation. In other words, for centered and normalized data the estimated regression coefficient is exactly the sample correlation. It can be shown that (11) minimizes the sum of least squares over all nodes:

$$\sum_{j=1}^{p} \sum_{k \neq j} \| x_j - \gamma_{jk} x_k \|^2. \tag{12}$$

Note that we do not really care about correlating $x_j$ to itself since the correlation is then trivially $\gamma_{jj} = 1$.

3.2 Sparse correlations

Let $\Gamma = (\gamma_{jk})$ be the correlation matrix. The sparse penalized version of (12) is given by

$$F(\Gamma) = \frac{1}{2} \sum_{j=1}^{p} \sum_{k \neq j} \| x_j - \gamma_{jk} x_k \|^2 + \lambda \sum_{j=1}^{p} \sum_{k \neq j} | \gamma_{jk} |. \tag{13}$$

The sparse correlation is given by minimizing $F(\Gamma)$. When $\lambda = 0$, the sparse correlation is simply the sample correlation, i.e. $\widehat{\gamma}_{jk} = x_j^\top x_k$. As $\lambda$ increases, the estimated correlation matrix $\widehat{\Gamma}(\lambda)$ shrinks to zero and becomes more sparse. This is a separable compressed sensing or LASSO-type problem. However, there is no need to numerically optimize (13) using the coordinate descent learning or the active-set algorithm often used in compressed sensing [27, 11]. The minimization of (13) can be done analytically by the proposed soft-thresholding method, exploiting the topological structure of the problem. This sparse regression is not orthogonal, i.e. $x_i^\top x_j \neq \delta_{ij}$, the Kronecker delta, so the existing soft-thresholding method for LASSO [32] is not applicable.

Theorem 2. For $\lambda \geq 0$, the solution of the separable LASSO problem

$$\widehat{\gamma}_{jk}(\lambda) = \arg \min_{\gamma_{jk}} \frac{1}{2} \sum_{j=1}^{p} \sum_{k \neq j} \| x_j - \gamma_{jk} x_k \|^2 + \lambda \sum_{j=1}^{p} \sum_{k \neq j} | \gamma_{jk} |$$

is given by the soft-thresholding

$$\widehat{\gamma}_{jk}(\lambda) = \begin{cases} x_j^\top x_k - \lambda & \text{if } x_j^\top x_k > \lambda \\ 0 & \text{if } | x_j^\top x_k | \leq \lambda \\ x_j^\top x_k + \lambda & \text{if } x_j^\top x_k < -\lambda. \end{cases} \tag{14}$$
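The soft-thresholding rule (14) is short enough to write out directly. The following is a minimal sketch: `normalize` centers and rescales a vector to unit norm as assumed in Section 3.1, and the two measurement vectors are hypothetical toy data whose sample correlation is $-0.8$.

```python
import math


def normalize(x):
    """Center x and rescale it to unit Euclidean norm."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered]


def soft_threshold(r, lam):
    """Soft-thresholding of a sample correlation r at level lam >= 0."""
    if r > lam:
        return r - lam
    if r < -lam:
        return r + lam
    return 0.0


def sparse_correlation(xj, xk, lam):
    """Sparse correlation between two raw measurement vectors."""
    zj, zk = normalize(xj), normalize(xk)
    r = sum(a * b for a, b in zip(zj, zk))  # sample correlation x_j^T x_k
    return soft_threshold(r, lam)


# Hypothetical measurements at two nodes.
xj = [1, 2, 3, 4, 5]
xk = [5, 3, 4, 1, 2]

for lam in (0.0, 0.3, 0.9):
    print(lam, sparse_correlation(xj, xk, lam))
```

At $\lambda = 0$ the sample correlation is returned unchanged. Once $\lambda$ reaches or exceeds its magnitude, the estimate is exactly zero.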
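Proof. Write (13) as

$$F(\Gamma) = \frac{1}{2} \sum_{j=1}^{p} \sum_{k \neq j} f(\gamma_{jk}), \tag{15}$$

where

$$f(\gamma_{jk}) = \| x_j - \gamma_{jk} x_k \|^2 + 2 \lambda | \gamma_{jk} |.$$

Since each $f(\gamma_{jk})$ is nonnegative, convex and depends only on its own $\gamma_{jk}$, $F(\Gamma)$ is minimized when each component $f(\gamma_{jk})$ achieves its minimum. So we only need to minimize each component $f(\gamma_{jk})$. This differentiates our sparse correlation formulation from standard compressed sensing, which cannot be optimized in this component-wise fashion. $f(\gamma_{jk})$ can be rewritten as

$$f(\gamma_{jk}) = \| x_j \|^2 - 2 \gamma_{jk} x_j^\top x_k + \gamma_{jk}^2 \| x_k \|^2 + 2 \lambda | \gamma_{jk} | = (\gamma_{jk} - x_j^\top x_k)^2 + 2 \lambda | \gamma_{jk} | + 1 - (x_j^\top x_k)^2.$$

We used the fact that $x_j^\top x_j = x_k^\top x_k = 1$. The last two terms do not depend on $\gamma_{jk}$. For $\lambda = 0$, the minimum of $f(\gamma_{jk})$ is achieved when $\gamma_{jk} = x_j^\top x_k$, which is the usual LSE. For $\lambda > 0$, since $f(\gamma_{jk})$ is quadratic in $\gamma_{jk}$ on each side of $\gamma_{jk} = 0$, a nonzero minimum is achieved when

$$\frac{\partial f}{\partial \gamma_{jk}} = 2 \gamma_{jk} - 2 x_j^\top x_k \pm 2 \lambda = 0. \tag{16}$$

The sign of $\lambda$ depends on the sign of $\gamma_{jk}$. If $| x_j^\top x_k | \leq \lambda$, no nonzero $\gamma_{jk}$ satisfies (16) with a consistent sign, and the minimum is at $\gamma_{jk} = 0$. Thus, the sparse correlation $\widehat{\gamma}_{jk}$ is given by a soft-thresholding of $x_j^\top x_k$:

$$\widehat{\gamma}_{jk}(\lambda) = \begin{cases} x_j^\top x_k - \lambda & \text{if } x_j^\top x_k > \lambda \\ 0 & \text{if } | x_j^\top x_k | \leq \lambda \\ x_j^\top x_k + \lambda & \text{if } x_j^\top x_k < -\lambda. \end{cases} \tag{17}$$

The estimated sparse correlation (17) shrinks every sample correlation whose magnitude exceeds $\lambda$ toward zero by the amount $\lambda$ and sets the remaining ones to zero. Due to this simple expression, there is no need to optimize (13) numerically as is often done in compressed sensing or LASSO [27, 11]. However, Theorem 2 is only applicable to separable cases; for nonseparable cases, numerical optimization is still needed.

Different choices of the sparsity parameter $\lambda$ produce different solutions in the sparse model $A(\lambda)$. Instead of analyzing each model separately, we can analyze the whole collection of sparse solutions over many different values of $\lambda$. This avoids the problem of identifying an optimal sparse parameter that may not be optimal in practice. The question is then how to use the collection of $A(\lambda)$ in a coherent mathematical fashion. This can be addressed using persistent homology [9, 20, 21].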
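One simple summary of the whole collection of sparse solutions is the number of connected components as a function of $\lambda$. By (17), $\widehat{\gamma}_{jk}(\lambda) \neq 0$ exactly when $| x_j^\top x_k | > \lambda$, so the sketch below links nodes on that condition and counts components with a union-find structure. The correlation matrix `R`, the grid of $\lambda$ values, and the function name are hypothetical.

```python
def count_components(R, lam):
    """Number of connected components when j, k are linked iff |r_jk| > lam."""
    p = len(R)
    parent = list(range(p))

    def find(a):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for j in range(p):
        for k in range(j + 1, p):
            if abs(R[j][k]) > lam:
                parent[find(j)] = find(k)
    return len({find(a) for a in range(p)})


# Hypothetical matrix of sample correlations between 4 nodes.
R = [
    [1.0, 0.9, 0.4, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.4, 0.3, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]

lambdas = [0.0, 0.25, 0.5, 0.75, 0.95]
betti0 = [count_components(R, lam) for lam in lambdas]
print(list(zip(lambdas, betti0)))
```

The count never decreases as $\lambda$ grows, since raising the threshold can only remove links.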
3.3 Filtration in sparse correlations

Using the sparse solution (17), we can construct a filtration. We will basically build a graph $G$ using sparse correlations. Let $\widehat{\gamma}_{jk}(\lambda)$ be the sparse correlation estimate. Let $A(\lambda) = (a_{jk})$ be the adjacency matrix defined as

$$a_{jk}(\lambda) = \begin{cases} 1 & \text{if } \widehat{\gamma}_{jk}(\lambda) \neq 0; \\ 0 & \text{otherwise.} \end{cases}$$

This is equivalent to the adjacency matrix $B = (b_{jk})$ defined as

$$b_{jk}(\lambda) = \begin{cases} 1 & \text{if } | x_j^\top x_k | > \lambda; \\ 0 & \text{otherwise.} \end{cases} \tag{18}$$

The adjacency matrix $B$ is simply obtained by thresholding the sample correlations. The adjacency matrices $A$ and $B$ then induce an identical graph $G(\lambda)$ consisting of $\kappa(\lambda)$ partitioned subgraphs

$$G(\lambda) = \bigcup_{l=1}^{\kappa(\lambda)} G_l(\lambda) \quad \text{with} \quad G_l = \{ V_l(\lambda), E_l(\lambda) \},$$

where $V_l$ and $E_l$ are the node and edge sets respectively. Note that $G_l \cap G_m = \emptyset$ for any $l \neq m$, and no two nodes in different partitions are connected. The node and edge sets are denoted as $V(\lambda) = \bigcup_{l=1}^{\kappa} V_l$ and $E(\lambda) = \bigcup_{l=1}^{\kappa} E_l$ respectively.

Figure 2: Jacobian determinants of the deformation field are measured at 548 nodes along the white matter boundary [5]. The $\beta_0$-number (number of connected components) of the filtrations on the sample correlations and covariances shows huge group separation between normal controls and post-institutionalized (PI) children.

Then we have the following theorem:

Theorem 3. The induced graph from the sparse correlation forms a filtration:

$$G(\lambda_1) \supset G(\lambda_2) \supset G(\lambda_3) \supset \cdots \tag{19}$$

for $\lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \cdots$. Equivalently, the node and edge sets also form filtrations:

$$V(\lambda_1) \supset V(\lambda_2) \supset V(\lambda_3) \supset \cdots$$

$$E(\lambda_1) \supset E(\lambda_2) \supset E(\lambda_3) \supset \cdots.$$

The proof follows easily from the definition of the adjacency matrix (18).

4 Partial correlation network

Let $p$ be the number of nodes in the network. In most applications, the number of nodes is expected to be larger than the number of observations $n$, which gives an underdetermined system. Consider the measurement vector at the $j$-th node

$$x_j = (x_{1j}, \cdots, x_{nj})^\top$$

consisting of $n$ measurements. The vectors $x_j$ are assumed to be distributed with mean zero and covariance $\Sigma = (\sigma_{ij})$. The correlation $\gamma_{ij}$ between the two nodes $i$ and $j$ is given by

$$\gamma_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii} \sigma_{jj}}}.$$

By thresholding the correlation, we can establish a link between two nodes. However, this simplistic approach fails to explicitly factor out the confounding effect of other nodes. To remedy this problem, partial correlations can be used to factor out the dependency on other nodes [16, 24, 17, 18, 27].

If we denote the inverse covariance matrix as $\Sigma^{-1} = (\sigma^{ij})$, the partial correlation between nodes $i$ and $j$, while factoring out the effect of all other nodes, is given by

$$\rho_{ij} = - \frac{\sigma^{ij}}{\sqrt{\sigma^{ii} \sigma^{jj}}}. \tag{20}$$

Equivalently, we can compute the partial correlation via a linear model as follows. Consider a linear model relating the measurement at node $j$ to all other nodes:

$$x_j = \sum_{k \neq j} \beta_{jk} x_k + \epsilon_j. \tag{21}$$

The parameters $\beta_{jk}$ are estimated by minimizing the sum of squared residuals of (21)

$$L(\beta) = \sum_{j=1}^{p} \Big\| x_j - \sum_{k \neq j} \beta_{jk} x_k \Big\|^2 \tag{22}$$

in a least squares fashion. If we denote the least squares estimator by $\widehat{\beta}_{jk}$, the residuals are given by

$$r_j = x_j - \sum_{k \neq j} \widehat{\beta}_{jk} x_k. \tag{23}$$
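The difference between correlation and partial correlation can be seen on a small example where the covariance matrix is invertible. The sketch below inverts a hypothetical $3 \times 3$ covariance matrix exactly by Gauss-Jordan elimination and applies the standard formula $\rho_{ij} = -\sigma^{ij} / \sqrt{\sigma^{ii} \sigma^{jj}}$. The matrix (an autoregressive pattern $\sigma_{ij} = 0.5^{|i-j|}$) and the helper names are assumptions for illustration.

```python
import math
from fractions import Fraction


def invert(M):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(M)
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(M)
    ]
    for col in range(n):
        # Pick a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(sigma):
    """Partial correlations from the inverse covariance (diagonal set to 1)."""
    prec = invert(sigma)
    p = len(prec)
    rho = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                rho[i][j] = float(-prec[i][j]) / math.sqrt(
                    float(prec[i][i] * prec[j][j])
                )
    return rho


# Hypothetical covariance: sigma_ij = 0.5 ** |i - j| (a chain 1 - 2 - 3).
sigma = [
    [Fraction(1), Fraction(1, 2), Fraction(1, 4)],
    [Fraction(1, 2), Fraction(1), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(1, 2), Fraction(1)],
]

rho = partial_correlations(sigma)
for row in rho:
    print(row)
```

Nodes 1 and 3 have correlation 0.25, but their partial correlation is zero once node 2 is factored out, so thresholding partial correlations would not link them.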
The partial correlation is then obtained by computing the correlation between the residuals [16, 23, 27]:

$$\rho_{ij} = \mathrm{corr}(r_i, r_j).$$

4.1 Sparse partial correlations

There is a serious problem with the least squares estimation framework discussed in the previous section. Since $n \ll p$, this is a significantly underdetermined system. This is also related to the covariance matrix $\Sigma$ being singular, so we cannot simply invert the covariance matrix. For this, we need sparse network modeling.

The minimization of (22) is exactly given by solving the normal equation:

$$x_j = \sum_{k \neq j} \beta_{jk} x_k, \tag{24}$$

which can be turned into the standard linear form $y = A \beta$ [22]. Note that (24) can be written as

$$x_j = \underbrace{[x_1, \cdots, x_{j-1}, 0, x_{j+1}, \cdots, x_p]}_{X_{-j}} \underbrace{\begin{pmatrix} \beta_{j1} \\ \beta_{j2} \\ \vdots \\ \beta_{jp} \end{pmatrix}}_{\beta_j},$$

where $0_{n \times 1}$ is a column vector of all zero entries. Then we have

$$\underbrace{\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{pmatrix}}_{y_{np \times 1}} = \underbrace{\begin{pmatrix} X_{-1} & 0 & \cdots & 0 \\ 0 & X_{-2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_{-p} \end{pmatrix}}_{A_{np \times p^2}} \underbrace{\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}}_{\beta_{p^2 \times 1}}, \tag{25}$$

where $A$ is a block diagonal matrix and $0_{n \times p}$ is a matrix of all zero entries. We regularize (25) by incorporating the $\ell_1$ LASSO penalty $J$ [32, 27, 22]:

$$J = \sum_{i,j} | \beta_{ij} |.$$

The sparse estimation of $\beta_{ij}$ is then given by minimizing $L + \lambda J$. Since there is dependency between $y$ and $A$, (25) is not exactly a standard compressed sensing problem [27, 22]. Intuitively, sparsity makes the linear equation (24) less underdetermined. The larger the value of $\lambda$, the more sparse the underlying topological structure gets. Since

$$\rho_{ij} = \beta_{ij} \sqrt{\frac{\sigma^{ii}}{\sigma^{jj}}},$$

the sparsity of $\beta_{ij}$ directly corresponds to the sparsity of $\rho_{ij}$, which is the strength of the link between nodes $i$ and $j$ [27, 22]. Once the sparse partial correlation matrix $\rho$ is obtained, we can simply link nodes $i$ and $j$ if $\rho_{ij} > 0$ and assign the weight $\rho_{ij}$ to the edge. This way, we obtain the weighted graph.

4.2 Limitations

However, the sparse partial correlation framework has a serious computational bottleneck. For $n$ measurements over $p$ nodes, we must solve a linear system with an extremely large matrix $A$ of size $np \times p^2$, so that the complexity of the problem increases by a factor of $p^3$. Consequently, for a large number of nodes, the problem quickly becomes almost intractable on a small computer. For example, for 1 million nodes, we have to compute 1 trillion possible pairwise relationships between nodes. One practical solution is to modify (21) so that the measurement at node $i$ is represented more sparsely over some possible index set $S_i$:

$$x_i = \sum_{j \in S_i} \beta_{ij} x_j + \epsilon_i,$$

making the problem substantially smaller.

An alternate approach is to simply follow the homotopy path, which adds network links one by one with a very limited increase in computational complexity, so there is no need to compute $\beta$ repeatedly from scratch [8, 28, 26]. The trajectory of the optimal solution $\beta$ in LASSO follows a piecewise linear path as we change $\lambda$. By tracing the linear path, we can substantially reduce the computational burden of re-estimating $\beta$ when $\lambda$ changes.

References

[1] T.W. Anderson. An Introduction to Multivariate Statistical Analysis. Wiley, 2nd edition.
[2] O. Banerjee, L. El Ghaoui, and A. d'Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. The Journal of Machine Learning Research, 9.
[3] O. Banerjee, L.E. Ghaoui, A. d'Aspremont, and G. Natsoulis. Convex optimization techniques for fitting sparse Gaussian graphical models. In Proceedings of the 23rd International Conference on Machine Learning, page 96.
[4] J. Cao and K.J. Worsley. The detection of local shape changes via the geometry of Hotelling's T2 fields. Annals of Statistics, 27, 1999.
[5] M.K. Chung, J.L. Hanson, J. Ye, R.J. Davidson, and S.D. Pollak. Persistent homology in sparse regression and its application to brain morphometry. IEEE Transactions on Medical Imaging, 34.
[6] M.K. Chung, K.J. Worsley, M.N. Brendon, K.M. Dalton, and R.J. Davidson. General multivariate linear modeling of surface shapes using SurfStat. NeuroImage, 53.
[7] M.K. Chung, K.J. Worsley, T. Paus, D.L. Cherif, C. Collins, J. Giedd, J.L. Rapoport, and A.C. Evans. A unified statistical approach to deformation-based morphometry. NeuroImage, 14.
[8] D.L. Donoho and Y. Tsaig. Fast solution of $\ell_1$-norm minimization problems when the solution may be sparse. Citeseer.
[9] H. Edelsbrunner and J. Harer. Persistent homology - a survey. Contemporary Mathematics, 453.
[10] A. Fornito, A. Zalesky, and E.T. Bullmore. Network scaling effects in graph analytic studies of human resting-state fMRI data. Frontiers in Systems Neuroscience, 4:1-16, 2010.
[11] J. Friedman, T. Hastie, and R. Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9:432.
[12] K.J. Friston, A.P. Holmes, K.J. Worsley, J.-P. Poline, C.D. Frith, and R.S.J. Frackowiak. Statistical parametric maps in functional imaging: A general linear approach. Human Brain Mapping, 2.
[13] C. Gaser, H.-P. Volz, S. Kiebel, S. Riehemann, and H. Sauer. Detecting structural changes in whole brain based on nonlinear deformations - Application to schizophrenia research. NeuroImage, 10.
[14] G. Gong, Y. He, L. Concha, C. Lebel, D.W. Gross, A.C. Evans, and C. Beaulieu. Mapping anatomical connectivity patterns of human cerebral cortex using in vivo diffusion tensor imaging tractography. Cerebral Cortex, 19.
[15] P. Hagmann, M. Kurant, X. Gigandet, P. Thiran, V.J. Wedeen, R. Meuli, and J.P. Thiran. Mapping human whole-brain structural networks with diffusion MRI. PLoS One, 2(7):e597.
[16] Y. He, Z.J. Chen, and A.C. Evans. Small-world anatomical networks in the human brain revealed by cortical thickness from MRI. Cerebral Cortex, 17.
[17] S. Huang, J. Li, L. Sun, J. Liu, T. Wu, K. Chen, A. Fleisher, E. Reiman, and J. Ye. Learning brain connectivity of Alzheimer's disease from neuroimaging data. In Advances in Neural Information Processing Systems.
[18] S. Huang, J. Li, L. Sun, J. Ye, A. Fleisher, T. Wu, K. Chen, and E. Reiman. Learning brain connectivity of Alzheimer's disease by sparse inverse covariance estimation. NeuroImage, 50.
[19] S.C. Joshi. Large Deformation Diffeomorphisms and Gaussian Random Fields for Statistical Characterization of Brain Sub-Manifolds. PhD thesis, Washington University, St. Louis.
[20] H. Lee, M.K. Chung, H. Kang, B.-N. Kim, and D.S. Lee. Computing the shape of brain networks using graph filtration and Gromov-Hausdorff metric. MICCAI, Lecture Notes in Computer Science, 6892.
[21] H. Lee, H. Kang, M.K. Chung, B.-N. Kim, and D.S. Lee. Persistent brain network homology from the perspective of dendrogram. IEEE Transactions on Medical Imaging, 31, 2012.
[22] H. Lee, D.S. Lee, H. Kang, B.-N. Kim, and M.K. Chung. Sparse brain network recovery under compressed sensing. IEEE Transactions on Medical Imaging, 30.
[23] J.P. Lerch, K. Worsley, W.P. Shaw, D.K. Greenstein, R.K. Lenroot, J. Giedd, and A.C. Evans. Mapping anatomical correlations across cerebral cortex (MACACC) using cortical thickness from MRI. NeuroImage, 31.
[24] G. Marrelec, A. Krainik, H. Duffau, M. Pélégrini-Issac, S. Lehéricy, J. Doyon, and H. Benali. Partial correlation for functional brain interactivity investigation in functional MRI. NeuroImage, 32.
[25] R. Mazumder and T. Hastie. Exact covariance thresholding into connected components for large-scale graphical LASSO. The Journal of Machine Learning Research, 13.
[26] M.R. Osborne, B. Presnell, and B.A. Turlach. A new approach to variable selection in least squares problems. IMA Journal of Numerical Analysis, 20.
[27] J. Peng, P. Wang, N. Zhou, and J. Zhu. Partial correlation estimation by joint sparse regression models. Journal of the American Statistical Association, 104.
[28] M.D. Plumbley. Geometry and homotopy for $\ell_1$ sparse representations. Proceedings of SPARS, 5.
[29] J. Schäfer and K. Strimmer. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4:32.
[30] J.E. Taylor and K.J. Worsley. Random fields of multivariate test statistics, with applications to shape analysis. Annals of Statistics, 36:1-27.
[31] P.M. Thompson, D. MacDonald, M.S. Mega, C.J. Holmes, A.C. Evans, and A.W. Toga. Detection and mapping of abnormal brain structure with a probabilistic atlas of cortical surfaces. Journal of Computer Assisted Tomography, 21.
[32] R. Tibshirani. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society. Series B (Methodological), 58, 1996.
[33] K.J. Worsley, S. Marrett, P. Neelin, A.C. Vandal, K.J. Friston, and A.C. Evans. A unified statistical approach for determining significant signals in images of cerebral activation. Human Brain Mapping, 4:58-73.
[34] K.J. Worsley, J.E. Taylor, F. Tomaiuolo, and J. Lerch. Unified univariate and multivariate random field theory. NeuroImage, 23:S.
[35] A. Zalesky, A. Fornito, I.H. Harding, L. Cocchi, M. Yücel, C. Pantelis, and E.T. Bullmore. Whole-brain anatomical networks: Does the choice of nodes matter? NeuroImage, 50, 2010.
More informationLecture 2 Part 1 Optimization
Lecture 2 Part 1 Optimization (January 16, 2015) Mu Zhu University of Waterloo Need for Optimization E(y x), P(y x) want to go after them first, model some examples last week then, estimate didn t discuss
More informationLinear Regression (9/11/13)
STA561: Probabilistic machine learning Linear Regression (9/11/13) Lecturer: Barbara Engelhardt Scribes: Zachary Abzug, Mike Gloudemans, Zhuosheng Gu, Zhao Song 1 Why use linear regression? Figure 1: Scatter
More informationThe General Linear Model. Guillaume Flandin Wellcome Trust Centre for Neuroimaging University College London
The General Linear Model Guillaume Flandin Wellcome Trust Centre for Neuroimaging University College London SPM Course Lausanne, April 2012 Image time-series Spatial filter Design matrix Statistical Parametric
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2013 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Meghana Kshirsagar (mkshirsa), Yiwen Chen (yiwenche) 1 Graph
More informationSparse representation classification and positive L1 minimization
Sparse representation classification and positive L1 minimization Cencheng Shen Joint Work with Li Chen, Carey E. Priebe Applied Mathematics and Statistics Johns Hopkins University, August 5, 2014 Cencheng
More informationPopulation Based Analysis of Directional Information in Serial Deformation Tensor Morphometry
Population Based Analysis of Directional Information in Serial Deformation Tensor Morphometry Colin Studholme 1,2 and Valerie Cardenas 1,2 1 Department of Radiiology, University of California San Francisco,
More informationIntroduction to General Linear Models
Introduction to General Linear Models Moo K. Chung University of Wisconsin-Madison mkchung@wisc.edu September 27, 2014 In this chapter, we introduce general linear models (GLM) that have been widely used
More informationHigh-dimensional covariance estimation based on Gaussian graphical models
High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26,
More informationSparse regression. Optimization-Based Data Analysis. Carlos Fernandez-Granda
Sparse regression Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda 3/28/2016 Regression Least-squares regression Example: Global warming Logistic
More informationThe Nonparanormal skeptic
The Nonpara skeptic Han Liu Johns Hopkins University, 615 N. Wolfe Street, Baltimore, MD 21205 USA Fang Han Johns Hopkins University, 615 N. Wolfe Street, Baltimore, MD 21205 USA Ming Yuan Georgia Institute
More informationSparse Covariance Selection using Semidefinite Programming
Sparse Covariance Selection using Semidefinite Programming A. d Aspremont ORFE, Princeton University Joint work with O. Banerjee, L. El Ghaoui & G. Natsoulis, U.C. Berkeley & Iconix Pharmaceuticals Support
More informationMultivariate Normal Models
Case Study 3: fmri Prediction Graphical LASSO Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Emily Fox February 26 th, 2013 Emily Fox 2013 1 Multivariate Normal Models
More informationMultivariate Normal Models
Case Study 3: fmri Prediction Coping with Large Covariances: Latent Factor Models, Graphical Models, Graphical LASSO Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox February
More informationSparse Gaussian conditional random fields
Sparse Gaussian conditional random fields Matt Wytock, J. ico Kolter School of Computer Science Carnegie Mellon University Pittsburgh, PA 53 {mwytock, zkolter}@cs.cmu.edu Abstract We propose sparse Gaussian
More informationLASSO Review, Fused LASSO, Parallel LASSO Solvers
Case Study 3: fmri Prediction LASSO Review, Fused LASSO, Parallel LASSO Solvers Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade May 3, 2016 Sham Kakade 2016 1 Variable
More informationA Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression
A Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression Noah Simon Jerome Friedman Trevor Hastie November 5, 013 Abstract In this paper we purpose a blockwise descent
More informationA Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models
A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models Jingyi Jessica Li Department of Statistics University of California, Los
More informationProbabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms
Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms François Caron Department of Statistics, Oxford STATLEARN 2014, Paris April 7, 2014 Joint work with Adrien Todeschini,
More informationGraphical Model Selection
May 6, 2013 Trevor Hastie, Stanford Statistics 1 Graphical Model Selection Trevor Hastie Stanford University joint work with Jerome Friedman, Rob Tibshirani, Rahul Mazumder and Jason Lee May 6, 2013 Trevor
More informationVariables. Cho-Jui Hsieh The University of Texas at Austin. ICML workshop on Covariance Selection Beijing, China June 26, 2014
for a Million Variables Cho-Jui Hsieh The University of Texas at Austin ICML workshop on Covariance Selection Beijing, China June 26, 2014 Joint work with M. Sustik, I. Dhillon, P. Ravikumar, R. Poldrack,
More informationNeuroimage Processing
Neuroimage Processing Instructor: Moo K. Chung mkchung@wisc.edu Lecture 10-11. Deformation-based morphometry (DBM) Tensor-based morphometry (TBM) November 13, 2009 Image Registration Process of transforming
More informationCSC 576: Variants of Sparse Learning
CSC 576: Variants of Sparse Learning Ji Liu Department of Computer Science, University of Rochester October 27, 205 Introduction Our previous note basically suggests using l norm to enforce sparsity in
More informationRobust Principal Component Analysis
ELE 538B: Mathematics of High-Dimensional Data Robust Principal Component Analysis Yuxin Chen Princeton University, Fall 2018 Disentangling sparse and low-rank matrices Suppose we are given a matrix M
More informationRobust and sparse Gaussian graphical modelling under cell-wise contamination
Robust and sparse Gaussian graphical modelling under cell-wise contamination Shota Katayama 1, Hironori Fujisawa 2 and Mathias Drton 3 1 Tokyo Institute of Technology, Japan 2 The Institute of Statistical
More informationSometimes the domains X and Z will be the same, so this might be written:
II. MULTIVARIATE CALCULUS The first lecture covered functions where a single input goes in, and a single output comes out. Most economic applications aren t so simple. In most cases, a number of variables
More informationECE521 lecture 4: 19 January Optimization, MLE, regularization
ECE521 lecture 4: 19 January 2017 Optimization, MLE, regularization First four lectures Lectures 1 and 2: Intro to ML Probability review Types of loss functions and algorithms Lecture 3: KNN Convexity
More informationSparse and Locally Constant Gaussian Graphical Models
Sparse and Locally Constant Gaussian Graphical Models Jean Honorio, Luis Ortiz, Dimitris Samaras Department of Computer Science Stony Brook University Stony Brook, NY 794 {jhonorio,leortiz,samaras}@cs.sunysb.edu
More informationBiostatistics Advanced Methods in Biostatistics IV
Biostatistics 140.754 Advanced Methods in Biostatistics IV Jeffrey Leek Assistant Professor Department of Biostatistics jleek@jhsph.edu Lecture 12 1 / 36 Tip + Paper Tip: As a statistician the results
More informationThe lasso, persistence, and cross-validation
The lasso, persistence, and cross-validation Daniel J. McDonald Department of Statistics Indiana University http://www.stat.cmu.edu/ danielmc Joint work with: Darren Homrighausen Colorado State University
More informationProbabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Gaussian graphical models and Ising models: modeling networks Eric Xing Lecture 0, February 7, 04 Reading: See class website Eric Xing @ CMU, 005-04
More informationMorphometry. John Ashburner. Wellcome Trust Centre for Neuroimaging, 12 Queen Square, London, UK. Voxel-Based Morphometry
Morphometry John Ashburner Wellcome Trust Centre for Neuroimaging, 12 Queen Square, London, UK. Overview Voxel-Based Morphometry Morphometry in general Volumetrics VBM preprocessing followed by SPM Tissue
More informationMultivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation
Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Daniel B Rowe Division of Biostatistics Medical College of Wisconsin Technical Report 40 November 00 Division of Biostatistics
More informationHigh-dimensional Covariance Estimation Based On Gaussian Graphical Models
High-dimensional Covariance Estimation Based On Gaussian Graphical Models Shuheng Zhou, Philipp Rutimann, Min Xu and Peter Buhlmann February 3, 2012 Problem definition Want to estimate the covariance matrix
More informationCOMS 4721: Machine Learning for Data Science Lecture 6, 2/2/2017
COMS 4721: Machine Learning for Data Science Lecture 6, 2/2/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University UNDERDETERMINED LINEAR EQUATIONS We
More informationOWL to the rescue of LASSO
OWL to the rescue of LASSO IISc IBM day 2018 Joint Work R. Sankaran and Francis Bach AISTATS 17 Chiranjib Bhattacharyya Professor, Department of Computer Science and Automation Indian Institute of Science,
More informationA. Motivation To motivate the analysis of variance framework, we consider the following example.
9.07 ntroduction to Statistics for Brain and Cognitive Sciences Emery N. Brown Lecture 14: Analysis of Variance. Objectives Understand analysis of variance as a special case of the linear model. Understand
More informationAn Introduction to Graphical Lasso
An Introduction to Graphical Lasso Bo Chang Graphical Models Reading Group May 15, 2015 Bo Chang (UBC) Graphical Lasso May 15, 2015 1 / 16 Undirected Graphical Models An undirected graph, each vertex represents
More informationGeneralized Elastic Net Regression
Abstract Generalized Elastic Net Regression Geoffroy MOURET Jean-Jules BRAULT Vahid PARTOVINIA This work presents a variation of the elastic net penalization method. We propose applying a combined l 1
More informationLearning Markov Network Structure using Brownian Distance Covariance
arxiv:.v [stat.ml] Jun 0 Learning Markov Network Structure using Brownian Distance Covariance Ehsan Khoshgnauz May, 0 Abstract In this paper, we present a simple non-parametric method for learning the
More informationA direct formulation for sparse PCA using semidefinite programming
A direct formulation for sparse PCA using semidefinite programming A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon
More informationSparse Permutation Invariant Covariance Estimation: Motivation, Background and Key Results
Sparse Permutation Invariant Covariance Estimation: Motivation, Background and Key Results David Prince Biostat 572 dprince3@uw.edu April 19, 2012 David Prince (UW) SPICE April 19, 2012 1 / 11 Electronic
More informationCS598 Machine Learning in Computational Biology (Lecture 5: Matrix - part 2) Professor Jian Peng Teaching Assistant: Rongda Zhu
CS598 Machine Learning in Computational Biology (Lecture 5: Matrix - part 2) Professor Jian Peng Teaching Assistant: Rongda Zhu Feature engineering is hard 1. Extract informative features from domain knowledge
More informationLecture 25: November 27
10-725: Optimization Fall 2012 Lecture 25: November 27 Lecturer: Ryan Tibshirani Scribes: Matt Wytock, Supreeth Achar Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have
More informationProximity-Based Anomaly Detection using Sparse Structure Learning
Proximity-Based Anomaly Detection using Sparse Structure Learning Tsuyoshi Idé (IBM Tokyo Research Lab) Aurelie C. Lozano, Naoki Abe, and Yan Liu (IBM T. J. Watson Research Center) 2009/04/ SDM 2009 /
More informationHYBRID PERMUTATION TEST WITH APPLICATION TO SURFACE SHAPE ANALYSIS
Statistica Sinica 8(008), 553-568 HYBRID PERMUTATION TEST WITH APPLICATION TO SURFACE SHAPE ANALYSIS Chunxiao Zhou and Yongmei Michelle Wang University of Illinois at Urbana-Champaign Abstract: This paper
More informationLinear Regression. Aarti Singh. Machine Learning / Sept 27, 2010
Linear Regression Aarti Singh Machine Learning 10-701/15-781 Sept 27, 2010 Discrete to Continuous Labels Classification Sports Science News Anemic cell Healthy cell Regression X = Document Y = Topic X
More informationComputational Brain Anatomy
Computational Brain Anatomy John Ashburner Wellcome Trust Centre for Neuroimaging, 12 Queen Square, London, UK. Overview Voxel-Based Morphometry Morphometry in general Volumetrics VBM preprocessing followed
More informationTesting for group differences in brain functional connectivity
Testing for group differences in brain functional connectivity Junghi Kim, Wei Pan, for ADNI Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 Banff Feb
More informationLearning discrete graphical models via generalized inverse covariance matrices
Learning discrete graphical models via generalized inverse covariance matrices Duzhe Wang, Yiming Lv, Yongjoon Kim, Young Lee Department of Statistics University of Wisconsin-Madison {dwang282, lv23, ykim676,
More informationTopology identification via growing a Chow-Liu tree network
2018 IEEE Conference on Decision and Control (CDC) Miami Beach, FL, USA, Dec. 17-19, 2018 Topology identification via growing a Chow-Liu tree network Sepideh Hassan-Moghaddam and Mihailo R. Jovanović Abstract
More informationNonparametric regression for topology. applied to brain imaging data
, applied to brain imaging data Cleveland State University October 15, 2010 Motivation from Brain Imaging MRI Data Topology Statistics Application MRI Data Topology Statistics Application Cortical surface
More informationThe picasso Package for Nonconvex Regularized M-estimation in High Dimensions in R
The picasso Package for Nonconvex Regularized M-estimation in High Dimensions in R Xingguo Li Tuo Zhao Tong Zhang Han Liu Abstract We describe an R package named picasso, which implements a unified framework
More informationA Short Introduction to the Lasso Methodology
A Short Introduction to the Lasso Methodology Michael Gutmann sites.google.com/site/michaelgutmann University of Helsinki Aalto University Helsinki Institute for Information Technology March 9, 2016 Michael
More informationMultivariate Statistical Analysis of Deformation Momenta Relating Anatomical Shape to Neuropsychological Measures
Multivariate Statistical Analysis of Deformation Momenta Relating Anatomical Shape to Neuropsychological Measures Nikhil Singh, Tom Fletcher, Sam Preston, Linh Ha, J. Stephen Marron, Michael Wiener, and
More informationBi-level feature selection with applications to genetic association
Bi-level feature selection with applications to genetic association studies October 15, 2008 Motivation In many applications, biological features possess a grouping structure Categorical variables may
More informationHigh Dimensional Inverse Covariate Matrix Estimation via Linear Programming
High Dimensional Inverse Covariate Matrix Estimation via Linear Programming Ming Yuan October 24, 2011 Gaussian Graphical Model X = (X 1,..., X p ) indep. N(µ, Σ) Inverse covariance matrix Σ 1 = Ω = (ω
More informationOptimization Problems
Optimization Problems The goal in an optimization problem is to find the point at which the minimum (or maximum) of a real, scalar function f occurs and, usually, to find the value of the function at that
More informationThe lasso: some novel algorithms and applications
1 The lasso: some novel algorithms and applications Newton Institute, June 25, 2008 Robert Tibshirani Stanford University Collaborations with Trevor Hastie, Jerome Friedman, Holger Hoefling, Gen Nowak,
More informationHuman Brain Networks. Aivoaakkoset BECS-C3001"
Human Brain Networks Aivoaakkoset BECS-C3001" Enrico Glerean (MSc), Brain & Mind Lab, BECS, Aalto University" www.glerean.com @eglerean becs.aalto.fi/bml enrico.glerean@aalto.fi" Why?" 1. WHY BRAIN NETWORKS?"
More informationAn Homotopy Algorithm for the Lasso with Online Observations
An Homotopy Algorithm for the Lasso with Online Observations Pierre J. Garrigues Department of EECS Redwood Center for Theoretical Neuroscience University of California Berkeley, CA 94720 garrigue@eecs.berkeley.edu
More informationDetecting fmri activation allowing for unknown latency of the hemodynamic response
Detecting fmri activation allowing for unknown latency of the hemodynamic response K.J. Worsley McGill University J.E. Taylor Stanford University January 7, 006 Abstract Several authors have suggested
More informationOn Algorithms for Solving Least Squares Problems under an L 1 Penalty or an L 1 Constraint
On Algorithms for Solving Least Squares Problems under an L 1 Penalty or an L 1 Constraint B.A. Turlach School of Mathematics and Statistics (M19) The University of Western Australia 35 Stirling Highway,
More informationhttps://goo.gl/kfxweg KYOTO UNIVERSITY Statistical Machine Learning Theory Sparsity Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT OF INTELLIGENCE SCIENCE AND TECHNOLOGY 1 KYOTO UNIVERSITY Topics:
More informationChapter 3. Linear Models for Regression
Chapter 3. Linear Models for Regression Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 Email: weip@biostat.umn.edu PubH 7475/8475 c Wei Pan Linear
More informationTractable Upper Bounds on the Restricted Isometry Constant
Tractable Upper Bounds on the Restricted Isometry Constant Alex d Aspremont, Francis Bach, Laurent El Ghaoui Princeton University, École Normale Supérieure, U.C. Berkeley. Support from NSF, DHS and Google.
More informationStructure estimation for Gaussian graphical models
Faculty of Science Structure estimation for Gaussian graphical models Steffen Lauritzen, University of Copenhagen Department of Mathematical Sciences Minikurs TUM 2016 Lecture 3 Slide 1/48 Overview of
More informationRobust Inverse Covariance Estimation under Noisy Measurements
.. Robust Inverse Covariance Estimation under Noisy Measurements Jun-Kun Wang, Shou-De Lin Intel-NTU, National Taiwan University ICML 2014 1 / 30 . Table of contents Introduction.1 Introduction.2 Related
More informationPathwise coordinate optimization
Stanford University 1 Pathwise coordinate optimization Jerome Friedman, Trevor Hastie, Holger Hoefling, Robert Tibshirani Stanford University Acknowledgements: Thanks to Stephen Boyd, Michael Saunders,
More informationFunctional Connectivity and Network Methods
18/Sep/2013" Functional Connectivity and Network Methods with functional magnetic resonance imaging" Enrico Glerean (MSc), Brain & Mind Lab, BECS, Aalto University" www.glerean.com @eglerean becs.aalto.fi/bml
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationMSA220/MVE440 Statistical Learning for Big Data
MSA220/MVE440 Statistical Learning for Big Data Lecture 7/8 - High-dimensional modeling part 1 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Classification
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More information27: Case study with popular GM III. 1 Introduction: Gene association mapping for complex diseases 1
10-708: Probabilistic Graphical Models, Spring 2015 27: Case study with popular GM III Lecturer: Eric P. Xing Scribes: Hyun Ah Song & Elizabeth Silver 1 Introduction: Gene association mapping for complex
More information