Learning graphs from data:
Learning graphs from data: A signal processing perspective
Xiaowen Dong, MIT Media Lab
Graph Signal Processing Workshop, Pittsburgh, PA, May 2017
Introduction: What is the problem of graph learning?
- Given observations on a number of variables (a data matrix X with M samples of N variables) and some prior knowledge (distribution, model, etc.)
- Build/learn a measure of relations between the variables (correlation/covariance, or a graph topology/operator or equivalent)
[Figure: an M × N data matrix X, and a graph carrying a signal x : V → R^N, with edge weights ranging from negative to positive]
Introduction: Why is it important?
- Learning relations between entities benefits numerous application domains
- The learned relations can help us predict future observations
[Examples: (1) objective: functional connectivity between brain regions; input: fMRI recordings in these regions. (2) objective: behavioral similarity/influence between people; input: individual history of activities]
How do we build/learn the graph?
Outline
- A (partial) historical overview
- A signal processing perspective: the GSP idea for graph learning; three signal/graph models
- Perspective
A (partial) historical overview
Simple and intuitive methods
- Sample correlation
- Similarity function (e.g., Gaussian RBF)
Learning graphical models
- Undirected graphical models: Markov random fields (MRF)
- Directed graphical models: Bayesian networks (BN)
- Factor graphs
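The Gaussian RBF similarity function mentioned above can be sketched in a few lines. This is a minimal illustration; the function name, σ, and the pruning threshold are our own choices, not from the talk:

```python
import numpy as np

def rbf_similarity_graph(X, sigma=1.0, threshold=1e-3):
    """Build a weighted graph from data with a Gaussian RBF kernel.

    X : (N, d) array, one row of observations per variable/node.
    W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)); near-zero weights pruned.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)   # no self-loops
    W[W < threshold] = 0.0     # prune near-zero edges
    return W

# Two similar nodes and one distant node
X = np.array([[0.0, 0.1], [0.0, 0.2], [5.0, 5.0]])
W = rbf_similarity_graph(X, sigma=1.0)
```

Note that this method always yields non-negative weights, which foreshadows the non-negativity discussion later in the talk.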
A (partial) historical overview: learning pairwise MRFs
- Conditional independence: (i, j) ∉ E  ⇔  x_i ⊥ x_j | x \ {x_i, x_j}
[Figure: a five-node graph x_1, ..., x_5]
- Probability parameterized by θ:
  P(x | θ) = (1 / Z(θ)) exp( Σ_{i ∈ V} θ_ii x_i² + Σ_{(i,j) ∈ E} θ_ij x_i x_j )
- Gaussian graphical models with precision matrix Θ:
  P(x | Θ) = ( |Θ|^{1/2} / (2π)^{N/2} ) exp( −½ x^T Θ x )
- Learning a sparse Θ:
  - interactions are mostly local
  - computationally more tractable
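The conditional-independence statement can be checked numerically for a Gaussian graphical model: a zero in the precision matrix encodes a vanishing partial correlation, even though the covariance (its inverse) is dense. A small sketch; the chain graph and its values are illustrative:

```python
import numpy as np

# Chain graph x1 - x2 - x3: the precision matrix has a zero for the
# missing edge (1, 3), encoding x1 ⊥ x3 | x2.
Theta = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])
Sigma = np.linalg.inv(Theta)

# The covariance is dense (x1 and x3 are marginally dependent), but the
# partial correlation of x1 and x3 given x2 is exactly zero:
partial_corr_13 = -Theta[0, 2] / np.sqrt(Theta[0, 0] * Theta[2, 2])
```

This is why sparsity is imposed on the precision matrix rather than the covariance: zeros in Θ directly encode missing edges.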
A (partial) historical overview: covariance selection (Dempster, 1972)
- Prune the smallest elements in the precision (inverse covariance) matrix
- For X ~ N(0, Σ): from the data matrix, compute the inverse of the sample covariance, S⁻¹, and compare with the ground-truth precision
- Not applicable when the sample covariance is not invertible!
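Dempster's covariance selection amounts to inverting the sample covariance and zeroing its small entries, which requires an invertible S (many more samples than variables). A sketch on synthetic Gaussian data; the toy precision matrix and the pruning threshold 0.3 are our own choices:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 2000
Theta_true = 2.0 * np.eye(N)
Theta_true[0, 1] = Theta_true[1, 0] = -0.8   # a single edge between nodes 0 and 1
Sigma_true = np.linalg.inv(Theta_true)
X = rng.multivariate_normal(np.zeros(N), Sigma_true, size=M)  # (M, N) data matrix

S = np.cov(X, rowvar=False)        # sample covariance; invertible since M >> N
Theta_hat = np.linalg.inv(S)

# Dempster-style pruning: zero out small off-diagonal entries
P = Theta_hat.copy()
P[np.abs(P) < 0.3] = 0.0
```

With M on the order of N or smaller, S becomes singular, which is exactly the limitation the slide points out.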
A (partial) historical overview: ℓ1-regularized neighborhood regression (Meinshausen & Bühlmann)
- Learning a graph = learning the neighborhood of each node
- LASSO regression of node 1 on all the others:
  min_β  ‖X_1 − X_{\1} β‖₂² + λ‖β‖₁
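The Meinshausen–Bühlmann idea can be sketched with scikit-learn's `Lasso`: regress each node on all the others and read its neighborhood off the nonzero coefficients. The data, dependence structure, and `alpha` below are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
M, N = 2000, 4
X = rng.standard_normal((M, N))
X[:, 0] = 0.9 * X[:, 1] + 0.1 * rng.standard_normal(M)  # node 0 connects only to node 1

# l1-penalized regression of node 0 on the remaining nodes; nonzero
# coefficients estimate the neighborhood of node 0.
lasso = Lasso(alpha=0.1).fit(X[:, 1:], X[:, 0])
neighborhood = (np.flatnonzero(np.abs(lasso.coef_) > 1e-6) + 1).tolist()
```

In the full method this regression is repeated for every node, and the per-node neighborhoods are combined (by AND or OR rules) into one graph.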
A (partial) historical overview: ℓ1-regularized log-determinant (Banerjee; Friedman)
- Estimation of a sparse precision matrix
- The graphical LASSO maximizes the penalized likelihood of the precision matrix Θ, where
  P(X | Θ) ∝ |Θ|^{M/2} exp( −½ Σ_{m=1}^{M} X(m)^T Θ X(m) ),
  giving the ℓ1-penalized log-likelihood problem
  max_Θ  log det Θ − tr(SΘ) − λ‖Θ‖₁
- Later: a quadratic approximation of the Gaussian negative log-likelihood (Hsieh)
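The penalized objective above is what scikit-learn's `GraphicalLasso` solves; a minimal sketch on synthetic data (the toy precision matrix and `alpha` are our own choices):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(2)
N, M = 4, 5000
Theta_true = 2.0 * np.eye(N)
Theta_true[0, 1] = Theta_true[1, 0] = -0.9   # one true edge
X = rng.multivariate_normal(np.zeros(N), np.linalg.inv(Theta_true), size=M)

# Solves max_Theta  log det(Theta) - tr(S Theta) - alpha * ||Theta||_1
model = GraphicalLasso(alpha=0.05).fit(X)
Theta_hat = model.precision_
```

Unlike covariance selection, this works even when the sample covariance is singular, since the ℓ1 penalty regularizes the problem.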
A (partial) historical overview: ℓ1-regularized logistic regression (Ravikumar)
- Neighborhood learning for discrete (e.g., binary) variables
- Regularized logistic regression per node:
  max  Σ_m log P(X_{1m} | X_{\1,m}) − λ‖θ‖₁,
  with the conditional probability given by the logistic function
A (partial) historical overview
Learning graphical models
- Classical learning approaches lead to both positive and negative relations
- What about learning a graph topology with non-negative weights?
Learning topologies with non-negative weights
- M-matrices (symmetric, positive definite, non-positive off-diagonal) have been used as precision matrices, leading to attractive GMRFs (Slawski and Hein, 2015)
- The combinatorial graph Laplacian L = Deg − W is an M-matrix and is equivalent to the graph topology
From an arbitrary precision matrix to a graph Laplacian!
A (partial) historical overview: from precision matrix to Laplacian
- The graph Laplacian L can serve as the precision matrix, BUT it is singular
- Lake: ℓ1-regularized log-determinant on a generalized Laplacian:
  max_Θ  log det Θ − tr(SΘ) − λ‖Θ‖₁   s.t.  Θ = L + (1/σ²) I
[Figure: precision matrix learned by the graphical LASSO vs. Laplacian learned by Lake et al.; cf. Slawski and Hein (2015)]
A (partial) historical overview: quadratic-form approaches
- Daitch (related to locally linear embedding [Roweis00]): minimize a quadratic form of a power of L, ‖LX‖²_F = tr(X^T L² X)
- Lake: ℓ1-regularized log-determinant on a generalized Laplacian
- Hu: minimize the quadratic form tr(X^T L^s X) of a power of L
(cf. Slawski and Hein, 2015)
A (partial) historical overview: the GSP branch
[Figure: timeline of the classical methods above, now joined by graph-learning work from the signal processing perspective: Dong, Egilmez, Thanou, Mei, Pasdeloup, Kalofolias, Segarra, Baingana, Chepuri]
A signal processing perspective
Existing approaches have limitations
- Simple correlation or similarity functions are not enough
- Most classical methods for learning graphical models do not directly lead to topologies with non-negative weights
- There is no strong emphasis on signal/graph interaction with a spectral/frequency-domain interpretation
Opportunity and challenge for graph signal processing
- GSP tools such as frequency analysis and filtering can contribute to the graph learning problem
- Filtering-based approaches can provide generative models for signals with complex non-Gaussian behavior
A signal processing perspective
- Signal processing is about a dictionary model: x = D c (signal x, dictionary D, coefficients c)
- Graph signal processing is about x = D(G) c, where the dictionary depends on the graph G
- Forward problem: given G and x, design D to study c
  - Fourier/wavelet atoms → graph Fourier/wavelet coefficients [Coifman06, Narang09, Hammond11, Shuman13, Sandryhaila13]
  - trained dictionary atoms → graph dictionary coefficients [Zhang12, Thanou14]
- Backward problem (graph learning): given x, design D and c to infer G
  - The key is the signal/graph model behind D
  - D is designed around graph operators (adjacency/Laplacian matrices, shift operators)
  - The choice of/assumption on c often determines the signal characteristics
Model 1: Global smoothness
- Signal values vary smoothly between all pairs of nodes that are connected
- Example: temperature at different locations in a flat geographical region
- Usually quantified by the Laplacian quadratic form:
  x^T L x = ½ Σ_{i,j} W_ij (x(i) − x(j))²
[Figure: two signals on the same graph, one with x^T L x = 1 (smooth) and one with x^T L x = 21 (non-smooth)]
- Similar to previous approaches:
  - Lake (2010): max_{Θ = L + (1/σ²)I}  log det Θ − (1/M) tr(XX^T Θ) − λ‖Θ‖₁
  - Daitch (2009): min_L ‖LX‖²_F = min_L tr(X^T L² X)
  - Hu (2013): min_L tr(X^T L^s X)
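The two forms of the smoothness measure — the quadratic form x^T L x and the weighted sum of squared edge differences — can be checked against each other on a small path graph. A sketch; the graph and the signals are illustrative:

```python
import numpy as np

# Path graph 0 - 1 - 2 - 3
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W   # combinatorial Laplacian L = Deg - W

def smoothness(x):
    """Laplacian quadratic form x^T L x."""
    return float(x @ L @ x)

def pairwise(x):
    """Equivalent form: (1/2) * sum_ij W_ij (x(i) - x(j))^2."""
    return 0.5 * float(np.sum(W * (x[:, None] - x[None, :]) ** 2))

x_smooth = np.array([1.0, 1.1, 1.2, 1.3])   # varies slowly along edges
x_rough = np.array([1.0, -1.0, 1.0, -1.0])  # alternates along edges
```

Smooth signals score low and oscillating signals score high, which is what the learning objectives below exploit.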
Model 1: Global smoothness — Dong et al. (2015) & Kalofolias (2016)
- D(G) = χ, the eigenvector matrix of L
- Gaussian assumption on c: c ~ N(0, Σ_c) for a prior covariance tied to the spectrum of L
- Maximum a posteriori (MAP) estimation of c leads to minimization of the Laplacian quadratic form:
  min_c  ‖x − χc‖₂² − λ log P_c(c)
  ⇒ min_{L,Y}  ‖X − Y‖²_F + α tr(Y^T L Y) + β‖L‖²_F
  (data fidelity + smoothness on Y + regularization)
[Figure: a noisy observed signal x and its smooth estimate y on the graph]
- Learning enforces the signal property (global smoothness)!
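With L fixed, the Y-step of the alternating minimization above has a closed form, Y = (I + αL)^{-1} X, i.e., a graph-smoothing filter. A sketch of that single step (the full method also alternates an update of L, omitted here; the graph, signal, and α are illustrative):

```python
import numpy as np

def denoise_step(X, L, alpha=1.0):
    """Closed-form Y-step: minimize ||X - Y||_F^2 + alpha * tr(Y^T L Y)."""
    N = L.shape[0]
    return np.linalg.solve(np.eye(N) + alpha * L, X)

# Path graph 0 - 1 - 2
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
L = np.diag(W.sum(axis=1)) - W
X = np.array([[1.0], [5.0], [1.0]])   # noisy spike at the middle node
Y = denoise_step(X, L, alpha=2.0)
```

The spike at the middle node is pulled towards its neighbors, so Y is smoother on the graph than X.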
Model 1: Global smoothness — Egilmez et al. (2016)
- min_Θ  tr(ΘK) − log det Θ   s.t.  K = S + α(I − 11^T)
  (the ℓ1 prior on the edge weights is absorbed into K)
- Solve for Θ as three different graph Laplacian matrices:
  - generalized Laplacian (non-positive off-diagonal entries): Θ = L + V = Deg − W + V
  - diagonally dominant generalized Laplacian: Θ = L + V = Deg − W + V (V ≥ 0)
  - combinatorial Laplacian: Θ = L = Deg − W
- Generalizes the graphical LASSO and Lake
- Adding priors on the edge weights leads to an interpretation as MAP estimation
Model 1: Global smoothness — Chepuri et al. (2016)
- An edge selection mechanism based on the same smoothness measure
[Figure: edges are selected and added one by one on the graph]
- Similar in spirit to Dempster
- Good for learning unweighted graphs
- An explicit edge-handling mechanism is desirable in some applications
Model 2: Diffusion process
- Signals are the outcome of diffusion processes on the graph (local smoothness rather than global!)
- Example: movement of people/vehicles in a transportation network
- Characterized by diffusion operators:
  - heat diffusion
  - a general graph shift operator (e.g., the adjacency matrix A)
[Figure: an initial signal concentrated on a few nodes diffuses over the graph into the observed signal]
Model 2: Diffusion process — Pasdeloup et al. (2015, 2016)
- D(G) = T_{k(m)} = W_norm^{k(m)}, a power of the normalized adjacency matrix
- {c_m} are i.i.d. samples with independent entries
- Two-step approach:
  - Estimate the eigenvector matrix from the sample covariance (if the covariance is unknown):
    Σ = E[ Σ_{m=1}^{M} X(m) X(m)^T ] = Σ_{m=1}^{M} W_norm^{2k(m)}   (a polynomial of W_norm)
  - Optimize for the eigenvalues given the constraints on W_norm (mainly non-negativity of the off-diagonal entries and the eigenvalue range) and some priors (e.g., sparsity)
- More a graph-centric learning framework: the cost function is on graph components instead of signals
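The key observation — that the covariance of diffused white signals is a polynomial of W_norm and therefore shares its eigenvectors — can be verified on a toy graph. This sketches only the first step of the method; the graph and normalization below are our own choices:

```python
import numpy as np

# Path graph 0 - 1 - 2 with symmetric normalization D^{-1/2} W D^{-1/2}
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
d = W.sum(axis=1)
W_norm = W / np.outer(np.sqrt(d), np.sqrt(d))

# For x = W_norm^k c with white c, E[x x^T] = W_norm^{2k}: a polynomial of
# W_norm, hence simultaneously diagonalizable with it.
Sigma = np.linalg.matrix_power(W_norm, 2)   # one diffusion step, k = 1
commutator = Sigma @ W_norm - W_norm @ Sigma
```

A vanishing commutator means the covariance and the graph operator share an eigenbasis; recovering the eigenvalues (whose signs the squaring destroys) is what the second optimization step is for.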
Model 2: Diffusion process — Segarra et al. (2016)
- D(G) = H(S_G) = Σ_{l=0}^{L−1} h_l S_G^l, a diffusion filter defined by a graph shift operator S_G that can be arbitrary, but is practically W or L
- c is a white signal
- Two-step approach:
  - Estimate the eigenvector matrix: Σ = H H^T shares eigenvectors with S_G
  - Select eigenvalues λ that satisfy the constraints on S_G:
    min_{S_G, λ}  ‖S_G‖₀   s.t.  S_G = Σ_{n=1}^{N} λ_n v_n v_n^T
    (the v_n are the spectral templates, i.e., the eigenvectors)
- Similar in spirit to Pasdeloup: the same stationarity assumption, but a different inference framework due to a different D
- Can handle noisy or incomplete information on the spectral templates
Model 2: Diffusion process — Thanou et al. (2016)
- D(G) = e^{−τL} (localization in the vertex domain)
- Sparsity assumption on c
- Each signal is a combination of several heat diffusion processes at times τ_s:
  min_{L,C}  ‖X − D(L)C‖²_F + λ Σ_{m=1}^{M} ‖c_m‖₁ + β‖L‖²_F   s.t.  D = [e^{−τ₁L}, ..., e^{−τ_S L}]
  (data fidelity + sparsity on c + regularization)
- Still a diffusion-based model, but more signal-centric
- No assumption on eigenvectors/stationarity, but on signal structure and sparsity
- Can be extended to the general polynomial case (Maretic et al. 2017)
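The heat-kernel dictionary D = [e^{−τ₁L}, ..., e^{−τ_S L}] can be built directly with `scipy.linalg.expm`; a sketch with two arbitrary diffusion times (the graph and τ values are our own):

```python
import numpy as np
from scipy.linalg import expm

# Path graph 0 - 1 - 2
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
L = np.diag(W.sum(axis=1)) - W

taus = [0.5, 2.0]
D = np.hstack([expm(-tau * L) for tau in taus])   # dictionary, shape (3, 6)

# A sparse code c with a single nonzero entry models unit heat placed at
# node 0 and diffused for tau = 0.5:
c = np.zeros(6)
c[0] = 1.0
x = D @ c
```

Each atom is a heat bump localized around one node, which is what makes the sparse-coding interpretation natural.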
Model 3: Time-varying observations
- Signals are time-varying observations that are the causal outcome of current or past values (mixed degree of smoothness depending on previous states)
- Example: evolution of individual behavior due to the influence of different friends at different timestamps
- Characterized by an autoregressive model or a structural equation model (SEM)
Model 3: Time-varying observations — Mei and Moura (2015)
- x[t] = Σ_{s=1}^{S} P_s(W) x[t−s]: D_s(G) = P_s(W) is a polynomial of W of degree s, and x[t−s] plays the role of c_s
- Estimation:
  min_{W,a}  ½ Σ_{k=S+1}^{K} ‖x[k] − Σ_{s=1}^{S} P_s(W) x[k−s]‖₂² + λ₁‖vec(W)‖₁ + λ₂‖a‖₁
  (data fidelity + sparsity on W + sparsity on the polynomial coefficients a)
- Polynomial design similar in spirit to Pasdeloup and Segarra
- Good for inferring causal relations between signals
- Kernelized (nonlinear) version: Shen et al. (2016)
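One step of the causal model x[t] = Σ_s P_s(W) x[t−s] can be written out directly; the graph, lags, and polynomial coefficients below are illustrative, and fitting them (the optimization above) is omitted:

```python
import numpy as np

# Path graph 0 - 1 - 2
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])

def poly_filter(W, coeffs):
    """P(W) = c0 * I + c1 * W + c2 * W^2 + ..."""
    P = np.zeros_like(W)
    for k, c in enumerate(coeffs):
        P += c * np.linalg.matrix_power(W, k)
    return P

P1 = poly_filter(W, [0.2, 0.3])   # acts on x[t-1]
P2 = poly_filter(W, [0.1])        # acts on x[t-2]

x_lag1 = np.array([1., 0., 0.])   # x[t-1]
x_lag2 = np.array([0., 1., 0.])   # x[t-2]
x_t = P1 @ x_lag1 + P2 @ x_lag2   # predicted x[t]
```

Because P_s(W) only mixes values along graph edges, a sparse W constrains which past node values can causally influence which present ones.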
Model 3: Time-varying observations — Baingana and Giannakis (2016)
- D(G) = W_{s(t)}, the graph at time t (topologies switch at each time between S discrete states), plus an external input term; x plays the role of c:
  x[t] = W_{s(t)} x[t] + B_{s(t)} y[t]   (internal influence from neighbors + external influence)
- Solve for all states of W:
  min_{W_{s(t)}, B_{s(t)}}  ½ Σ_{t=1}^{T} ‖x[t] − W_{s(t)} x[t] − B_{s(t)} y[t]‖²_F + Σ_{s=1}^{S} λ_s ‖W_{s(t)}‖₁
  (data fidelity + sparsity on W)
- Good for inferring causal relations between signals as well as dynamic topologies
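For a fixed topology, the structural equation x[t] = W x[t] + B y[t] can be solved for the equilibrium signal, x = (I − W)^{-1} B y. A sketch; W, B, and y below are illustrative values, not from the talk:

```python
import numpy as np

# Path graph 0 - 1 - 2 with internal influence weights 0.4
W = np.array([[0. , 0.4, 0. ],
              [0.4, 0. , 0.4],
              [0. , 0.4, 0. ]])
B = 0.5 * np.eye(3)          # external influence strengths
y = np.array([1., 0., 0.])   # external input hits node 0 only

# x = W x + B y  =>  (I - W) x = B y
x = np.linalg.solve(np.eye(3) - W, B @ y)
```

The external input at node 0 propagates through the network, so its effect decays with distance from the source.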
Comparison of different methods

| Method            | Signal model                | Assumption                        | Learning output                 | Edge direction | Inference      |
|-------------------|-----------------------------|-----------------------------------|---------------------------------|----------------|----------------|
| Dong (2015)       | Global smoothness           | Gaussian                          | Laplacian                       | Undirected     | Signal-centric |
| Kalofolias (2016) | Global smoothness           | Gaussian                          | Adjacency                       | Undirected     | Signal-centric |
| Egilmez (2016)    | Global smoothness           | Gaussian                          | Generalized Laplacian           | Undirected     | Signal-centric |
| Chepuri (2016)    | Global smoothness           | Gaussian                          | Adjacency                       | Undirected     | Graph-centric  |
| Pasdeloup (2015)  | Diffusion by adjacency      | Stationary                        | Normalized adjacency/Laplacian  | Undirected     | Graph-centric  |
| Segarra (2016)    | Diffusion by shift operator | Stationary                        | Graph shift operator            | Undirected     | Graph-centric  |
| Thanou (2016)     | Heat diffusion              | Sparsity                          | Laplacian                       | Undirected     | Signal-centric |
| Mei (2015)        | Time-varying                | Dependent on previous states      | Adjacency                       | Directed       | Signal-centric |
| Baingana (2016)   | Time-varying                | Dependent on current int/ext info | Time-varying adjacency          | Directed       | Signal-centric |
101 Perspective: GSP for graph learning
(figure: observed signals and two candidate graph topologies)

Learning input
- missing observations
- partial observations, e.g., by sampling

Signal/graph model
- beyond smoothness: localization in the vertex-frequency domain, bandlimited signals (Sardellitti 2017)

Learning output
- directed graphs (Shen 2017)
- time-varying graphs (Kalofolias 2017)
- multi-layer graphs
- subgraphs or ego-networks
- intermediate graph representations

Theoretical considerations
- performance guarantees (Rabbat 2017)
- computational efficiency

Learning objective
- for what SP applications? e.g., classification (Yankelevsky 2016), coding and compression (Rotondo 2015, Fracastoro 2016)
- for traditional graph-based learning, e.g., clustering, dimensionality reduction, ranking

29/34
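The "bandlimited" signal model mentioned above is easy to state concretely: a signal is (approximately) bandlimited if its energy is concentrated in the eigenvectors of the Laplacian with the smallest eigenvalues, i.e., the low end of the graph Fourier spectrum. A sketch of checking this (helper names are illustrative):

```python
import numpy as np

def graph_fourier_basis(L):
    """Eigendecomposition of a symmetric Laplacian; the eigenvectors,
    ordered by increasing eigenvalue, act as the graph Fourier basis."""
    eigvals, U = np.linalg.eigh(L)
    return eigvals, U

def low_frequency_energy(L, x, k):
    """Fraction of the signal energy carried by the k lowest graph
    frequencies; close to 1 means x is (approximately) k-bandlimited."""
    _, U = graph_fourier_basis(L)
    x_hat = U.T @ x                 # graph Fourier transform of x
    return float(np.sum(x_hat[:k] ** 2) / np.sum(x_hat ** 2))
```

A signal synthesized from the two lowest eigenvectors of a path-graph Laplacian has essentially all of its energy in the first two graph frequencies, while a random signal spreads its energy across the whole spectrum.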
107 Graph learning at GSPW 2017: Thursday June 1st and Friday June 2nd 30/34
108 References

- B. Baingana and G. B. Giannakis. Tracking switching network topologies from propagating graph signals. In Graph Signal Processing Workshop.
- B. Baingana and G. B. Giannakis. Tracking switched dynamic network topologies from information cascades. IEEE Transactions on Signal Processing, 65(4).
- O. Banerjee, L. El Ghaoui, and A. d'Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. The Journal of Machine Learning Research, 9.
- S. P. Chepuri, S. Liu, G. Leus, and A. O. Hero III. Learning sparse graphs under smoothness prior. arXiv preprint.
- D. I Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine, 30(3):83-98.
- S. I. Daitch, J. A. Kelner, and D. A. Spielman. Fitting a graph to vector data. In Proceedings of the International Conference on Machine Learning.
- A. P. Dempster. Covariance selection. Biometrics, 28(1).
- X. Dong, D. Thanou, P. Frossard, and P. Vandergheynst. Laplacian matrix learning for smooth graph signal representation. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.
- X. Dong, D. Thanou, P. Frossard, and P. Vandergheynst. Learning Laplacian matrix in smooth graph signal representations. IEEE Transactions on Signal Processing, 64(23).

31/34
109 References

- H. E. Egilmez, E. Pavez, and A. Ortega. Graph learning from data under structural and Laplacian constraints. arXiv preprint.
- G. Fracastoro, D. Thanou, and P. Frossard. Graph transform learning for image compression. In Proceedings of the Picture Coding Symposium.
- J. Friedman, T. Hastie, and R. Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3).
- D. K. Hammond, P. Vandergheynst, and R. Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2).
- C.-J. Hsieh, I. S. Dhillon, P. K. Ravikumar, and M. A. Sustik. Sparse inverse covariance matrix estimation using quadratic approximation. In Advances in Neural Information Processing Systems 24.
- C. Hu, L. Cheng, J. Sepulcre, G. E. Fakhri, Y. M. Lu, and Q. Li. A graph theoretical regression model for brain connectivity learning of Alzheimer's disease. In Proceedings of the IEEE International Symposium on Biomedical Imaging.
- V. Kalofolias. How to learn a graph from smooth signals. In Proceedings of the International Conference on Artificial Intelligence and Statistics.
- V. Kalofolias, A. Loukas, D. Thanou, and P. Frossard. Learning time varying graphs. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.
- B. Lake and J. Tenenbaum. Discovering structure by learning sparse graphs. In Proceedings of the Annual Cognitive Science Conference.

32/34
110 References

- H. P. Maretic, D. Thanou, and P. Frossard. Graph learning under sparsity priors. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.
- J. Mei and J. M. F. Moura. Signal processing on graphs: Estimating the structure of a graph. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.
- N. Meinshausen and P. Bühlmann. High-dimensional graphs and variable selection with the lasso. Annals of Statistics, 34(3).
- S. K. Narang and A. Ortega. Lifting based wavelet transforms on graphs. In Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference.
- B. Pasdeloup, V. Gripon, G. Mercier, D. Pastor, and M. G. Rabbat. Characterization and inference of weighted graph topologies from observations of diffused signals. arXiv preprint.
- B. Pasdeloup, M. Rabbat, V. Gripon, D. Pastor, and G. Mercier. Graph reconstruction from the observation of diffused signals. In Proceedings of the Annual Allerton Conference.
- P. Ravikumar, M. J. Wainwright, and J. Lafferty. High-dimensional Ising model selection using l1-regularized logistic regression. Annals of Statistics, 38(3).
- I. Rotondo, G. Cheung, A. Ortega, and H. E. Egilmez. Designing sparse graphs via structure tensor for block transform coding of images. In Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference.
- A. Sandryhaila and J. M. F. Moura. Discrete signal processing on graphs. IEEE Transactions on Signal Processing, 61(7).

33/34
111 References

- S. Segarra, A. G. Marques, G. Mateos, and A. Ribeiro. Network topology inference from spectral templates. arXiv preprint.
- Y. Shen, B. Baingana, and G. B. Giannakis. Nonlinear structural vector autoregressive models for inferring effective brain network connectivity. arXiv preprint.
- Y. Shen, B. Baingana, and G. B. Giannakis. Kernel-based structural equation models for topology identification of directed networks. IEEE Transactions on Signal Processing, 65(10).
- M. Slawski and M. Hein. Estimation of positive definite M-matrices and structure learning for attractive Gaussian Markov random fields. Linear Algebra and its Applications, 473.
- D. Thanou, D. I Shuman, and P. Frossard. Learning parametric dictionaries for signals on graphs. IEEE Transactions on Signal Processing, 62(15).
- D. Thanou, X. Dong, D. Kressner, and P. Frossard. Learning heat diffusion graphs. arXiv preprint.
- Y. Yankelevsky and M. Elad. Dual graph regularized dictionary learning. IEEE Transactions on Signal and Information Processing over Networks, 2(4).
- X. Zhang, X. Dong, and P. Frossard. Learning of structured graph dictionaries. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.

34/34
Blind Identification of Invertible Graph Filters with Multiple Sparse Inputs Chang Ye Dept. of ECE and Goergen Institute for Data Science University of Rochester cye7@ur.rochester.edu http://www.ece.rochester.edu/~cye7/
More informationExploring Granger Causality for Time series via Wald Test on Estimated Models with Guaranteed Stability
Exploring Granger Causality for Time series via Wald Test on Estimated Models with Guaranteed Stability Nuntanut Raksasri Jitkomut Songsiri Department of Electrical Engineering, Faculty of Engineering,
More informationSemi-Supervised Graph Classifier Learning with Negative Edge Weights
Gene Cheung National Institute of Informatics 15 th May, 217 Semi-Supervised Graph Classifier Learning with Negative Edge Weights. Torino Visit 5/15/217 1 Acknowledgement Collaborators: Y. Ji, M. Kaneko
More informationGraphs, Geometry and Semi-supervised Learning
Graphs, Geometry and Semi-supervised Learning Mikhail Belkin The Ohio State University, Dept of Computer Science and Engineering and Dept of Statistics Collaborators: Partha Niyogi, Vikas Sindhwani In
More informationNEAR-OPTIMALITY OF GREEDY SET SELECTION IN THE SAMPLING OF GRAPH SIGNALS. Luiz F. O. Chamon and Alejandro Ribeiro
NEAR-OPTIMALITY OF GREEDY SET SELECTION IN THE SAMPLING OF GRAPH SIGNALS Luiz F. O. Chamon and Alejandro Ribeiro Department of Electrical and Systems Engineering University of Pennsylvania e-mail: luizf@seas.upenn.edu,
More information