Graph Signal Processing: Fundamentals and Applications to Diffusion Processes

1 Graph Signal Processing: Fundamentals and Applications to Diffusion Processes. Antonio G. Marques (King Juan Carlos University), Alejandro Ribeiro (University of Pennsylvania), Santiago Segarra (Massachusetts Institute of Technology). Thanks: Gonzalo Mateos, Geert Leus, and Weiyu Huang. ICASSP'17, New Orleans, USA, March 5-9, 2017.

3 Network Science analytics: online social media, the Internet, clean energy and grid analytics. Desiderata: process, analyze and learn from network data [Kolaczyk 09]. Network as graph G = (V, E, W): encodes pairwise relationships. Interest here is not in G itself, but in data associated with the nodes in V. The object of study is a graph signal x. Networks are common and interesting. So are graph signals?

4 Network of economic sectors of the United States. Source: Bureau of Economic Analysis of the U.S. Department of Commerce. E = output of sector i is an input to sector j (62 sectors in V). [Figure: network with clusters for Oil and Gas, Services, and Finance.] Legend: Oil extraction (OG), Petroleum and coal products (PC), Construction (CO), Administrative services (AS), Professional services (MP), Credit intermediation (FR), Securities (SC), Real estate (RA), Insurance (IC). Only interactions stronger than a threshold are shown.

5 Network of economic sectors of the United States. A few sectors have widespread strong influence (services, finance, energy); some sectors have strong indirect influences (oil); the heavy last row is final consumption. This is an interesting network, and signals on this graph are as well.

7 Disaggregated GDP of the United States. Signal x = output per sector = disaggregated GDP. The network structure can be used to, e.g., reduce GDP estimation noise. The signal is as interesting as the network itself; arguably more. The same is true for brain connectivity and fMRI brain signals, gene regulatory networks and gene expression levels, online social networks and information cascades, alignment of customer preferences and product ratings, ...

8 Graph signal processing. Graph SP: broaden classical SP to graph signals [Shuman et al. 13]. Our view: GSP is well suited to study network (diffusion) processes. Assumption: signal properties are related to the topology of G (locality, smoothness). Goal: algorithms that fruitfully leverage this relational structure. Q: Why do we expect the graph structure to be useful in processing x?

9 Importance of signal structure in time. Signal and Information Processing is about exploiting signal structure. Discrete time is described by a cyclic graph: time n follows time n-1, and the signal value x_n is similar to x_{n-1}. Formalized with the notion of frequency. Cyclic structure => Fourier transform: x̃ = F^H x, with F_{kn} = (1/√N) e^{j(2π/N)kn}. Fourier transform => projection on the eigenvector space of the cycle.

11 Covariances and principal components. Random signal with mean E[x] = 0 and covariance C_x = E[xx^H]; eigenvector decomposition C_x = V Λ V^H. The covariance matrix C_x is a graph. Not a very good graph, but still. The precision matrix C_x^{-1} is a common graph too: conditional dependencies of a Gaussian x. Covariance matrix structure => principal components (PCA): x̃ = V^H x. PCA transform => projection on the eigenvector space of the (inverse) covariance. GSP: extend these principles to general graphs and corresponding (graph) signals.

12 Outline. Motivation and preliminaries. Part I: Fundamentals: Graphs 101; graph signals and the shift operator; Graph Fourier Transform (GFT); graph filters and network processes. Part II: Applications: sampling graph signals; stationarity of graph processes; network topology inference. Concluding remarks.

13 Graphs 101. Formally, a graph G (or a network) is a triplet (V, E, W). V = {1, 2, ..., N} is a finite set of N nodes or vertices. E ⊆ V × V is a set of edges defined as ordered pairs (n, m). Write N(n) = {m ∈ V : (m, n) ∈ E} for the in-neighbors of n. W : E → R is a map from the set of edges to scalar values w_nm, representing the level of relationship from n to m. Often the weights are strictly positive, W : E → R_{++}. Unweighted graphs: w_nm ∈ {0, 1} for all (n, m) ∈ E. Undirected graphs: (n, m) ∈ E if and only if (m, n) ∈ E, and w_nm = w_mn for all (n, m) ∈ E.

14 Time, images, and covariances. Time: unweighted and directed graphs. V = {0, 1, ..., 23}, E = {(0, 1), (1, 2), ..., (22, 23), (23, 0)}, W : (n, m) ↦ 1 for all (n, m) ∈ E. Images: unweighted and undirected graphs. V = {1, 2, 3, ..., 9}, E = {(1, 2), (2, 3), ..., (8, 9), (1, 4), ..., (6, 9)}, W : (n, m) ↦ 1 for all (n, m) ∈ E. Covariances: weighted and undirected graphs. V = {1, 2, 3, 4}, E = {(1, 1), (1, 2), ..., (4, 4)} = V × V, W : (n, m) ↦ σ_nm = σ_mn for all (n, m).

15 Adjacency matrix. Algebraic graph theory: matrices associated with a graph G, chiefly the adjacency A and Laplacian L matrices. Spectral graph theory: properties of G from the spectrum of A or L. Given G = (V, E, W), the adjacency matrix A ∈ R^{N×N} is A_nm = w_nm if (n, m) ∈ E, and A_nm = 0 otherwise. A matrix representation incorporating all information about G. For unweighted graphs, positive entries represent connected pairs; for weighted graphs, they also denote proximities between pairs.

16 Degree and k-hop neighbors. If G is unweighted and undirected, the degree of node i is |N(i)|. In directed graphs, nodes have an out-degree and an in-degree. Using the adjacency matrix in the undirected case: for node i, deg(i) = Σ_{j∈N(i)} A_ij = Σ_j A_ij; for all N nodes, d = A1. Degree matrix: D := diag(d). The k-th power of A describes the k-hop neighborhoods of each node: [A^k]_ij is non-zero only if there exists a path of length k from i to j. Support of A^k: pairs that can be reached in k hops.

17 Laplacian of a graph. Given an undirected G with A and D, the Laplacian matrix L ∈ R^{N×N} is L = D − A. Equivalently, L can be defined element-wise as L_ij = deg(i) if i = j; L_ij = −w_ij if (i, j) ∈ E; and L_ij = 0 otherwise. Normalized Laplacian: L = D^{−1/2} L D^{−1/2} (we will focus on L).
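
A minimal numerical sketch of these definitions, on a toy 4-node weighted undirected graph (the graph and weights are made up for illustration, not taken from the slides; numpy assumed):

```python
# Toy illustration: degree matrix, Laplacian, and normalized Laplacian.
import numpy as np

A = np.array([[0., 1., 0., 2.],
              [1., 0., 3., 0.],
              [0., 3., 0., 1.],
              [2., 0., 1., 0.]])   # symmetric adjacency, w_nm = w_mn

d = A.sum(axis=1)                  # degree vector d = A 1
D = np.diag(d)                     # degree matrix D = diag(d)
L = D - A                          # combinatorial Laplacian L = D - A

# normalized Laplacian D^{-1/2} L D^{-1/2}
L_norm = np.diag(d**-0.5) @ L @ np.diag(d**-0.5)

assert np.allclose(L @ np.ones(4), 0.)   # rows of L sum to zero
```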

18 Spectral properties of the Laplacian. Denote by λ_i and v_i the eigenvalues and eigenvectors of L. L is positive semi-definite: x^T L x = (1/2) Σ_{(i,j)∈E} w_ij (x_i − x_j)² ≥ 0 for all x. All eigenvalues are nonnegative, i.e., λ_i ≥ 0 for all i. The constant vector 1 is an eigenvector of L with eigenvalue 0: [L1]_i = Σ_{j∈N(i)} w_ij (1 − 1) = 0. Thus, λ_1 = 0 and v_1 = (1/√N) 1. In connected graphs, λ_i > 0 for i = 2, ..., N. Multiplicity of {λ = 0} = number of connected components.
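
These properties are easy to verify numerically; continuing the toy graph above (a sketch, not slide code):

```python
# Eigenvalues of L are nonnegative; lambda_1 = 0 with a constant eigenvector,
# and the multiplicity of 0 equals the number of connected components (1 here).
lam, V = np.linalg.eigh(L)            # eigh applies since L is symmetric
assert np.all(lam >= -1e-9)           # positive semi-definiteness
assert np.isclose(lam[0], 0.)         # lambda_1 = 0
assert np.allclose(np.abs(V[:, 0]), 1/np.sqrt(4))   # v_1 = (1/sqrt(N)) 1, up to sign
```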

19 Graph signals and the shift operator

20 Graph signals. Consider a graph G = (V, E, W). Graph signals are mappings x : V → R, defined on the vertices of the graph (data tied to nodes). Ex: opinion profile, buffer congestion levels, neural activity, epidemic. May be represented as a vector x ∈ R^N, where x_n denotes the signal value at the n-th vertex in V. This implies an ordering of the vertices (the same as in A or L). Data associated with the links of G? Use the line graph of G.

21 Graph signals: genetic profiles. Graphs represent gene-gene interactions: each node denotes a single gene (loosely speaking), and two genes are connected if their coded proteins participate in the same metabolism. The genetic profile of each patient can be considered a graph signal: the signal on each node is 1 if the gene is mutated and 0 otherwise. [Figure: profiles x_1 and x_2 of two patients with subtype 1.] To understand a graph signal, the structure of G must be considered.

22 Graph-shift operator. To understand and analyze x, it is useful to account for the structure of G. Associated with G is the graph-shift operator S ∈ R^{N×N}: S_ij = 0 for i ≠ j and (i, j) ∉ E (captures the local structure of G). S can take nonzero values on the edges of G or on its diagonal. Ex: adjacency A, degree D, and Laplacian L = D − A matrices.

23 Relevance of the graph-shift operator. Q: Why is S called a shift? A: Resemblance to time shifts. S will be the building block for GSP algorithms (more soon). The same is true in the time domain (filters and delay).

24 Local structure of the graph-shift operator. S represents a linear transformation that can be computed locally at the nodes of the graph. More rigorously, if y = Sx, then node i can compute y_i provided it has access to x_j at its neighbors j ∈ N(i). Straightforward because [S]_ij ≠ 0 only if i = j or (j, i) ∈ E. What if y = S²x? As with powers of A, neighborhoods grow: y_i is found using values within 2 hops.

25 Graph Fourier Transform (GFT)

26 Discrete Fourier Transform (DFT). Let x be a temporal signal; its DFT is x̃ = F^H x, with F_kn = (1/√N) e^{+j(2π/N)kn}. An equivalent description that provides insights. Oftentimes more parsimonious (bandlimited). Facilitates the design of SP algorithms, e.g., filters. Many other transformations (orthogonal dictionaries) exist. Q: What transformation is suitable for graph signals?

27 Graph Fourier Transform (GFT). A useful transformation when S is involved in the generation/description of x. Let S = V Λ V^{−1} be the shift associated with G. The Graph Fourier Transform (GFT) of x is defined as x̃ = V^{−1} x, while the inverse GFT (iGFT) of x̃ is defined as x = V x̃. The eigenvectors V = [v_1, ..., v_N] are the frequency basis (atoms). Additional structure: if S is normal, then V^{−1} = V^H and x̃_k = v_k^H x = ⟨v_k, x⟩; Parseval holds, ‖x‖² = ‖x̃‖². GFT => projection on the eigenvector space of the shift operator S.
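
A minimal sketch of the GFT and iGFT, using the toy Laplacian above as the shift (so S is normal and V^{−1} = V^T):

```python
# GFT / iGFT with S = L from the toy example.
lam, V = np.linalg.eigh(L)            # S = V Lambda V^{-1}; V is orthonormal here
x = np.random.randn(4)                # an arbitrary graph signal
x_tilde = V.T @ x                     # GFT: x_tilde = V^{-1} x = V^H x (normal S)
x_back = V @ x_tilde                  # iGFT: x = V x_tilde
assert np.allclose(x, x_back)
# Parseval for a normal shift: ||x|| = ||x_tilde||
assert np.isclose(np.linalg.norm(x), np.linalg.norm(x_tilde))
```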

28 Is this a reasonable transform? Particularized to cyclic graphs: GFT => Fourier transform. Particularized to covariance matrices: GFT => PCA transform. But really, this is an empirical question. The GFT of disaggregated GDP is characterized by a few coefficients. Notion of bandlimitedness: x = Σ_{k=1}^K x̃_k v_k. Enables sampling, compression, filtering, pattern recognition.

29 Meaning of the eigenvalues. The columns of V are the frequency atoms: x = Σ_k x̃_k v_k. Q: What about the eigenvalues λ_k = Λ_kk? When S = A_dc (directed cycle), we get λ_k = e^{−j(2π/N)(k−1)}, so the λ_k can be viewed as frequencies! In time there is a well-defined relation between frequency and variation: higher k => faster oscillations, with bounds on the total variation TV(x) = Σ_n (x_n − x_{n−1})². Q: Does this carry over to graph signals? No, only if S = L. The {λ_k}_{k=1}^N will be very important when analyzing graph filters.

30 Interpretation for the Laplacian. Consider a graph G, let x be a signal on G, and set S = L. Define the quadratic form x^T L x = (1/2) Σ_{(i,j)∈E} w_ij (x_i − x_j)². x^T L x quantifies the (aggregated) local variation of the signal x: a natural measure of signal smoothness w.r.t. G. Q: Interpretation of the frequencies {λ_k}_{k=1}^N when S = L? If x = v_k, we get x^T L x = λ_k => the local variation of v_k. Frequencies account for local variation, so they can be ordered; the eigenvector associated with eigenvalue 0 is constant.
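
The two expressions for the local variation can be checked numerically; a sketch on the toy graph above:

```python
# x^T L x equals (1/2) sum_{(i,j)} w_ij (x_i - x_j)^2; each undirected edge
# appears twice in the double sum, hence the 1/2 factor.
def local_variation(A, x):
    return 0.5 * np.sum(A * (x[:, None] - x[None, :])**2)

x = np.random.randn(4)
assert np.isclose(x @ L @ x, local_variation(A, x))
```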

31 Frequencies of the Laplacian. The Laplacian eigenvalue λ_k accounts for the local variation of v_k. Let us plot some of the eigenvectors of L (they are graph signals too). Ex: gene network, N = 10; eigenvectors for k = 1, k = 2, k = 9. Ex: smooth natural images, N = 2^16; eigenvectors for k = 2, ..., 6.

32 Application: cancer subtype classification. Patients diagnosed with the same disease exhibit different behaviors. Each patient has a genetic profile describing gene mutations. It would be beneficial to infer phenotypes from genotypes: targeted treatments, more suitable suggestions, etc. Traditional approaches consider different genes to be independent. Not ideal, as different genes may affect the same metabolism. Alternative: consider the genetic network, so that genetic profiles become graph signals on it. We will see how this consideration improves subtype classification.

33 Genetic network. Undirected and unweighted gene-to-gene interaction graph. The 2458 nodes are genes in human DNA related to breast cancer. An edge between two genes represents an interaction: their coded proteins participate in the same metabolic process. [Figure: adjacency matrix of the gene-interaction network.]

34 Genetic profiles. Genetic profiles of 240 women with breast cancer: 44 with the serous subtype and 196 with the endometrioid subtype. Patient i has an associated profile x_i ∈ {0, 1}^2458. Mutations are very varied across patients; some patients present a lot of mutations, and some genes are consistently mutated across patients. Q: Can we use genetic profiles to classify patients across subtypes?

35 Improving k-nearest-neighbor classification. Distance between genetic profiles: d(i, j) = ‖x_i − x_j‖_2. N-fold cross-validation error from k-NN classification for several values of k (error percentages shown in the original slide). Q: Can we do any better using graph signal processing? Each genetic profile x_i is a graph signal on the genetic network. Look at the frequency components x̃_i using the GFT (S = L). [Figure: an example signal x_i per gene, and its frequency representation x̃_i per frequency.]

36 Discriminative power. Define the discriminative power of frequency v_k as DP(v_k) = | Σ_{i:y_i=1} x̃_i(k) / Σ_i 1{y_i = 1} − Σ_{i:y_i=2} x̃_i(k) / Σ_i 1{y_i = 2} | / Σ_i |x̃_i(k)|: the normalized difference between the mean GFT coefficients for v_k among patients with serous and endometrioid subtypes. The discriminative power is not equal across frequencies. [Figure: distinguishing power per frequency.] The discriminative power defined here is one of many proper heuristics.

37 Retaining the most discriminative frequencies. Keep the information in the frequencies with higher discriminative power. Filter, i.e., multiply x̃_i by diag(h̃_p), where [h̃_p]_k = 1 if DP(v_k) ≥ p-th percentile of DP, and 0 otherwise. Perform the iGFT to get the graph signal x̂_i, and classify with k-NN. [Figure: error percentage for the original signal, one frequency, p = 60, p = 75, p = 90; for k = 3, 5, 7.]
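
A sketch of this masking step (function and variable names are ours, not from the tutorial); dp holds precomputed DP values, one per frequency:

```python
# Keep only GFT coefficients whose discriminative power reaches the p-th
# percentile, then map back to the vertex domain with the iGFT.
def mask_frequencies(x, V, dp, p):
    x_tilde = V.T @ x                       # GFT (S = L, so V^{-1} = V^T)
    keep = (dp >= np.percentile(dp, p))     # diag(h_p) as a 0/1 mask
    return V @ (keep * x_tilde)             # iGFT of the masked coefficients
```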

38 Graph filters and network processes

39 Linear (shift-invariant) graph filter. A graph filter H : R^N → R^N is a map between graph signals. Focus on linear filters => the map is represented by an N × N matrix. Def. 1: polynomial in S of degree L, with coefficients h = [h_0, ..., h_L]^T: H := h_0 S^0 + h_1 S^1 + ... + h_L S^L = Σ_{l=0}^L h_l S^l [Sandryhaila13]. Def. 2: orthogonal operator in the frequency domain: H := V diag(h̃) V^{−1}, with h̃_k = g(λ_k). With [Ψ]_{k,l} := λ_k^{l−1}, we have h̃ = Ψh. The two defs can be rendered equivalent. More on this later; for now, focus on Def. 1.

40 Graph filters as linear network operators. Def. 1 says H = Σ_{l=0}^L h_l S^l. Suppose H acts on a graph signal x to generate y = Hx. If we define x^(l) := S^l x = S x^(l−1), then y = Σ_{l=0}^L h_l x^(l): y is a linear combination of successive shifted versions of x. After introducing S, we stressed that y = Sx can be computed locally; x^(l) can be found locally if x^(l−1) is known, so the output of the filter can be found in L local steps. A graph filter represents a linear transformation that accounts for the local structure of the graph, can be implemented distributedly in L steps, and only requires info in an L-hop neighborhood [Shuman13, Sandryhaila14].

41 An example of a graph filter. x = [1, 2, 0, 0, 0, 0]^T, h = [1, 1, 0.5]^T, y = (Σ_{l=0}^L h_l S^l) x = Σ_{l=0}^L h_l x^(l). [Figure: successive shifts of x on the example graph.]
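
A sketch of the local (space) implementation of such an FIR filter, run on the toy 4-node graph from before with S = A (the 6-node example graph of the slide is not reproduced here):

```python
# y = sum_l h_l S^l x via successive local shifts x^(l) = S x^(l-1).
def graph_filter(S, h, x):
    y = np.zeros_like(x)
    x_l = x.copy()                 # x^(0) = x
    for h_l in h:
        y += h_l * x_l             # accumulate h_l x^(l)
        x_l = S @ x_l              # one local exchange: x^(l+1) = S x^(l)
    return y

y = graph_filter(A, [1., 1., 0.5], np.array([1., 2., 0., 0.]))
```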

43 Frequency response of a graph filter. Def. 2 says H = V diag(h̃) V^{−1}. Since S = V Λ V^{−1}, Def. 1 H = Σ_{l=0}^L h_l S^l yields H = Σ_{l=0}^L h_l V Λ^l V^{−1} = V (Σ_{l=0}^L h_l Λ^l) V^{−1}. The application Hx of filter H to x can be split into three parts: V^{−1} takes the signal x to the graph frequency domain, x̃; H̃ := Σ_{l=0}^L h_l Λ^l modulates the frequency coefficients x̃ to obtain ỹ; V brings the signal ỹ back to the graph domain, y. Since H̃ is diagonal, define H̃ =: diag(h̃); h̃ is the frequency response of the filter H. The output at frequency k depends only on the input at frequency k: ỹ_k = h̃_k x̃_k.

45 Frequency response and filter coefficients. Relation between h and h̃ in a friendlier manner? Since h̃ = diag(Σ_{l=0}^L h_l Λ^l), we have h̃_k = Σ_{l=0}^L h_l λ_k^l. Define the N × (L+1) Vandermonde matrix Ψ with entries [Ψ]_{k,l} = λ_k^{l−1}, i.e., rows [1, λ_k, ..., λ_k^L]. Frequency response of a graph filter: if h are the coefficients of a graph filter, its frequency response is h̃ = Ψh. Given a desired h̃, we can find the coefficients as h = Ψ^{−1} h̃. Since Ψ is Vandermonde, it is invertible as long as λ_k ≠ λ_{k'} for k ≠ k'.
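
A sketch of this design step: pick a desired response h̃ (here an assumed exponential low-pass shape) and solve the Vandermonde system h̃ = Ψh on the toy graph, whose Laplacian eigenvalues are distinct:

```python
# Design FIR coefficients from a desired frequency response.
lam, V = np.linalg.eigh(L)
Lf = 3                                            # filter degree
Psi = np.vander(lam, Lf + 1, increasing=True)     # Psi[k, l] = lam_k^l
h_des = np.exp(-lam)                              # desired response (assumed)
h = np.linalg.solve(Psi, h_des)                   # h = Psi^{-1} h_des (here N = L+1)
H = V @ np.diag(Psi @ h) @ V.T                    # the resulting filter matrix
assert np.allclose(np.sort(np.linalg.eigvalsh(H)), np.sort(h_des))
```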

46 More on the frequency response. Since h = Ψ^{−1} h̃: if all {λ_k}_{k=1}^N are distinct, then any h̃ can be implemented with at most L + 1 = N coefficients. Since h̃ = Ψh: if λ_k = λ_{k'}, then the corresponding frequency responses coincide, h̃_k = h̃_{k'}. For the particular case S = A_dc, we have λ_k = e^{−j(2π/N)(k−1)}, so [Ψ]_{k,l} = e^{−j(2π/N)(k−1)(l−1)} and Ψ = F^H: the frequency response is the DFT of the impulse response, h̃ = F^H h.

47 Frequency response for graph signals and filters. Suppose that we have a signal x and filter coefficients h. For time signals, the output y satisfies ỹ = diag(F^H h) F^H x. For graph signals, the output y in the frequency domain is ỹ = diag(Ψh) V^{−1} x. The GFT for filters (Ψ) is different from the GFT for signals (V^{−1}). Symmetry is lost, but both depend on the spectrum of S. Some classical properties no longer hold for graphs, and several options exist to generalize operations.

48 Deltas and system identification. In time, given the canonical delta δ_1 = e_1 = [1, 0, ..., 0]^T: any other delta e_i is a shifted version of e_1; its Fourier transform is constant, ẽ_1 = 1 (all frequencies are excited); the output to e_1 characterizes the filter (in fact, the output to e_1 is h). In graph SP none of the above is true! Shifting is not translation: e_j is not a shifted version of e_i. The GFT ẽ_i = V^{−1} e_i reflects how strongly node i expresses each frequency; ẽ_i is not an eigenvector, ẽ_i ≠ 1, and entries of ẽ_i can be zero. System id? Let y_i be the filter output when x = e_i; then h = Ψ^{−1} diag^{−1}(V^{−1} e_i) V^{−1} y_i = Ψ^{−1} diag^{−1}(ẽ_i) V^{−1} y_i. More involved; works if all [ẽ_i]_k are non-zero (see [Segarra16] for blind identification).

50 Implementing graph filters: frequency or space. Frequency or space? y = V diag(h̃) V^{−1} x vs. y = Σ_{l=0}^L h_l S^l x. In space: leverage the fact that Sx can be computed locally. The signal x is percolated L times to find {x^(l)}_{l=0}^L, and every node finds its own y_i by computing Σ_{l=0}^L h_l [x^(l)]_i. The frequency implementation is useful for processing if, e.g., the filter is bandlimited and the eigenvectors are easy to find => low complexity [Anis16, Tremblay16]. The space definition is useful for modeling: diffusion, percolation, opinion formation, ... (more on this soon). More on filter design: Chebyshev polynomials [Shuman12]; ARMA [Isufi-Leus15]; node-variant [Segarra15]; time-variant [Isufi-Leus16]; median filters [Segarra16].

52 Linear network processes via graph filters. Consider linear dynamics of the form x_t − x_{t−1} = −αJ x_{t−1} => x_t = (I − αJ) x_{t−1}. If x is a network process, [x_t]_i depends only on [x_{t−1}]_j for j ∈ N(i) => [S]_ij = [J]_ij, so x_t = (I − αS) x_{t−1} => x_t = (I − αS)^t x_0. Hence x_t = H x_0, with H a polynomial of S => a linear graph filter. If the system has memory => the output is a weighted sum of previous exchanges (opinion dynamics) => still a polynomial of S: y = Σ_{t=0}^T β^t x_t => y = Σ_{t=0}^T (βI − βαS)^t x_0. Everything holds true if α_t or β_t are time varying.
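
A sketch of this equivalence on the toy graph, with S = A and an assumed step size α:

```python
# x_T = (I - alpha*S)^T x_0: the T-step dynamics is a degree-T polynomial of S.
alpha, T = 0.05, 5
x0 = np.random.randn(4)
x = x0.copy()
for _ in range(T):
    x = x - alpha * (A @ x)        # each step uses only neighbor values
H_T = np.linalg.matrix_power(np.eye(4) - alpha * A, T)
assert np.allclose(x, H_T @ x0)    # the dynamics is a graph filter
```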

54 Diffusion dynamics and AR (IIR) filters. Before: finite-time dynamics (FIR filters). Consider now the diffusion dynamics x_t = αS x_{t−1} + w => x_t = α^t S^t x_0 + Σ_{t'=0}^{t−1} α^{t'} S^{t'} w. When t → ∞: x = (I − αS)^{−1} w => an AR graph filter. Higher orders [Isufi-Leus16]: M successive diffusion dynamics => AR of order M, x = Π_{m=1}^M (I − α_m S)^{−1} w; the sum of M parallel diffusions => ARMA of order M, x = Σ_{m=1}^M (I − α_m S)^{−1} w.
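
The steady state can be checked by iterating the recursion; a sketch with α chosen so that the spectral radius of αS is below one, ensuring convergence:

```python
# Diffusion x_t = alpha*S x_{t-1} + w converges to x = (I - alpha*S)^{-1} w.
w = np.random.randn(4)
alpha = 0.9 / np.max(np.abs(np.linalg.eigvalsh(A)))   # spectral radius of alpha*A = 0.9
x = np.zeros(4)
for _ in range(200):
    x = alpha * (A @ x) + w
assert np.allclose(x, np.linalg.solve(np.eye(4) - alpha * A, w))
```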

55 General linear network processes. Combinations of all the previous are possible: x_t = H_t^a(S) x_{t−1} + H_t^b(S) w => x_t = H_t^A(S) x_0 + H_t^B(S) w, with y = x_t; sequential/parallel application; linear combinations. This expands the range of processes that can be modeled via GSP; the coefficients can change according to some control inputs. A number of linear processes can be modeled using graph filters, so theoretical GSP results can be applied to distributed networking: deconvolution, filtering, system id, ... Beyond linearity is possible too (more at the end of the talk). Links with control theory (of networks and complex systems): controllability, observability.

56 Graph filters: summary. Linear graph-signal operators that account for the structure of G: polynomials of S, orthogonal operators in the frequency domain. A few take-home messages: the output is a weighted sum of shifted inputs; shifting is not a translation, but a local diffusion => distributed implementation across L-hop neighborhoods; the GFT for filters is given by the Vandermonde matrix Ψ, different from V^{−1}; system identification is more involved. Graph filter design: designing h, designing h̃, and ... designing S? Useful to process signals, but also to model networked phenomena.

57 Application: explaining human learning rates. Why do some people learn faster than others? Can we answer this by looking at their brain activity? Brain activity during learning of a motor skill in 112 cortical regions: fMRI while learning a piano pattern, for 20 individuals. The pattern is repeated, reducing the time needed for execution; learning rate = rate of decrease in execution time. Define a functional brain graph based on correlated activity. fMRI outputs a series of graph signals x(t) ∈ R^112 describing brain states. Does brain state variability correlate with learning?

58 Measuring brain state variability. We propose three different measures capturing different time scales: changes at micro, meso, and macro scales. Micro: instantaneous changes higher than a threshold, m_1(x) = Σ_{t=1}^T 1{ ‖x(t) − x(t−1)‖_2 / ‖x(t)‖_2 > α }. Meso: cluster brain states and count the changes in clusters, m_2(x) = Σ_{t=1}^T 1{ c(t) ≠ c(t−1) }, where c(t) is the cluster to which x(t) belongs. Macro: sample entropy, a measure of the complexity of a time series, m_3(x) = log( Σ_t Σ_{s≠t} 1{‖x_3(t) − x_3(s)‖ > α} / Σ_t Σ_{s≠t} 1{‖x_2(t) − x_2(s)‖ > α} ), where x_r(t) = [x(t), x(t+1), ..., x(t+r−1)].

59 Diffusion as low-pass filtering. We diffuse each time signal x(t) across the brain graph: x_diff(t) = (I + βL)^{−1} x(t), with Laplacian L = V Λ V^{−1} and β the diffusion rate. Analyzing the diffusion in the frequency domain: x̃_diff(t) = (I + βΛ)^{−1} V^{−1} x(t) = diag(h̃) x̃(t). Frequency response: h̃_k = 1/(1 + βλ_k). Diffusion acts as low-pass filtering: high-frequency components are attenuated, and β controls the level of attenuation. [Figure: frequency response vs. frequency.]
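
A sketch of this low-pass diffusion on the toy Laplacian, verifying the frequency-domain view (β is an assumed diffusion rate):

```python
# x_diff = (I + beta*L)^{-1} x equals V diag(1/(1+beta*lam_k)) V^T x.
beta = 1.0
lam, V = np.linalg.eigh(L)
x = np.random.randn(4)
x_diff = np.linalg.solve(np.eye(4) + beta * L, x)
h_tilde = 1. / (1. + beta * lam)               # low-pass frequency response
assert np.allclose(x_diff, V @ (h_tilde * (V.T @ x)))
```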

60 Computing correlation for three signals. The variability measures consider the order of brain signal activity. As a control, we include in our analysis a null time series x_null, with x_null(t) = x_diff(π_t), where π_t is a random permutation of the time indices. Correlation between variability (m_1, m_2, and m_3) and learning? We consider three time series of brain activity: the original fMRI data x, the filtered data x_diff, and the null signal x_null.

61 Low-pass filtering reveals correlation. Correlation coefficients between learning rate and brain state variability for the original, filtered, and null signals (for m_1, m_2, m_3; values in the original slide). Correlation is clear when the signal is filtered; the result for the original signal is similar to the null signal. [Figure: scatter plots of learning rate vs. m_2 variability for the original, filtered, and null signals.]

62 Part II: Applications

63 Application domains. Design graph filters to approximate desired network operators. Sampling bandlimited graph signals. Blind graph filter identification: infer diffusion coefficients from the observed output. Statistical GSP: understanding random graph signals. Network topology inference: infer the shift from a collection of network-diffused signals. Many more (glad to discuss or redirect): filter banks; windowing, convolution, duality, ...; nonlinear GSP.

64 Sampling bandlimited graph signals

66 Motivation and preliminaries. Sampling and interpolation are cornerstone problems in classical SP. How to recover a signal using only a few observations? Need to limit the degrees of freedom: subspace, smoothness. What are reasonable observation models for graph signals? Subset selection model: static graph signal; have access to a subset of the nodes. Node aggregation model: incorporate local graph structure into the observation model; recover the signal from local observations at one node.

68 Sampling bandlimited graph signals: overview. How to find x ∈ R^N using P < N observations? Our focus is on bandlimited signals, but other models are possible: x̃ = V^{−1} x is sparse => x = Σ_{k∈K} x̃_k v_k, with |K| = K < N. S is involved in the generation of x; we are agnostic to the particular form of S. Two sampling schemes were introduced in the literature, corresponding to the previous two observation models: selection [Anis14, Chen15, Tsitsvero15, Puy15, Wang15] and aggregation [Segarra15], [Marques15]. A hybrid scheme combining both => space-shift sampling.

69 Revisiting sampling in time. There are two ways of interpreting the sampling of time signals: we can either freeze the signal and sample its values at different times, or fix a point (the present) and sample the evolution of the signal. Both strategies coincide for time signals but not for general graphs, giving rise to selection and aggregation sampling.

70 Selection sampling: definition. An intuitive generalization of time sampling to graph signals. C ∈ {0, 1}^{P×N} is a selection matrix (P rows of I_N); the sampled signal is x̄ = Cx. Goal: recover x based on x̄. Assume that the frequency support K is known (w.l.o.g. K = {k}_{k=1}^K). Since x̃_k = 0 for k > K, define x̃_K := [x̃_1, ..., x̃_K]^T = E_K^T x̃. Approach: use x̄ to find x̃_K, and then recover x as x = V(E_K x̃_K) = (V E_K) x̃_K = V_K x̃_K.

73 Selection sampling: recovery. Number of samples P ≥ K: x̄ = Cx = C V_K x̃_K, where C V_K is a P × K submatrix of V. Recovery of selection sampling: if rank(C V_K) ≥ K, then x can be recovered from the P values in x̄ as x = V_K x̃_K = V_K (C V_K)† x̄. With P = K, invertibility is hard to check by inspection; the columns of V_K (C V_K)^{−1} are the interpolators. In time (S = A_dc), if the samples in C are equally spaced, C V_K is Vandermonde (DFT) and the columns of V_K (C V_K)^{−1} are sincs.
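
A sketch of selection sampling with P = K on the toy graph (the sampled nodes are chosen arbitrarily; the recovery assumes C V_K is invertible, which holds generically):

```python
# Sample a K-bandlimited signal at K nodes; interpolate with V_K (C V_K)^{-1}.
N, K = 4, 2
lam, V = np.linalg.eigh(L)
V_K = V[:, :K]                       # known frequency support: first K atoms
x = V_K @ np.random.randn(K)         # a K-bandlimited graph signal
C = np.eye(N)[[0, 2]]                # selection matrix: P = K rows of I_N
x_bar = C @ x                        # observed samples
x_rec = V_K @ np.linalg.solve(C @ V_K, x_bar)
assert np.allclose(x, x_rec)
```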

74 Aggregation sampling: definition. Idea: incorporate S into the sampling procedure; reduces to classical sampling for time signals. Consider shifted (aggregated) signals y^(l) = S^l x; y^(l) = S y^(l−1) is found sequentially with only local exchanges. Form y_i = [y_i^(0), y_i^(1), ..., y_i^(N−1)]^T (obtained locally by node i). The sampled signal is ȳ_i = C y_i. Goal: recover x based on ȳ_i.

76 Aggregation sampling: recovery. Goal: recover x based on ȳ_i. Same approach as before: use ȳ_i to find x̃_K, and then recover x as x = V_K x̃_K. Define ū_i := V_K^T e_i and recall Ψ_kl = λ_k^{l−1}. Recovery of aggregation sampling: the signal x can be recovered from the first K samples in ȳ_i as x = V_K x̃_K = V_K diag^{−1}(ū_i) (C Ψ^T E_K)^{−1} ȳ_i, provided that [ū_i]_k ≠ 0 and all {λ_k}_{k=1}^K are distinct. If C = E_K^T, node i can recover x with info from K − 1 hops! Node i has to be able to capture the frequencies in K, and the frequencies have to be distinguishable. Bandlimited signals: signals that can be well estimated locally.
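
A sketch of aggregation sampling at a single node, continuing the setup of the previous snippet (it relies on the stated conditions: distinct eigenvalues among the first K and [ū_i]_k ≠ 0):

```python
# Node i aggregates y_i = [x_i, (Sx)_i, ...] and inverts the
# diagonal-times-Vandermonde observation model, with S = L.
i = 0
y_i = np.empty(K)
x_l = x.copy()
for l in range(K):
    y_i[l] = x_l[i]                  # local observation of (S^l x)_i
    x_l = L @ x_l                    # one more local exchange
u_i = V_K[i, :]                      # u_i = V_K^T e_i
M = np.vander(lam[:K], K, increasing=True).T * u_i   # M[l,k] = lam_k^l [u_i]_k
x_tilde_K = np.linalg.solve(M, y_i)
assert np.allclose(x, V_K @ x_tilde_K)
```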

77 Aggregation and selection sampling: example. In time (S = A_dc), selection and aggregation are equivalent. Differences for a more general graph? Erdős-Rényi graph with p = 0.2, S = A, K = 3, non-smooth signal. First 3 observations at node 4: y_4 = [0.55, 1.27, 2.94]^T, with [y_4]_1 = x_4 = 0.55 and [y_4]_2 = x_2 + x_3 + x_5 + x_6 + x_7 = 1.27. For this example, any node guarantees recovery, while selection sampling fails if, e.g., the nodes {1, 3, 4} are chosen.

78 Sampling: discussion and extensions. Discussion on aggregation sampling: the observation matrix is diagonal times Vandermonde; very appropriate in distributed scenarios; different nodes will lead to different performance (soon); it determines the types of signals that are actually bandlimited (role of S). Three extensions: sampling in the presence of noise; unknown frequency support; space-shift sampling (hybrid).

80 Presence of noise. Linear observation model with additive noise w_i of covariance R_w^(i). BLUE interpolation of x̃_K: x̃̂_K^(i) = [Ψ_i^H C^H (R_w^(i))^{−1} C Ψ_i]^{−1} Ψ_i^H C^H (R_w^(i))^{−1} z̄_i. If P = K, then x̂^(i) = V_K (C Ψ_i)^{−1} z̄_i. Error covariances (R_e^(i), R̃_e^(i)) in closed form. Noise covariances? Colored, different models: white noise in z̄_i, in x, or in x̃_K. Metric to optimize? trace(R_e^(i)), λ_max(R_e^(i)), log det(R̃_e^(i)), or [trace((R̃_e^(i))^{−1})]^{−1}. Select i and C to minimize the error; the choice depends on the metric and the noise [Marques16].

82 Unknown frequency support. Falls into the class of sparse reconstruction; what is the observation matrix? Selection: a submatrix of the unitary V_K. Aggregation: Vandermonde times diagonal; if [ū_i]_k ≠ 0 and λ_k ≠ λ_{k'}, it is full spark. Joint recovery and support identification (noiseless): x̃* := argmin_{x̃} ‖x̃‖_0 s.t. C y_i = C Ψ_i x̃. If full spark, P = 2K samples suffice. Different relaxations are possible. Conditioning will depend on Ψ_i (e.g., how different the {λ_k} are). Noisy case: the choice of sampling nodes is critical.

83 Recovery with unknown support: example. Erdős-Rényi graphs with p = 0.15, 0.20, 0.25; K = 3, non-smooth signals. Three different shifts: A, (I − A), and (1/2)A². [Figure: recovery results.]

84 Space-shift sampling. Space-shift sampling (hybrid): multiple nodes and multiple shifts. Selection: 4 nodes, 1 sample each. Space-shift: 2 nodes, 2 samples each. Aggregation: 1 node, 4 samples. Selection and aggregation sampling arise as particular cases. With Ū := [diag(ū_1), ..., diag(ū_N)]^T, the sampled signal is z̄ = C(I ⊗ (Ψ^T E_K)) Ū x̃_K + C w̄. As before, the BLUE and its error covariance are available in closed form. Optimizing the sample selection is more challenging; more structured schemes are easier, e.g., message passing: node i knowing y_i^(l) implies node i knows y_j^(l') for all j ∈ N_i and l' < l.

85 Sampling the US economy. 62 economic sectors in the USA + 2 artificial sectors. Graph: average inter-sector flows, S = A. Signal x: production in 2011. x is approximately bandlimited with K = 4.

86 Sampling the US economy: results. Setup 1: we add different types of noise; the error depends on the sampling node (better if more connected). Setup 2: we try different shift-space strategies. [Figure: reconstruction errors.]

87 More on sampling graph signals. Beyond bandlimitedness: smooth signals [Chen15]; parsimonious in a kernelized domain [Romero16]. Strategies to select the sampling nodes: random (sketching) [Varma15]; optimal reconstruction [Marques16, Chepuri-Leus16]; designed based on the posterior task [Gama16]. And more: low-complexity implementations [Tremblay16, Anis16]; local implementations [Wang14, Segarra15]; unknown spectral decomposition [Anis16].

88 Stationarity of random graph processes

89 Motivation and context. We frequently encounter stochastic processes => statistical SP tools for their understanding. Stationarity facilitates the analysis of random signals in time: statistical properties are time-invariant. We seek to extend the concept of stationarity to graph processes, motivated by network data and irregular domains. The lack of regularity leads to multiple definitions. Classical SSP can be generalized: spectral estimation, periodograms, ... Better understanding and estimation of graph processes. Related works: [Girault 15], [Perraudin 16]. Segarra, Marques, Leus, Ribeiro, "Stationary Graph Processes: Nonparametric Spectral Estimation," SAM 2016. Marques, Segarra, Leus, Ribeiro, "Stationary Graph Processes and Spectral Estimation," IEEE TSP (submitted).

90 Preliminaries: weak stationarity in time. (1) The correlation of a stationary discrete-time signal is invariant to shifts: C_x := E[xx^H] = E[S^l x (S^l x)^H], where S^l x collects the values x((n − l) mod N). (2) The signal is the output of an LTI filter H excited with white noise w: x = Hw, with E[ww^H] = I. (3) The covariance matrix C_x is diagonalized by the Fourier matrix: C_x = F diag(p) F^H. The process has a power spectral density p := diag(F^H C_x F). Each of these definitions can be generalized to graph signals.

91 Stationary graph processes: shifts. Definition (shift invariance): the process x is weakly stationary with respect to S if and only if, for all a, b, and c (with b ≥ c), E[ (S^a x)((S^H)^b x)^H ] = E[ (S^{a+c} x)((S^H)^{b−c} x)^H ]. Use a and b shifts as reference; shift by c forward and backward. The signal is stationary if these shifts do not alter its covariance. It reduces to E[xx^H] = E[S^l x (S^l x)^H] when S is a directed cycle: the time shift is orthogonal, S^H = S^{−1} (a = 0, b = N and c = l). Reference shifts are needed because S can change the energy of the signal.

92 Stationary graph processes: filtering. Definition (filtering of white noise): the process x is weakly stationary with respect to S if it can be written as the output of a linear graph filter H with a white input w: x = Hw, with E[ww^H] = I. The filter H is linear shift-invariant if H(Sx) = S(Hx); equivalently, H is a polynomial in the shift operator, H = Σ_{l=0}^L h_l S^l. The filter H determines the color: C_x = E[(Hw)(Hw)^H] = HH^H.

94 Stationary graph processes: diagonalization. Definition (simultaneous diagonalization): the process x is weakly stationary with respect to S if the covariance C_x and the shift S are simultaneously diagonalizable: S = V Λ V^H => C_x = V diag(p) V^H. Equivalent to the time definition because F diagonalizes the cycle graph. The process has a power spectral density p := diag(V^H C_x V).
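
A sketch checking this definition numerically: generate a process by filtering white noise through a polynomial of L and verify that V^H C_x V is diagonal (the filter taps are made up):

```python
# x = H w with H a polynomial of S = L  =>  C_x = H H^T is diagonalized by V.
lam, V = np.linalg.eigh(L)
h = np.array([1.0, -0.3, 0.05])                  # assumed filter taps
H = sum(c * np.linalg.matrix_power(L, l) for l, c in enumerate(h))
C_x = H @ H.T                                    # covariance of x = H w
C_tilde = V.T @ C_x @ V
p = np.diag(C_tilde)                             # the PSD
assert np.allclose(C_tilde, np.diag(p), atol=1e-9)
```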

95 Equivalence of definitions and PSD. We have introduced three equally valid definitions of weak stationarity; they are different but, under mild conditions, equivalent. Proposition: the process x has a shift-invariant correlation matrix <=> it is the output of a linear shift-invariant filter <=> its covariance is jointly diagonalizable with the shift. Shift and filtering definitions => how stationary signals look (local invariance). Simultaneous diagonalization => a PSD exists, p := diag(V^H C_x V); the PSD collects the eigenvalues of C_x and is nonnegative. Proposition: let x be stationary in S and define the process x̃ := V^H x. Then x̃ is uncorrelated, with covariance matrix C_x̃ = E[x̃ x̃^H] = diag(p).

96 Weakly stationary graph processes: examples. Example (white noise): white noise w is stationary with respect to any graph shift S = VΛV^H; the covariance C_w = σ²I is simultaneously diagonalizable with every S. Example (covariance matrix graphs and precision matrices): every process is stationary in the graph defined by its covariance matrix; if S = C_x, the shift S and the covariance C_x are diagonalized by the same basis. The process is also stationary on the precision matrix S = C_x^{−1}. Example (heat diffusion and ARMA processes): the heat diffusion process on a graph, x = α_0 (I − αL)^{−1} w, is stationary in L since α_0 (I − αL)^{−1} is a polynomial in L; so is any autoregressive moving average (ARMA) process on a graph.

97 Power spectral density: examples. Example (white noise): p = diag(V^H (σ²I) V) = σ²1. Example (covariance matrix graphs and precision matrices): p = diag(V^H (VΛV^H) V) = diag(Λ). Example (heat diffusion processes and ARMA processes): p = diag(α_0² (I − αΛ)^{−2}).

98 PSD estimation with correlogram and periodogram. Given a process x, the covariance of x̃ = V^H x is C_x̃ := E[x̃ x̃^H] = E[(V^H x)(V^H x)^H] = diag(p). Periodogram: given samples {x_r}_{r=1}^R, average the GFTs of the samples, p̂_pg := (1/R) Σ_{r=1}^R |x̃_r|² = (1/R) Σ_{r=1}^R |V^H x_r|². Correlogram: replace C_x in the PSD definition by the sample covariance, p̂_cg := diag(V^H Ĉ_x V) = diag(V^H [(1/R) Σ_{r=1}^R x_r x_r^H] V). Periodogram and correlogram lead to identical estimates: p̂_pg = p̂_cg.
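
A sketch of the periodogram for the process defined in the previous snippet, averaging R realizations (Gaussian input assumed):

```python
# Periodogram p_hat = (1/R) sum_r |V^H x_r|^2, approaching p = diag(V^H C_x V).
rng = np.random.default_rng(0)
R = 5000
W = rng.standard_normal((4, R))                  # white inputs w_r
X = H @ W                                        # realizations x_r = H w_r
p_hat = np.mean((V.T @ X)**2, axis=1)            # periodogram (= correlogram)
print(np.round(p_hat, 2), np.round(p, 2))        # close for large R
```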

99 Accuracy of the periodogram estimator. Theorem: if the process x is Gaussian, the periodogram estimates have bias b_pg := E[p̂_pg] − p = 0 and variance Σ_pg := E[(p̂_pg − p)(p̂_pg − p)^H] = (2/R) diag²(p). The periodogram is unbiased, but the variance is not too good: quadratic in p, the same as for time processes. Alternative nonparametric methods reduce the variance: averaged windowed periodograms, filter banks. The bias-variance tradeoff is characterized in [Marques16, Segarra16].

101 Parametric PSD estimation. Until now, non-parametric estimators; what about parametric estimation? View x as the output of a graph filter with P ≪ N parameters. Key: how to use x to estimate ĥ.

102 Parametric PSD estimation: finding h. How to use a realization x to estimate ĥ? The input w is white and x is stationary => rely on correlations. The problem can be formulated in the vertex or in the frequency domain. Vertex domain: ĥ = argmin_h d(C(h), Ĉ(x)), with C(h) = H(h, S) H(h, S)^H and Ĉ(x) = xx^H. Frequency domain: ĥ = argmin_h d(p(h), p̂(x)), with p(h) = diag(Ψ(hh^H)Ψ^H) and p̂(x) = diag(V^H (xx^H) V). C(·) and p(·) are quadratic in h => general estimation is nonconvex; if d(·,·) is quadratic, this is a phase-retrieval problem. Particular cases (S ⪰ 0 and h ≥ 0) are tractable; not so in time.

105 Parametric PSD estimation: an example. Intuition on parametric estimation: better performance because fewer parameters must be estimated, giving rise to smoother PSD estimates. Example: Laplacian diffusion with positive coefficients; diffusion will favor an exponential spectral decay. [Figure: estimated vs. true PSD.]

106 Alternative parametric formulations. Choose the vertex or the frequency formulation based on the type of filter; MA, AR, and ARMA models can all be generalized. If the number of parameters (filter degree) is not known, use regularizers so that h is sparse. Alternative parametric models: x is a sum of a few frequency basis vectors => x̃ = V^H x is sparse; x is the diffusion of a sparse initial state => w is sparse.

107 Average periodogram. MSE of the periodogram as a function of the number of observations R. Baseline: ER random graph (N = 100 and p = 0.05) with S = A; observe filtered white Gaussian noise and estimate the PSD. [Figure: normalized MSE vs. R for the baseline ER, smaller ER, larger PSD, and small-world settings.] The normalized MSE evolves as 2/R, as expected, and is invariant to size, topology, and PSD. The same behavior is observed for non-Gaussian processes (where the theory does not apply).

108 Windowed average periodogram. Performance of local windows and random windows. Stochastic block model graph (N = 100, 10 communities) and small world; the process filters white noise with different numbers of taps. [Figure: normalized MSE (bias squared, variance, theoretical and empirical error) vs. number of windows, for a single window, ten local windows, and ten random windows, with L = 2 and L = 10.] The use of windows introduces bias but reduces the total error (MSE). Local windows work better than random windows, and their advantage is larger for local processes.

109 Opinion source identification. Opinion diffusion in Zachary's karate club network (N = 34). The observed opinion x is obtained by diffusing a sparse white rumor w. Given {x_r}_{r=1}^R generated from unknown {w_r}_{r=1}^R, diffused through a filter with unknown nonnegative coefficients β. Goal: identify the support of each rumor w_r. First, estimate β via moving-average PSD estimation; second, solve R sparse linear regressions to recover supp(w_r). [Figure: source identification error vs. number of observations for several noise levels.]

110 PSD of face images. PSD estimation for spectral signatures of the faces of different people. 100 grayscale face images {x_i}_{i=1}^100 ⊂ R^10304 (10 images x 10 people). Consider x_i as the realization of a graph process that is stationary on Ĉ_x. Construct Ĉ_x^(j) = V^(j) Λ_c^(j) V^(j)H based on the images of person j. [Figure: the process of person j is approximately stationary in Ĉ_x.] Use the windowed average periodogram to estimate the PSD of a new face.

111 Stationarity: takeaways. Extended the notion of weak stationarity to graph processes: three definitions inspired by stationary time processes, shown to be equivalent. Defined the power spectral density and studied its estimation. Generalized classical non-parametric estimation methods: periodogram and correlogram (shown to be equivalent), windowed average periodogram, filter banks. Generalized classical ARMA parametric estimation methods; particular cases are tractable. Extensions: other parametric schemes; space-time variation.

112 Network topology inference

114 Motivation and context. Network topology inference from nodal observations [Kolaczyk 09]. Approaches use Pearson correlations to construct graphs [Brovelli04], or partial correlations and conditional dependence [Friedman08, Karanikolas16]. Key in neuroscience [Sporns 10]: functional nets inferred from activity. Most GSP works study how a known graph S affects signals and filters. Here, the reverse path: how to use GSP to infer the graph topology? Gaussian graphical models [Egilmez16]; smooth signals [Dong15], [Kalofolias16]; stationary signals [Segarra16], [Pasdeloup16]; directed graphs [Mei-Moura15], [Shen16]. Segarra, Marques, Mateos, Ribeiro, "Network Topology Identification from Spectral Templates," IEEE SSP 2016. Segarra, Marques, Mateos, Ribeiro, "Network Topology Inference from Spectral Templates," JSTSP (submitted).

115 Our approach for topology inference. We propose a two-step approach for graph topology identification. Step 1: identify the eigenvectors of the shift (from the observations and a priori info). Step 2: identify the eigenvalues to obtain a suitable shift, given the desired topological features => the inferred network. Beyond diffusion, there are alternative sources for the spectral templates V: graph sparsification, network deconvolution, ...

116 Step 1: eigenvectors from graph stationarity. Given a set of signals {x_r}_{r=1}^R, find S. We view the signals as samples of a random graph process x. Assumption: x is stationary in S. Equivalent to: x is the linear diffusion of a white input, x = α_0 Π_{l=1}^∞ (I − α_l S) w = Σ_{l=0}^∞ β_l S^l w, i.e., x = Hw. Examples: heat diffusion, x = (I − αL)^{−1} w; structural equation models, x = Ax + w. We say the graph shift S explains the structure of the signal x. Key point after assuming stationarity: the eigenvectors of the covariance.

117 Step 1: Eigenvectors from covariance The covariance matrix of the stationary signal $x = Hw$ is

$C_x = \mathbb{E}[xx^T] = H\,\mathbb{E}[ww^T]\,H^T = HH^T$

Since H is diagonalized by V, so is the covariance $C_x$:

$C_x = V \big| \textstyle\sum_{l=0}^{L-1} h_l \Lambda^l \big|^2 V^T = V\,\mathrm{diag}(p)\,V^T$

Any shift with eigenvectors V can explain x; G and its specific eigenvalues have been obscured by diffusion. Observations: (a) identifying S ⇔ identifying the eigenvalues; (b) correlation methods ⇒ the eigenvalues are kept unchanged; (c) precision methods ⇒ the eigenvalues are inverted. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 97 / 120
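
In practice the spectral templates come from data: eigendecompose a sample covariance of the diffused signals. A sketch under the stationarity assumption (filter choice and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N, R = 20, 5000
A = rng.random((N, N)); A = (A + A.T) / 2; np.fill_diagonal(A, 0)
lam, V = np.linalg.eigh(A)                # ground-truth eigenvectors
H = V @ np.diag(np.exp(-lam)) @ V.T       # some diffusion filter on S = A
X = H @ rng.standard_normal((N, R))       # R stationary realizations
C_hat = X @ X.T / R                       # sample covariance
_, V_hat = np.linalg.eigh(C_hat)          # spectral templates
# V_hat matches V up to sign and ordering; the eigenvalues of C_hat are
# |h(lam)|^2, so the eigenvalues of S itself remain unknown (step 2).
```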

118 Step 1: Other sources of spectral templates 1) Graph sparsification. Goal: given $S_f$, find a sparser S with the same eigenvectors. Decompose $S_f = V_f \Lambda_f V_f^T$ and set $V = V_f$. Oftentimes referred to as the network deconvolution problem. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 98 / 120

119 Step 1: Other sources of spectral templates 1) Graph sparsification. Goal: given $S_f$, find a sparser S with the same eigenvectors. Decompose $S_f = V_f \Lambda_f V_f^T$ and set $V = V_f$. Oftentimes referred to as the network deconvolution problem. 2) Nodal relation assumed by a given transform. GSP: decompose $S = V \Lambda V^T$ and set $V^T$ as the GFT. SP: some transforms T are known to work well on specific data. Goal: given T, set $V^T = T$ and identify S ⇒ intuition on the data relation (Ex: DCTs i–iii). Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 98 / 120

120 Step 1: Other sources of spectral templates 1) Graph sparsification. Goal: given $S_f$, find a sparser S with the same eigenvectors. Decompose $S_f = V_f \Lambda_f V_f^T$ and set $V = V_f$. Oftentimes referred to as the network deconvolution problem. 2) Nodal relation assumed by a given transform. GSP: decompose $S = V \Lambda V^T$ and set $V^T$ as the GFT. SP: some transforms T are known to work well on specific data. Goal: given T, set $V^T = T$ and identify S ⇒ intuition on the data relation (Ex: DCTs i–iii). 3) Implementation of linear network operators. Goal: distributed implementation of a linear operator B via graph filters. Feasible if B and S share eigenvectors ⇒ like 1) with $S_f = B$. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 98 / 120

121 Step 2: Obtaining the eigenvalues Given V, there are many possible $S = V\,\mathrm{diag}(\lambda)\,V^T$. We can use extra knowledge/assumptions to choose one graph. Of all graphs, select one that is optimal in some sense:

$S^* := \arg\min_{S,\lambda} f(S, \lambda) \;\;\text{s. to}\;\; S = \sum_{k=1}^{N} \lambda_k v_k v_k^T, \; S \in \mathcal{S} \quad (1)$

The set $\mathcal{S}$ contains all admissible scaled adjacency matrices, $\mathcal{S} := \{S \mid S_{ij} \geq 0, \; S \in \mathcal{M}^N, \; S_{ii} = 0, \; \sum_j S_{1j} = 1\}$; Laplacian matrices can be accommodated as well. The problem is convex if we select a convex objective $f(S, \lambda)$: minimum energy ($f(S) = \|S\|_F$), fast mixing ($f(\lambda) = -\lambda_2$). Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 99 / 120
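
Problem (1) maps directly onto a convex modeling language. A sketch using CVXPY (an assumed dependency; the helper name and scale normalization follow the set $\mathcal{S}$ above):

```python
import cvxpy as cp

def identify_shift(V, objective="fro"):
    """Step-2 sketch: find S = V diag(lam) V^T within the set of
    scaled adjacency matrices, minimizing a convex objective."""
    N = V.shape[0]
    lam = cp.Variable(N)
    S = V @ cp.diag(lam) @ V.T                # affine in lam
    constraints = [S >= 0,                    # nonnegative weights
                   cp.diag(S) == 0,           # no self-loops
                   cp.sum(S[0, :]) == 1]      # fix the scale
    f = cp.norm(S, "fro") if objective == "fro" else cp.norm(cp.vec(S), 1)
    cp.Problem(cp.Minimize(f), constraints).solve()
    return S.value, lam.value
```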

122 Size of the feasibility set A small feasible set in (1) helps the optimization: the search is over $\lambda \in \mathbb{R}^N$ with N linear constraints $S_{ii} = 0$. To be rigorous, define $W := V \odot V$ (Khatri-Rao) and let D be the index set of the diagonal elements, so that $W_D$ collects the corresponding rows. Assume that (1) is feasible; then it holds that $\mathrm{rank}(W_D) \leq N - 1$. If $\mathrm{rank}(W_D) = N - 1$, then the feasible set of (1) is a singleton. Take-aways: a convex and small feasibility set ⇒ exhaustive search is affordable; $\mathrm{rank}(W_D) = N - 1$ arises in practice. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 100 / 120
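
The rank condition is cheap to test numerically. A sketch assuming SciPy's column-wise Khatri-Rao product (graph parameters are illustrative):

```python
import numpy as np
from scipy.linalg import khatri_rao

def rank_WD(V):
    """Rank of W_D: the rows of W = V (Khatri-Rao) V that correspond
    to the diagonal entries of S under vectorization."""
    N = V.shape[0]
    W = khatri_rao(V, V)                  # column k equals v_k kron v_k
    D = [i * (N + 1) for i in range(N)]   # vec-indices of the S_ii entries
    return np.linalg.matrix_rank(W[D, :])

rng = np.random.default_rng(5)
N = 10
A = np.triu((rng.random((N, N)) < 0.2).astype(float), 1)
A = A + A.T                               # ER-style adjacency
_, V = np.linalg.eigh(A)
print(rank_WD(V))                         # frequently N - 1 in practice
```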

123 Non-convex objective: Sparse recovery Whenever the feasibility set of (1) is non-trivial, $f(S, \lambda)$ determines the features of the recovered graph. Ex: identify the sparsest shift $S_0^*$ that explains the observed signal structure ⇒ set the cost $f(S, \lambda) = \|S\|_0$:

$S_0^* = \arg\min_{S,\lambda} \|S\|_0 \;\;\text{s. to}\;\; S = \sum_{k=1}^{N} \lambda_k v_k v_k^T, \; S \in \mathcal{S}$

Non-convex problem; relax to $\ell_1$-norm minimization (e.g. [Tropp06]):

$S_1^* := \arg\min_{S,\lambda} \|S\|_1 \;\;\text{s. to}\;\; S = \sum_{k=1}^{N} \lambda_k v_k v_k^T, \; S \in \mathcal{S}$

Does the solution $S_1^*$ coincide with the $\ell_0$ solution $S_0^*$? Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 101 / 120
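
The $\ell_1$ relaxation reuses the same constraint set with a different objective; with the identify_shift sketch above it is a one-liner, or standalone as follows (CVXPY assumed):

```python
import cvxpy as cp

def sparsest_shift(V):
    """l1 surrogate for the l0 problem: sparsest S diagonalized by V."""
    N = V.shape[0]
    lam = cp.Variable(N)
    S = V @ cp.diag(lam) @ V.T
    cp.Problem(cp.Minimize(cp.norm(cp.vec(S), 1)),
               [S >= 0, cp.diag(S) == 0, cp.sum(S[0, :]) == 1]).solve()
    return S.value
```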

124 Recovery guarantee for $\ell_1$ relaxation Recall that $W := V \odot V$. Build $M := (I - WW^{\dagger})_{D^c}$, where $WW^{\dagger}$ is the orthogonal projector onto $\mathrm{range}(W)$. Construct $R := [M, \; e_1 \otimes 1_{N-1}]$. Denote by K the indices of the support of $s_0^* = \mathrm{vec}(S_0^*)$. $S_1^*$ and $S_0^*$ coincide if the two following conditions are satisfied: 1) $\mathrm{rank}(R_K) = |K|$; and 2) there exists a constant $\delta > 0$ such that $\psi_R := \| I_{K^c} (\delta^{-2} RR^T + I_{K^c}^T I_{K^c})^{-1} I_K^T \| < 1$. Cond. 1) ensures uniqueness of the solution $S_1^*$; Cond. 2) guarantees the existence of a dual certificate for $\ell_0$ optimality. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 102 / 120

125 Robust shift identification So far, a two-step algorithm based on perfect spectral templates. However, perfect knowledge of V may not be available ⇒ robust designs? Q1: How to modify the optimization in step 2? A distance term for noisy templates, an orthogonal subspace for incomplete ones. Q2: Recovery guarantees? Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 103 / 120

126 Noisy spectral templates We might have access to $\hat{V}$, a noisy version of the spectral templates. With $d(\cdot,\cdot)$ denoting a (convex) distance between matrices:

$\min_{\{S, \lambda, \hat{S}\}} \|S\|_1 \;\;\text{s. to}\;\; \hat{S} = \sum_{k=1}^{N} \lambda_k \hat{v}_k \hat{v}_k^T, \; S \in \mathcal{S}, \; d(S, \hat{S}) \leq \epsilon$

How does the noise in $\hat{V}$ affect the recovery? Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 104 / 120
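
A sketch of the robust formulation with the Frobenius norm as one convex choice for $d(\cdot,\cdot)$ (CVXPY assumed; names illustrative):

```python
import cvxpy as cp

def shift_from_noisy_templates(V_hat, eps):
    """S must lie in the admissible set and be eps-close to a shift
    that is exactly diagonalized by the noisy templates V_hat."""
    N = V_hat.shape[0]
    lam = cp.Variable(N)
    S = cp.Variable((N, N), symmetric=True)
    S_hat = V_hat @ cp.diag(lam) @ V_hat.T
    cp.Problem(cp.Minimize(cp.norm(cp.vec(S), 1)),
               [S >= 0, cp.diag(S) == 0, cp.sum(S[0, :]) == 1,
                cp.norm(S - S_hat, "fro") <= eps]).solve()
    return S.value
```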

127 Noisy spectral templates We might have access to $\hat{V}$, a noisy version of the spectral templates. With $d(\cdot,\cdot)$ denoting a (convex) distance between matrices:

$\min_{\{S, \lambda, \hat{S}\}} \|S\|_1 \;\;\text{s. to}\;\; \hat{S} = \sum_{k=1}^{N} \lambda_k \hat{v}_k \hat{v}_k^T, \; S \in \mathcal{S}, \; d(S, \hat{S}) \leq \epsilon$

How does the noise in $\hat{V}$ affect the recovery? Stability (robustness) can be established: under conditions 1) and 2), now based on $\hat{R}$, it is guaranteed that $d(S^*, S_0^*) \leq C\epsilon$, with $\epsilon$ large enough to guarantee feasibility of $S_0^*$; the constant C depends on $\hat{V}$ and the support K. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 104 / 120

128 Incomplete spectral templates Partial access to V: only K known eigenvectors $V_K = [v_1, \ldots, v_K]$, e.g., if the (sample) covariance is rank-deficient.

$\min_{\{S, S_K, \lambda\}} \|S\|_1 \;\;\text{s. to}\;\; S = S_K + \sum_{k=1}^{K} \lambda_k v_k v_k^T, \; S \in \mathcal{S}, \; S_K v_k = 0$

Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 105 / 120
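
The incomplete-templates variant in the same style (CVXPY assumed; S_K is the free component constrained to be orthogonal to the known eigenvectors):

```python
import cvxpy as cp

def shift_from_partial_templates(V_K):
    """Only K eigenvectors are known; S_K absorbs the unknown part of
    the shift and must annihilate the known templates."""
    N, K = V_K.shape
    lam = cp.Variable(K)
    S_K = cp.Variable((N, N), symmetric=True)
    S = S_K + V_K @ cp.diag(lam) @ V_K.T
    cp.Problem(cp.Minimize(cp.norm(cp.vec(S), 1)),
               [S >= 0, cp.diag(S) == 0, cp.sum(S[0, :]) == 1,
                S_K @ V_K == 0]).solve()
    return S.value
```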

129 Incomplete spectral templates Partial access to V: only K known eigenvectors $V_K = [v_1, \ldots, v_K]$, e.g., if the (sample) covariance is rank-deficient.

$\min_{\{S, S_K, \lambda\}} \|S\|_1 \;\;\text{s. to}\;\; S = S_K + \sum_{k=1}^{K} \lambda_k v_k v_k^T, \; S \in \mathcal{S}, \; S_K v_k = 0$

How does the (partial) knowledge of V affect the recovery? A bit more involved: conditions 1) and 2) now depend on $V_K$. For K = N, the guarantees boil down to the noiseless case. Incomplete and noisy scenarios can be combined. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 105 / 120

130 Topology inference in random graphs Erdős-Rényi graphs of varying size $N \in \{10, 20, \ldots, 50\}$ and edge probabilities $p \in \{0.1, 0.2, \ldots, 0.9\}$. [Figure: recovery rates over (N, p) for the adjacency (left) and the normalized Laplacian (mid); histogram of rank(W_D) over all realizations, successful recoveries, and unique solutions (right)] Recovery is easier for intermediate values of p. The rate of recovery is related to rank(W_D) (histogram for N = 10, p = 0.2): when the rank is N − 1, recovery is guaranteed; as the rank decreases, there is a detrimental effect on recovery. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 106 / 120

131 Sparse recovery guarantee Generate 1000 ER random graphs (N = 20, p = 0.1) such that the feasible set is not a singleton and Cond. 1) in the sparse recovery theorem is satisfied. Noiseless case: $\ell_1$-norm minimization guarantees recovery as long as $\psi_R < 1$. [Figure: histogram of $\psi_R$ for recovery successes vs. failures] The condition is sufficient but not necessary; it is the tightest possible bound on this matrix norm. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 107 / 120

132 Inferring brain graphs from noisy templates Identification of structural brain graphs (N = 66). Test recovery for noisy spectral templates $\hat{V}$, obtained from sample covariances of diffused signals. [Figure: recovery error vs. number of observations for patients 1–3] Recovery error decreases with an increasing number of observed signals: a more reliable estimate of the covariance ⇒ a less noisy $\hat{V}$. The brain of patient 1 is consistently the hardest to identify. Robustness for identification in noisy scenarios: traditional methods like graphical lasso fail to recover S. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 108 / 120

133 Inferring social graphs from incomplete templates Identification of multiple social networks (N = 32) defined on the same node set of students from Ljubljana. Test recovery for incomplete spectral templates $\hat{V} = [\hat{v}_1, \ldots, \hat{v}_K]$, obtained from a low-pass diffusion process. Repeated eigenvalues in $C_x$ introduce a rotation ambiguity in V. [Figure: recovery error vs. number of spectral templates for networks 1–4] Recovery error decreases with an increasing number of spectral templates; the performance improvement is sharp. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 109 / 120

134 Performance comparisons Comparison with graphical lasso and sparse correlation methods, evaluated on 100 realizations of ER graphs with N = 20. [Figure: F-measure vs. number of observations for our method, correlation, and graphical lasso, under filters $H_1$ and $H_2$] Graphical lasso implicitly assumes the filter $H_1 = (\rho I + S)^{-1/2}$. For this filter the spectral templates work, but not as well as the (MLE-based) graphical lasso. For general diffusion filters $H_2$, the spectral templates still work fine. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 110 / 120
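
Why graphical lasso matches the filter $H_1$: for $x = H_1 w$ with $H_1 = (\rho I + S)^{-1/2}$, the precision matrix is exactly $\rho I + S$, i.e., S plus a diagonal shift. A quick NumPy check of this identity (the graph and $\rho$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 10
S = rng.random((N, N)); S = (S + S.T) / 2; np.fill_diagonal(S, 0)
rho = 1 + np.abs(np.linalg.eigvalsh(S)).max()   # keep rho*I + S > 0
lam, V = np.linalg.eigh(S)
H1 = V @ np.diag((rho + lam) ** -0.5) @ V.T
C_x = H1 @ H1.T                                 # equals (rho*I + S)^{-1}
print(np.allclose(np.linalg.inv(C_x), rho * np.eye(N) + S))   # True
```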

135 Inferring direct relations Our method can be used to sparsify a given network: keep direct and important edges or relations, discard indirect relations that can be explained by direct ones. Use the eigenvectors $\hat{V}$ of the given network as noisy templates. Infer contacts between amino-acid residues in BPT1 BOVIN, using the mutual information of amino-acid covariation as input. [Figure: ground truth, mutual information, network deconvolution, our approach] Network deconvolution assumes a specific filter model [Feizi13]; we achieve better performance by being agnostic to this. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 111 / 120

136 Topology ID: Takeaways Network topology inference is a cornerstone problem in Network Science. Most GSP works analyze how S affects signals and filters; here, the reverse path: how to use GSP to infer the graph topology? Our GSP approach to network topology inference is a two-step approach: i) obtain V; ii) estimate S given V. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 112 / 120

137 Topology ID: Takeaways Network topology inference is a cornerstone problem in Network Science. Most GSP works analyze how S affects signals and filters; here, the reverse path: how to use GSP to infer the graph topology? Our GSP approach to network topology inference is a two-step approach: i) obtain V; ii) estimate S given V. How to obtain the spectral templates V: based on the covariance of diffused signals; other sources too (network operators, data transforms). Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 112 / 120

138 Topology ID: Takeaways Network topology inference is a cornerstone problem in Network Science. Most GSP works analyze how S affects signals and filters; here, the reverse path: how to use GSP to infer the graph topology? Our GSP approach to network topology inference is a two-step approach: i) obtain V; ii) estimate S given V. How to obtain the spectral templates V: based on the covariance of diffused signals; other sources too (network operators, data transforms). Infer S via convex optimization: the objective promotes desirable properties, the constraints encode a priori info and structure. Formulations for perfect and imperfect templates. Sparse recovery results for both the adjacency and the Laplacian. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 112 / 120

139 Wrapping up Motivation and preliminaries Part I: Fundamentals Graphs 101 Graph signals and the shift operator Graph Fourier Transform (GFT) Graph filters and network processes Part II: Applications Sampling graph signals Stationarity of graph processes Network topology inference Concluding remarks Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 113 / 120

140 Concluding remarks Network science and big data pose new challenges; GSP can contribute to solving some of those challenges, being well suited for network (diffusion) processes. Central elements in GSP: the graph-shift operator and the Fourier transform. Graph filters operate on graph signals: polynomials of the shift operator that can be implemented locally. Network diffusion/percolation processes via graph filters: successive/parallel combinations of local linear dynamics, possibly with time-varying diffusion coefficients; accurate models for certain setups. GSP yields insights on how those processes behave. Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 114 / 120
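
The local implementability claimed above follows because each multiplication by S only exchanges values between neighbors; a minimal sketch (the filter coefficients are illustrative):

```python
import numpy as np

def graph_filter_local(S, h, x):
    """Apply y = sum_l h[l] * S^l x via repeated local exchanges; after
    l rounds each node has aggregated information from its l-hop
    neighborhood only."""
    y, z = np.zeros_like(x), x.copy()
    for hl in h:
        y += hl * z
        z = S @ z              # one round of communication with neighbors
    return y

rng = np.random.default_rng(4)
N = 15
S = rng.random((N, N)); S = (S + S.T) / 2; np.fill_diagonal(S, 0)
y = graph_filter_local(S, h=[1.0, -0.5, 0.25], x=rng.standard_normal(N))
```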

141 Concluding remarks GSP results can be applied to solve practical problems: sampling and interpolation (network control), input and system ID (rumor ID), shift design (network topology ID). Marques, Ribeiro, Segarra Graph SP: Fundamentals and Applications 115 / 120
