Graph Frequency Analysis of Brain Signals

Weiyu Huang, Leah Goldsberry, Nicholas F. Wymbs, Scott T. Grafton, Danielle S. Bassett and Alejandro Ribeiro

arXiv: v1 [q-bio.NC] 2 Nov 2015

Abstract: This paper presents methods to analyze functional brain networks and signals from graph spectral perspectives. The notion of frequency and filters traditionally defined for signals supported on regular domains such as discrete time and image grids has recently been generalized to irregular graph domains, and defines brain graph frequencies associated with different levels of spatial smoothness across the brain regions. Brain network frequency also enables the decomposition of brain signals into pieces corresponding to smooth or vibrant variations. We relate graph frequency to principal component analysis when the networks of interest denote functional connectivity. The methods are utilized to analyze brain networks and signals as subjects master a simple motor skill. We observe that brain signals corresponding to different graph frequencies exhibit different levels of adaptability throughout learning. Further, we notice a strong association between the graph spectral properties of brain networks and the level of exposure to the tasks performed, and identify the most important frequency signatures at different levels of task familiarity.

Index Terms: Functional brain network, network theory, graph signal processing, fMRI, motor learning, filtering

I. INTRODUCTION

We consider functional brain signals that describe activities at brain regions and the networks depicting functional connectivities between regions, and demonstrate how information can be gleaned by analyzing them using tools from graph signal processing. Most often, brain activities [1], [2] and functional brain networks [3], [4] are analyzed separately as different entities. However, they are highly related since activities can be regarded as signals supported on the underlying networks. Brain networks, irrespective of whether their connectivity is functional or structural, tend to be stable for a window of time, entailing associations between brain regions during the captured time of interest. Brain activities can vary more frequently, forming multiple samples of brain signals supported on a common underlying network. The benefits of capturing network information in analyzing brain signals have been recognized and exploited in multiple domains including dictionary learning [5], rating predictions in recommendation systems [6], semi-supervised learning [7], image processing and denoising [8], image and video compression [9], and breast cancer diagnostics [10]. Graph signal processing generalizes notions of frequency and filters conventionally described for signals residing on regular domains, e.g., discrete time and image grids, to irregular graph and network domains. Different graph frequency components entail different levels of spatial variation with respect to the brain network. Graph frequency provides a method to equivalently represent a graph signal in two different domains: the vertex domain consisting of the nodes in the network, and the graph spectral domain incorporating graph frequencies. Therefore, brain signals can be manipulated in the spectral domain to shape their frequency content and favor different levels of interaction between neighbors in the network. Conventional frequency shaping has mainly been focused on two areas: (i) removing high frequency components with high variation to de-noise speech [11] or images [12], which has recently been generalized to improve the accuracy of recommendation systems [6] and breast cancer detection [10], or (ii) extracting frequency signatures that explain the most variation in observations to compress data [13] and improve classification [14]. These frequency methods have not been applied to analyze brain signals and networks to date, and it is not clear which kinds of variation of brain signals on brain networks are constructive or destructive for neurophysiological activities such as perception, recognition, motor performance, or general learning.

The main problem addressed in this paper is the application of graph signal processing tools to analyze brain signals and networks during motor learning. We present formal definitions of graph frequency and frequency decomposition of graph signals into pieces corresponding to different levels of spatial variation (Section II), and establish connections between graph signal processing and principal component analysis (Section II-C) when the networks denote functional connectivity in particular. The two experiments involving different motor tasks and different sets of participants considered in the paper are described (Section III). The graph frequencies of the functional brain networks of the participants are visualized and analyzed (Section IV), and are found to be highly associated across scan sessions in the same dataset and across datasets. A simple model to construct artificial networks with a few network descriptive parameters is described (Section IV-A) and used to analyze spectral properties of the brain networks (Section IV-B). The paper then utilizes graph frequency decomposition to visualize and investigate brain activities with different levels of spatial variation (Section V). It is observed that the decomposed signals associated with different graph frequencies exhibit markedly different levels of adaptability throughout learning (Section V-A), which is consistent across temporal scales and experiments. We also define the learning capabilities of subjects, and examine the importance of brain frequency components at different levels of task familiarity by evaluating their respective association with learning performance (Section VI). A strong correlation exists, and graph signal processing is utilized to recognize the most crucial frequency signatures at different levels of task familiarity.

Work in this paper is supported by NSF CCF, PHS NS44393, ARO ICB W911NF-9-1, the John D. and Catherine T. MacArthur Foundation, the Alfred P. Sloan Foundation, ARO ARL W911NF, ARO W911NF, the Office of Naval Research Young Investigator Program, NIH R1-HD86888, NSF BCS, and NSF BCS. The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies. W. Huang, L. Goldsberry, D. S. Bassett, and A. Ribeiro are with the Department of Electrical and Systems Engineering, University of Pennsylvania, 200 South 33rd Street, Philadelphia, PA. N. F. Wymbs is with the Department of Physical Medicine and Rehabilitation, Johns Hopkins University, Baltimore, MD 21218. S. T. Grafton is with the Department of Psychological and Brain Sciences, University of California at Santa Barbara, Santa Barbara, CA. whuang@seas.upenn.edu, goleah@seas.upenn.edu, nwymbs1@jhmi.edu, scott.grafton@psych.ucsb.edu, dsb@seas.upenn.edu, and aribeiro@seas.upenn.edu.

II. GRAPH SIGNAL PROCESSING

The interest of this paper is to study brain signals in which we are given a collection of measurements $x_i$ associated with each cortical region out of $n$ different brain regions. An example signal of this type is an fMRI reading in which $x_i$ estimates the level of activity of brain region $i$. The collection of $n$ measurements is henceforth grouped in the vector signal $x = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^n$. A fundamental feature of the signal $x$ is the existence of an underlying pattern of structural or functional connectivity that couples the values of the signal $x$ at different brain regions. Irrespective of whether connectivity is functional or structural, our goal here is to describe tools that utilize this underlying brain network to analyze patterns in the neurophysiological signal $x$. We do so by modelling connectivity between brain regions with a network that is connected, weighted, undirected, and symmetric. Formally, we define a network as the pair $G = (\mathcal{V}, \mathcal{W})$, where $\mathcal{V} = \{1, \ldots, n\}$ is a set of $n$ vertices or nodes representing individual brain regions and $\mathcal{W} : \mathcal{V} \times \mathcal{V} \to \mathbb{R}_+$ is a map from the set of edges to the nonnegative reals that associates a weight $w_{ij}$ with the edge $(i,j) \in \mathcal{V} \times \mathcal{V}$. Since the network is undirected and symmetric we have that $w_{ij} = w_{ji}$ for all $(i,j)$. The weights $w_{ij} = w_{ji}$ represent the strength of the connection between regions $i$ and $j$, or, equivalently, the proximity or similarity between nodes $i$ and $j$. In terms of the signal $x$, this means that when the weight $w_{ij}$ is large, the signal values $x_i$ and $x_j$ tend to be related. Conversely, when the weight $w_{ij}$ is small or, in the extremum, when we have $w_{ij} = 0$, the signal values $x_i$ and $x_j$ are not directly related except for what is implied by their separate connections to other nodes.

We adopt the conventional definitions of the adjacency, degree, and Laplacian matrices for the weighted graph $G = (\mathcal{V}, \mathcal{W})$ [15, Chapter 1]. The adjacency matrix $A \in \mathbb{R}^{n \times n}_+$ is defined such that $A_{ij} = w_{ij}$ when $i \neq j$ and $A_{ij} = 0$ otherwise. The degree matrix $D \in \mathbb{R}^{n \times n}_+$ is a diagonal matrix with its $i$th diagonal element $D_{ii} = \sum_j w_{ij}$ denoting the sum of all the weights of edges out of node $i$. The Laplacian matrix is defined as the difference $L := D - A \in \mathbb{R}^{n \times n}$. The components of the Laplacian matrix are explicitly given by

$$L_{ij} = -w_{ij} \ \text{ for } i \neq j, \qquad L_{ii} = \sum_{j=1}^{n} w_{ij}. \quad (1)$$

Observe that the Laplacian is real, symmetric, diagonally dominant, and with strictly positive diagonal elements. As such, all of its eigenvalues are real and nonnegative, which implies that the matrix $L$ is positive semidefinite. Further observe that $L\mathbf{1} = \mathbf{0}$, with $\mathbf{1}$ and $\mathbf{0}$ denoting the vectors of all ones and all zeros respectively, which implies that $0$ is an eigenvalue of $L$ and that, as a consequence, $L$ is not positive definite. The eigenvector decomposition of $L$ is utilized in the following section to define the graph Fourier transform and the associated notion of graph frequencies.

A. Graph Fourier Transform and Graph Frequencies

Begin by considering the set $\{\lambda_k\}_{k=0,1,\ldots,n-1}$ of eigenvalues of the Laplacian $L$ and assume they are ordered so that $0 = \lambda_0 \leq \lambda_1 \leq \ldots \leq \lambda_{n-1}$. Each of these eigenvalues has an associated unit-norm eigenvector $v_k$ such that $L v_k = \lambda_k v_k$. Define the diagonal eigenvalue matrix $\Lambda := \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{n-1})$ and the eigenvector matrix

$$V := [v_0, v_1, \ldots, v_{n-1}]. \quad (2)$$

Because the graph Laplacian $L$ is real and symmetric it accepts the eigenvalue decomposition

$$L = V \Lambda V^H, \quad (3)$$

where $V^H$ represents the Hermitian (conjugate transpose) of the matrix $V$. The validity of (3) follows because the eigenvectors of symmetric matrices are orthogonal, so that the definition in (2) implies that $V^H V = I$. The eigenvector matrix $V$ is used to define the Graph Fourier Transform of the graph signal $x$ as we formally state next; see, e.g., [16].

Definition 1. Given a signal $x \in \mathbb{R}^n$ and a graph Laplacian $L \in \mathbb{R}^{n \times n}$ accepting the decomposition in (3), the Graph Fourier Transform (GFT) of $x$ with respect to $L$ is the signal

$$\hat{x} = V^H x. \quad (4)$$

The inverse GFT (iGFT) of $\hat{x}$ with respect to $L$ is defined as

$$x = V \hat{x}. \quad (5)$$

We say that $x$ and $\hat{x}$ form a GFT transform pair. Observe that since $V V^H = I$, the iGFT is, indeed, the inverse of the GFT. Given a signal $x$ we can compute the GFT as per (4). Given the transform $\hat{x}$ we can recover the original signal $x$ through the iGFT in (5).

There are several reasons that justify the association of the GFT with the Fourier transform. Mathematically, it is just a matter of definition that if the vectors $v_k$ in (3) are of the form $v_k = [1, e^{j2\pi k/n}, \ldots, e^{j2\pi k(n-1)/n}]^T$, the GFT and iGFT in Definition 1 reduce to the conventional time domain Fourier and inverse Fourier transforms. More deeply, it is not difficult to see that if the graph $G$ is a cycle, the vectors $v_k$ in (3) are indeed of this form. Since cycle graphs are representations of discrete periodic signals, it follows that the GFT of a time signal is equivalent to the conventional discrete Fourier transform; see, e.g., [17].

An important property of the GFT is that it encodes a notion of variability akin to the notion of variability that the Fourier transform encodes for temporal signals. To see this, define $\hat{x} = [\hat{x}_0, \ldots, \hat{x}_{n-1}]^T$ and expand the matrix product in (5) to express the original signal $x$ as

$$x = \sum_{k=0}^{n-1} \hat{x}_k v_k. \quad (6)$$

It follows from (6) that the iGFT allows us to write the signal $x$ as a sum of orthogonal components $v_k$ in which the contribution of $v_k$ to the signal $x$ is the GFT component $\hat{x}_k$. In conventional Fourier analysis, the eigenvectors $v_k = [1, e^{j2\pi k/n}, \ldots, e^{j2\pi k(n-1)/n}]^T$ carry a specific notion of variability encoded in the notion of frequency. When $k$ is close to zero (low frequencies), the corresponding complex exponential eigenvectors are smooth, slowly varying functions. When $k$ is close to $n$ (high frequencies), the corresponding complex exponential eigenfunctions fluctuate more rapidly in the discrete temporal domain. Thus, when we look at the frequency components $\hat{x}_k$ with small $k$ we are looking at how much slowly varying oscillations contribute to $x$; when we look at $\hat{x}_k$ with large $k$, we are examining the contribution of rapidly varying oscillations to the signal $x$.
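As a concrete illustration of Definition 1, the GFT and iGFT can be computed directly from the eigendecomposition of the Laplacian. The following is a minimal sketch in Python/NumPy on a small synthetic network; the weight matrix and signal are placeholders rather than data from the paper.

```python
import numpy as np

# Symmetric, weighted adjacency matrix of a small toy network (placeholder data).
A = np.array([[0.0, 0.8, 0.2, 0.0],
              [0.8, 0.0, 0.5, 0.1],
              [0.2, 0.5, 0.0, 0.9],
              [0.0, 0.1, 0.9, 0.0]])

D = np.diag(A.sum(axis=1))        # degree matrix
L = D - A                         # graph Laplacian, cf. (1)

# Eigendecomposition L = V Lambda V^H with eigenvalues in increasing order, cf. (2)-(3).
lam, V = np.linalg.eigh(L)        # eigh returns ascending eigenvalues for symmetric L

x = np.array([1.0, 0.9, 0.2, 0.1])   # a graph signal on the 4 nodes
x_hat = V.conj().T @ x               # GFT, cf. (4)
x_rec = V @ x_hat                    # iGFT, cf. (5)

assert np.allclose(x, x_rec)         # V V^H = I, so the iGFT inverts the GFT
print("graph frequencies:", lam)
print("GFT coefficients :", x_hat)
```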

In the graph setting, the graph Laplacian eigenvectors provide a similar notion of frequency. Indeed, define the total variation of the graph signal $x$ with respect to the Laplacian $L$ as

$$\mathrm{TV}(x) = x^H L x = \sum_{i < j} w_{ij} (x_i - x_j)^2, \quad (7)$$

where in the second equality we expanded the quadratic form and used the definition of the Laplacian entries in (1). It follows that the total variation $\mathrm{TV}(x)$ is a measure of how much the signal changes as measured with respect to the expectation on variability that is encoded by the weights $w_{ij}$. When $w_{ij}$ is large we expect $x_i$ and $x_j$ to be close, because a large weight $w_{ij}$ encodes the structure of the functional relationship between brain regions $i$ and $j$. If they are not related, the contribution of their difference to the total variation is amplified by the weight $w_{ij}$. If the weight $w_{ij}$ is small or null, the difference between the signal values $x_i$ and $x_j$ makes little or no contribution to the total variation. We can then think of a signal with small total variation as one that changes slowly over the graph and of signals with large total variation as those that change rapidly over the graph.

Consider now the total variation of the eigenvectors $v_k$ and use the facts that $L v_k = \lambda_k v_k$ and that $v_k^H v_k = 1$ to conclude that

$$\mathrm{TV}(v_k) = v_k^H L v_k = \lambda_k. \quad (8)$$

It follows from (8) and the fact that the eigenvalues are ordered as $0 = \lambda_0 \leq \lambda_1 \leq \ldots \leq \lambda_{n-1}$ that the total variations of the eigenvectors $v_k$ follow the same order. Combining this observation with the discussion following (7), we conclude that when $k$ is close to $0$, the eigenvectors $v_k$ vary slowly over the graph, whereas for $k$ close to $n$ the eigenvectors vary more rapidly.

Returning now to (6) we see that the GFT and iGFT allow us to decompose the brain signal $x$ into components that characterize different levels of variability. We say that the eigenvectors associated with small eigenvalues $\lambda_k$ in (6) represent low graph frequencies because they vary slowly on the network composed of connections between brain regions. The corresponding GFT coefficients $\hat{x}_k$ indicate how much these slowly varying signals contribute to the observed brain signal $x$. Likewise, the eigenvectors associated with large eigenvalues $\lambda_k$ in (6) represent high graph frequencies because they vary rapidly with respect to the underlying network of brain connectivity. The GFT coefficients $\hat{x}_k$ indicate how much these rapidly varying signals contribute to the observed brain signal $x$. The frequency analysis of temporal signals is useful because it permits analysis in terms of components that represent different levels of variability. We will demonstrate in Sections IV, V, and VI that graph frequency decompositions of brain signals enable analogous characterizations. Before that, we describe some basic notions of graph filtering and graph frequency decompositions in the following section.

B. Graph Filtering and Frequency Decompositions

Given a graph signal $x$ with its graph frequency components $\hat{x}$, we can take its frequency components corresponding to the lowest $K_L$ graph frequencies by defining

$$\hat{h}_L(k) = \begin{cases} 1, & \text{if } k \leq K_L - 1; \\ 0, & \text{otherwise}, \end{cases} \quad (9)$$

and performing the multiplication $\hat{x}_L = \hat{H}_L \hat{x}$ with $\hat{H}_L = \mathrm{diag}(\hat{h}_L)$ being the diagonal matrix of $\hat{h}_L$. This operation happens in the graph spectral domain. Utilizing the definitions of the GFT (4) and iGFT (5), this is equivalent to performing the following operations in the graph vertex domain:

$$x_L = V \hat{x}_L = V \hat{H}_L V^H x := H_L x, \quad (10)$$

with $H_L = V \hat{H}_L V^H$. Therefore, the graph signal $x_L$ associated with the low graph frequencies of $x$ can be considered as the multiplication of $H_L$ with $x$.

Fig. 1. An illustration of the frequency responses $\hat{h}_L$, $\hat{h}_M$, and $\hat{h}_H$ of the family of three filters used in graph signal frequency decomposition, with pass bands ending at $K_L$, $K_L + K_M$, and $n$, respectively. Because the gain of the pass bands is always 1 and the filters are mutually exclusive yet collectively exhaustive over the graph spectrum, they provide a method to decompose a given graph signal into pieces exhibiting different levels of fluctuation with respect to the underlying graph.

If we denote by $\Psi$ the Vandermonde matrix

$$\Psi = \begin{bmatrix} 1 & \lambda_0 & \cdots & \lambda_0^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_{n-1} & \cdots & \lambda_{n-1}^{n-1} \end{bmatrix} \quad (11)$$

and define $h_L = \Psi^{-1} \hat{h}_L$, then $H_L$ can also be written as $H_L = \sum_{k=0}^{n-1} h_L(k) L^k$. The inverse $\Psi^{-1}$ exists if the graph Laplacian does not have duplicate eigenvalues, which is typically the case for functional brain networks. The structure of the Vandermonde matrix $\Psi$ implies that the coefficients $h_L$ corresponding to the frequency response $\hat{h}_L$ defined in (9) also concentrate on $h_L(k)$ for small index values $k$. Therefore, the graph signal $x_L$ corresponding to the low graph frequencies of $x$ can be interpreted as the aggregated effect of $x$ with powers of the graph Laplacian $L^k$ for small values of $k$. Because $L^0 x = x$ recovers the original signal, $Lx$ entails local interactions between neighbors, $L^2 x$ describes interactions between nodes via intermediate common neighbors, $L^3 x$ denotes communications between 3-hop neighbors, and so on, $x_L$ results from the localized interactions of $x$ in the graph vertex domain. From the graph spectral domain perspective, $x_L$ only has low graph frequency components of $x$ and therefore can be considered as the result of passing $x$ through a graph low-pass filter.

Filters have been utilized in traditional signal processing since the birth of the field to analyze data such as acoustic signals, speech, or images. Filters are favored because they allow one to shape frequencies: by choosing different coefficients at different frequencies, the output of the filter emphasizes different types of interactions. Graph signal processing enables the generalization of filters from regular domains, e.g., discrete time or image grids, to irregular domains, i.e., graphs and networks, but keeps the interpretation of filters as favoring different interactions between neighboring nodes.
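To make the low-pass operation in (9) and (10) concrete, the sketch below masks the GFT coefficients and also forms the equivalent vertex-domain operator $H_L = V \hat{H}_L V^H$; it assumes the eigenvector matrix V from the previous snippet and is illustrative only.

```python
import numpy as np

def graph_lowpass(V, x, K_L):
    """Keep the K_L lowest graph frequencies of x, cf. (9)-(10)."""
    n = V.shape[0]
    h_L = np.zeros(n)
    h_L[:K_L] = 1.0                       # ideal low-pass frequency response, cf. (9)
    x_hat = V.conj().T @ x                # GFT of the signal
    x_L = V @ (h_L * x_hat)               # iGFT of the masked coefficients
    H_L = V @ np.diag(h_L) @ V.conj().T   # equivalent vertex-domain operator, cf. (10)
    assert np.allclose(x_L, H_L @ x)      # spectral- and vertex-domain filtering agree
    return x_L, H_L

# Example: keep the two lowest graph frequencies of the signal x from the previous sketch.
# x_L, H_L = graph_lowpass(V, x, K_L=2)
```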

The aforementioned graph low-pass filter $H_L$ prefers localized interactions on the underlying network. Other types of graph filters can be defined analogously, leaning towards interactions between nodes in the network other than localized ones. In this paper, apart from the graph low-pass filter $H_L$, we also consider the graph band-pass filter $H_M$ and high-pass filter $H_H$, whose graph frequency responses are defined as

$$\hat{h}_M(k) = \begin{cases} 1, & \text{if } K_L \leq k \leq K_L + K_M - 1; \\ 0, & \text{otherwise}; \end{cases} \quad (12)$$

$$\hat{h}_H(k) = \begin{cases} 1, & \text{if } k \geq K_L + K_M; \\ 0, & \text{otherwise}. \end{cases} \quad (13)$$

In words, the low-pass filter takes the lowest $K_L$ graph frequencies, the band-pass filter captures the middle $K_M$ graph frequencies, and the high-pass filter only considers the highest $n - K_L - K_M$ frequencies. An illustration is shown in Figure 1. Notice that the three filters are defined such that the graph frequencies of their respective interest are mutually exclusive yet collectively exhaustive. As a result, if we denote by $x_L = H_L x$ the graph signal filtered by the graph low-pass filter $H_L$, and define $x_M$ as well as $x_H$ similarly, the fact that the frequency gain in the pass bands is always 1 implies that $\hat{x} = \hat{x}_L + \hat{x}_M + \hat{x}_H$ in the graph spectral domain. Equivalently, in the graph vertex domain,

$$x = x_L + x_M + x_H. \quad (14)$$

Therefore, the family of graph filters $H_L$, $H_M$, and $H_H$ provides us a method to decompose a given graph signal $x$ into three nonduplicating pieces, which respectively correspond to the part of the signal that is smooth and slowly varying on the network ($x_L$), the part that fluctuates rapidly with respect to the graph ($x_H$), and the remaining part ($x_M$). The frequency decomposition introduced in this section is also highly related to principal component analysis (PCA) [18] when the brain networks are constructed from functional connectivity estimates, as we describe in the next section.
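Before moving on, here is a minimal sketch of the three-band decomposition in (12)-(14), again assuming an eigenvector matrix V ordered by increasing graph frequency; the band sizes K_L and K_M are free parameters chosen by the analyst.

```python
import numpy as np

def frequency_decompose(V, x, K_L, K_M):
    """Split x into low, middle, and high graph frequency pieces, cf. (12)-(14)."""
    n = V.shape[0]
    x_hat = V.conj().T @ x                       # GFT coefficients
    bands = {"low": slice(0, K_L),
             "mid": slice(K_L, K_L + K_M),
             "high": slice(K_L + K_M, n)}
    pieces = {}
    for name, band in bands.items():
        mask = np.zeros(n)
        mask[band] = 1.0                         # ideal pass band with unit gain
        pieces[name] = V @ (mask * x_hat)        # back to the vertex domain
    # The pass bands are mutually exclusive and collectively exhaustive,
    # so the pieces sum back to the original signal, cf. (14).
    assert np.allclose(pieces["low"] + pieces["mid"] + pieces["high"], x)
    return pieces["low"], pieces["mid"], pieces["high"]
```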
C. Relationship to Principal Component Analysis

Functional brain networks have recently been extensively studied [19], [20]. In such scenarios, graph observations $Y \in \mathbb{R}^{p \times n}$ at $p$ time points are available, with the $i$th column $y_i$ denoting the observations at node $i$. The functional brain networks are then constructed so that the weight of an edge $w_{ij}$ is given by the Pearson correlation coefficient between the BOLD time series of node $i$ and the BOLD time series of node $j$. In this section, we connect PCA for brain signals [21], [22] with graph frequency decomposition when the networks are constructed from correlations, and demonstrate that PCA for brain networks is similar to a specific type of graph frequency decomposition.

To begin, given the observation matrix $Y$, PCA considers the covariance matrix $\Sigma = \mathrm{cov}(Y) \in \mathbb{R}^{n \times n}$, with its entry $\Sigma_{ij}$ computing the covariance of the observations $y_i$ and $y_j$, and evaluates the expansion of observations onto the set of eigenvectors $\{v_k^\Sigma\}$ of $\Sigma$. Functional brain networks are constructed with the adjacency matrix $A$ defined as

$$A_{ij} = \begin{cases} \text{correlation coefficient of } y_i \text{ and } y_j, & \text{if } i \neq j; \\ 0, & \text{otherwise}. \end{cases} \quad (15)$$

Denote by $\hat{\Sigma} = \mathrm{corrcoef}(Y) \in \mathbb{R}^{n \times n}$ the matrix representing the correlation coefficients between columns in $Y$; then $A$ can also be expressed as $A = \hat{\Sigma} - I$, since the correlation coefficient between $y_i$ and itself is always 1 for any $i$. Moreover, because the variances $\sigma(y_i)$ and $\sigma(y_j)$ at different nodes are not highly distinguished for different nodes $i$ and $j$ in the observations, the covariance between $y_i$ and $y_j$ is not significantly different from their correlation coefficient. Consequently, we can write $\hat{\Sigma} \approx \alpha_1 \Sigma$ for some real number $\alpha_1 \in \mathbb{R}$.

In graph signal processing, the network structure of interest is not the adjacency matrix $A$ but rather the graph Laplacian $L = D - A$. For functional brain networks, if the degree distribution is homogeneous, i.e., the entry $D_{ii}$, denoting the summation of correlations of the observation $y_i$ at node $i$ with the observations $y_l$ at all other nodes $l \neq i$, is not highly different from $D_{jj}$ for $j \neq i$, we can write $D \approx \alpha_2 I$ for some real number $\alpha_2 \in \mathbb{R}$. Although this is not always the case for functional brain networks, the degree distributions of most functional brain networks are not highly skewed. Therefore, combining the constructed relationships, we can write

$$L \approx (\alpha_2 + 1) I - \alpha_1 \Sigma. \quad (16)$$

This implies that the set of graph eigenvectors $\{v_k\}$ of $L$ is not highly different from the set of eigenvectors $\{v_k^\Sigma\}$. We emphasize, however, that the order of the eigenvectors in $\{v_k\}$ and $\{v_k^\Sigma\}$ flips, i.e., the graph eigenvector $v_k$ associated with a low frequency corresponds to the eigenvector $v_k^\Sigma$ associated with a high eigenvalue of $\Sigma$. Recall that in PCA, given a graph signal $x \in \mathbb{R}^n$, one is interested in computing its expansion $\langle x, v_k^\Sigma \rangle$ in terms of the eigenvectors $v_k^\Sigma$ associated with high eigenvalues, since these eigenvectors explain most of the variation of the observations. Therefore, PCA for functional brain networks is similar to evaluating the graph frequency coefficients for low graph frequencies, i.e., focusing on the segment of the brain signal $x$ that varies smoothly on the underlying network. Graph frequency decomposition provides a more general framework, since it enables shaping not only the low frequencies but all graph frequencies through the introduction of graph filters.

In this paper, we use notions of graph frequencies to study the graph spectral properties of brain networks as subjects master a new motor skill. Furthermore, we apply the frequency decomposition introduced in (14) to examine the adaptability of brain signals and their relationship to individual variability in learning rate. We begin by describing the task that subjects performed, and the neuroimaging data that were acquired.
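As a numerical aside before turning to the data, the approximation argument above can be checked on synthetic observations. The sketch below builds A from correlation coefficients as in (15), forms L = D - A, and compares each low-frequency Laplacian eigenvector with the corresponding high-variance eigenvector of the covariance matrix, remembering that the ordering flips. The data are simulated, not the experimental recordings, and the agreement is only approximate: it depends on how homogeneous the variances and degrees are.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 500, 20                                  # p time points, n regions (synthetic sizes)
Y = rng.standard_normal((p, n)) @ rng.standard_normal((n, n))  # columns with correlations
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)        # standardize so covariance ~ correlation

Sigma_hat = np.corrcoef(Y, rowvar=False)        # correlation coefficients between columns
A = Sigma_hat - np.eye(n)                       # adjacency as in (15)
L = np.diag(A.sum(axis=1)) - A                  # graph Laplacian

lam, V = np.linalg.eigh(L)                      # graph frequencies, ascending
lam_S, V_S = np.linalg.eigh(np.cov(Y, rowvar=False))  # covariance eigenvalues, ascending

# k-th lowest graph frequency eigenvector vs. k-th largest principal component.
for k in range(3):
    overlap = abs(V[:, k] @ V_S[:, -(k + 1)])
    print(f"|<v_{k}, v_Sigma_{k}>| = {overlap:.3f}")  # near 1 when the approximation holds
```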

III. BRAIN SIGNALS DURING LEARNING

We considered two experiments in which subjects learned a simple motor task [23]-[27]. In the experiments, forty-seven right-handed participants (29 female, 18 male; mean age years) volunteered with informed consent in accordance with the University of California, Santa Barbara Internal Review Board. After exclusions for task accuracy, incomplete scans, and abnormal MRI, 38 participants were retained for subsequent analysis.

Twenty individuals participated in the first experimental framework. The experiment lasted 6 weeks, in which there were 4 scanning sessions, roughly at the start of the experiment, at the end of the 2nd week, at the end of the 4th week, and at the end of the experiment, respectively. During each scanning session, individuals performed a discrete sequence-production task in which they responded to sequentially presented stimuli with their dominant hand on a custom response box. Sequences were presented using a horizontal array of 5 square stimuli with the responses mapped from left to right such that the thumb corresponded to the leftmost stimulus. The next square in the sequence was highlighted immediately following each correct key press; the sequence was paused awaiting the depression of the appropriate key if an incorrect key was pressed. Each participant completed 6 different 10-element sequences, with each sequence consisting of two squares per key. Participants performed the same sequences at home between each two adjacent scanning sessions, however with different levels of exposure for different sequence types. Therefore, the number of trials completed by the participants after the end of each scanning session depends on the sequence type. There are 3 different sequence types (MIN, MOD, EXT) with 2 sequences per type. The number of trials of each sequence type completed after each scanning session, averaged over the 20 participants, is summarized in Figure 2.

Fig. 2 (table). Relationship between training duration, intensity, and depth for the first experimental framework; rows correspond to the MIN, MOD, and EXT sequence types and columns to scanning sessions 1 through 4. The values in the table denote the number of trials (i.e., depth) of each sequence type (i.e., intensity) completed after each scanning session (i.e., duration), averaged over the 20 participants.

Eighteen subjects participated in the second experimental framework. The experiment had 3 scanning sessions spanning three days. Each scanning session lasted roughly 2 hours and no training was performed at home between adjacent scanning sessions. Subjects responded to a visually cued sequence by generating responses using the four fingers of their nondominant hand on a custom response box. Visual cues were presented as a series of musical notes on a pseudo-musical staff with four lines such that the top line of the staff mapped to the leftmost key pressed with the pinkie finger. Each 12-note sequence, randomly ordered, contained three notes per line.

In both experiments participants were instructed to respond promptly and accurately. Repetitions (e.g., "11") and regularities such as trills (e.g., "121") and runs (e.g., "123") were excluded in all sequences. The order and number of sequence trials was identical for all participants. During scanning sessions, all participants completed the tasks inside the MRI scanner. Recordings with fMRI were conducted using a 3.0 T Siemens Trio with a 12-channel phased-array head coil. For each functional run, a single-shot echo planar imaging sequence that is sensitive to blood oxygen level dependent (BOLD) contrast was utilized to obtain 37 (the first experiment) or 33 (the second experiment) slices (3 mm thickness) per repetition time (TR), with a TR of 2,000 ms, an echo time of 30 ms, a flip angle of 90 degrees, a field of view of 192 mm, and an acquisition matrix. Image preprocessing was performed using the Oxford Center for Functional Magnetic Resonance Imaging of the Brain (FMRIB) Software Library (FSL), and motion correction was performed using FMRIB's linear image registration tool. The whole brain is parcellated into a set of $n = 112$ regions of interest that correspond to the 112 cortical and subcortical structures anatomically identified in FSL's Harvard-Oxford atlas.

For each individual fMRI dataset, we estimate regional mean BOLD time series by averaging voxel time series in each of the $n$ regions. We evaluate the coherence or correlation between the activity of all possible pairs of regions to construct $n \times n$ functional connectivity matrices $A$, whose individual elements are subjected to statistical testing. In the 3 day dataset, the value of all elements that do not pass the false discovery rate correction for multiple comparisons is set to zero; the values remain unchanged otherwise. In the 3 day experiment, a single brain network is constructed for each participant. In the 6 week experiment, due to the long duration of the experiment, we build a different brain network per scanning session per sequence type for each subject. We normalize the regional mean BOLD observations $\hat{x}(t)$ at any time $t$ and consider $x(t) = \hat{x}(t)/\|\hat{x}(t)\|_2$, such that the total energy of activities at all structures is consistent at different $t$, to avoid extreme spikes due to head motion or drift artifacts in fMRI.

IV. BRAIN NETWORK FREQUENCIES

In this section, we examine the fundamental building block of graph spectral analysis of brain networks, the graph Laplacian frequencies associated with brain networks, and verify the interpretation of the notions and orders of graph frequencies. For the brain network $A$ of each subject, we construct its Laplacian $L = D - A$ as in (1) and compute the eigenvectors $\{v_k\}_{k=0}^{n-1}$. In Section II-A, we stated that in the graph setting the graph Laplacian eigenvectors provide a notion of frequency: $v_k$ associated with a smaller index $k$ represents a low graph frequency and varies slowly across the network. To verify this interpretation, we compute the total variation $\mathrm{TV}(v_k)$ [cf. (8)] for each eigenvector $v_k$.

Fig. 3. (a) Total variation TV($v_k$) and (b) weighted zero crossings ZC($v_k$) of the graph Laplacian eigenvectors for the brain networks averaged across participants in the 6 week training experiment; the x-axis is the eigenvalue index. (c) and (d) present the corresponding values for the 3 day experiment. In both cases, the Laplacian eigenvectors associated with larger indexes (larger eigenvalues) vary more on the network and cross zero relatively more often, confirming the interpretation of the graph Laplacian eigenvalues as notions of frequencies.

Figure 3 (a) and (c) plot the total variation of all graph eigenvectors averaged across participants of the 6 week training experiment and the 3 day experiment, respectively. In both experiments, the Laplacian eigenvectors associated with larger indexes, and consequently larger eigenvalues, fluctuate more on the network. This corroborates our claim that Laplacian eigenvalues offer a notion of graph frequencies.

Besides total variation, the number of zero crossings is often used as a measure of the smoothness of signals with respect to an underlying network [16]. Since brain networks are weighted, we adopt a slightly modified version, weighted zero crossings, to investigate a given graph eigenvector $v_k$:

$$\mathrm{ZC}(v_k) = \frac{1}{2} \sum_{i \neq j} w_{ij} \, \mathbf{1}\{v_k(i)\, v_k(j) < 0\}. \quad (17)$$

In words, weighted zero crossings evaluate the weighted sum of the set of edges connecting a vertex with a positive signal value to a vertex with a negative signal value. The factor 1/2 is applied because the graph is symmetric and undirected, so each edge is counted twice in the sum. Figure 3 (b) and (d) show the weighted zero crossings of all graph eigenvectors averaged across subjects of the 6 week and 3 day experiments, respectively. The weighted zero crossings increase with the graph frequency index $k$ over most of the spectrum, as expected. However, eigenvectors associated with the very highest graph frequencies exhibit lower weighted zero crossings. A detailed analysis shows that this is because the functional brain networks are highly connected with nearly homogeneous degree distributions, and consequently each eigenvector of very high graph frequency tends to have a value of high magnitude at one vertex of high degree and similar values at the other nodes, resulting in smaller global zero crossings for the eigenvectors associated with the very highest frequencies.

It is also of interest to examine where the associated eigenvectors lie anatomically, and the relative strength of their values. To facilitate the presentation, we consider three sets of eigenvectors, $\{v_k\}_{k=0}^{K_L-1}$, $\{v_k\}_{k=K_L}^{K_L+K_M-1}$, and $\{v_k\}_{k=K_L+K_M}^{n-1}$, and compute the absolute magnitude at each of the $n$ cortical and subcortical regions averaged across participants and across all graph frequencies belonging to each of the three sets.

Fig. 4. Absolute magnitude at each of the n cortical structures averaged across participants in the 6 week experiment and averaged across all frequencies in (a) the set of low graph frequencies $\{v_k\}_{k=0}^{K_L-1}$, (b) the set of middle graph frequencies $\{v_k\}_{k=K_L}^{K_L+K_M-1}$, and (c) the set of high graph frequencies $\{v_k\}_{k=K_L+K_M}^{n-1}$. (d)-(f) present the average magnitudes for the 3 day experiment. Only brain regions with absolute magnitudes higher than a fixed threshold are colored. Magnitudes for high frequencies are highly alike across the datasets, and the brain regions with high magnitude values significantly overlap with the visual and sensorimotor modules.

Figure 4 presents the average magnitudes for the two experimental frameworks considered in the paper using BrainNet [28], where brain regions with absolute magnitudes lower than a fixed threshold are not colored. The most interesting finding relates to the eigenvectors associated with high graph frequencies. The magnitudes at different brain regions for high frequencies are significantly similar across the two datasets investigated, and the brain regions with high magnitude values largely coincide with the visual and sensorimotor cortices [29]. This is likely to be a consequence of the fact that visual and motor regions are more strongly connected with other structures, and hence an eigenvector with a high magnitude on visual or motor structures would result in high global spatial variation. The eigenvectors of low graph frequencies are more spread across the networks, resulting in low global variations. The middle graph frequencies are less interesting: the magnitudes at most regions do not pass the threshold, and little association can be found between the eigenvectors of the 6 week and 3 day experiments.
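Both diagnostics used in this section, the total variation in (8) and the weighted zero crossings in (17), are straightforward to compute once the Laplacian eigenvectors are available; a small sketch, assuming A, L, and the eigenpairs (lam, V) from the earlier snippets:

```python
import numpy as np

def total_variation(L, v):
    """TV(v) = v^H L v, cf. (7)-(8); equals lambda_k when v is the k-th eigenvector."""
    return float(np.real(v.conj() @ L @ v))

def weighted_zero_crossings(A, v):
    """ZC(v) = 1/2 * sum_{i != j} w_ij * 1{v(i) v(j) < 0}, cf. (17)."""
    sign_change = np.outer(v, v) < 0           # True where v takes opposite signs at i and j
    return 0.5 * float(np.sum(A * sign_change))

# tv = [total_variation(L, V[:, k]) for k in range(V.shape[1])]   # reproduces lam
# zc = [weighted_zero_crossings(A, V[:, k]) for k in range(V.shape[1])]
```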
A. Artificial Functional Brain Networks

An approach to analyzing complex networks that are not well fit by a few graph generation or growth parameters [30], [31] is to define a model to generate artificial networks. Examples of such models include the Erdős-Rényi model [32] for unweighted networks, the Barabási-Albert model for scale-free networks [33], the stochastic block model for community formation in unweighted networks [34], and recent developments and insights on weighted network models [31]. Here we present a framework to construct artificial networks that can be used to mimic the functional brain networks with only a few parameter inputs.

The desired output of the method is a symmetric network with edge weights between 0 and 1 and without self-loops. To begin, suppose the desired network has two different clusters of nodes $\mathcal{V}_1$ and $\mathcal{V}_o$. The algorithm requires the average edge weight $\mu_1$ for connections between nodes of the first cluster $\mathcal{V}_1$, the average edge weight $\mu_o$ for links between vertices of the other cluster $\mathcal{V}_o$, and the average edge weight $\mu_{1o}$ for inter-cluster connections. To reflect the fact that the edge weights on some links are independent of their joining vertices, for each edge within $\mathcal{V}_1$, with probability $p_\epsilon < 1$, its weight is randomly generated from the uniform distribution $\mathcal{U}[0, 1]$ between 0 and 1. For each edge within $\mathcal{V}_1$, with probability $1 - p_\epsilon$, its weight is randomly generated from the uniform distribution $\mathcal{U}[\mu_1 - \delta, \mu_1 + \delta]$. The parameter $\delta$ is selected such that the random variable has variance $\hat{\sigma}^2$, the desired variance in the brain networks. We also constructed networks such that edges for different types of connections are randomly generated with different $\delta$, resulting in different variance values; the networks exhibit negligible changes in their network properties, and for simplicity we use a constant variance for all kinds of connections. We also investigated the construction of networks using random variables drawn from other commonly used distributions, e.g., the Gaussian distribution and the Laplacian distribution. We found that the uniform distribution is able to imitate the networks most closely, and for this reason we use the uniform distribution here. The parameter $p_\epsilon$ determines the percentage of edges whose weights are selected irrespective of their actual locations, and we notice that a typical value between .025 and .1 is able to closely imitate functional brain networks. To further simulate the observation that different participants may possess distinctive brain networks, if the edge weight is randomly generated from a uniform distribution $\mathcal{U}[\mu_1 - \delta, \mu_1 + \delta]$, it is then perturbed by $w_u \sim \mathcal{U}[-u_\epsilon/2, u_\epsilon/2]$, where $u_\epsilon$ controls the level of perturbation; a typical value of $u_\epsilon$ closely mimicking the brain networks is between .05 and .15. The edge weights for connections within the cluster $\mathcal{V}_o$ are generated similarly, such that with probability $1 - p_\epsilon$ the edge weight is randomly chosen from the uniform distribution $\mathcal{U}[\mu_o - \delta, \mu_o + \delta]$ before being contaminated by $w_u \sim \mathcal{U}[-u_\epsilon/2, u_\epsilon/2]$. The edge weights for connections between the clusters $\mathcal{V}_1$ and $\mathcal{V}_o$ are formed analogously using $\mu_{1o}$. The method presented here can be easily generalized to analyze brain networks with more regions of interest, i.e., by specifying sets of regions of interest and by detailing the expected correlation values on each type of connection between different regions.

Remark 1. At one extreme we can make each node $i$ belong to a different set $\mathcal{V}_i = \{i\}$. Then the method requires as inputs the expected edge weights between any pair of nodes, or, alternatively speaking, the expected network. At the other extreme, there is only one set of nodes $\mathcal{V}$, and then the method is highly akin to a network with edge weights generated completely at random. Any construction of interest would have some prior knowledge regarding the community structure, something between the aforementioned two extremes. Therefore, the method proposed here can be used to see if a network constructed with a specific choice of community structure closely reproduces the key properties of the actual network, and can be used to examine the evolution of community structure in the brain with stimuli or throughout the process of mastering a particular task.
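A sketch of the two-cluster generator described above. The parameter names mirror the text (mu_1, mu_o, mu_1o, delta, p_eps, u_eps); the symmetrization and clipping steps are implementation choices added here to guarantee a symmetric network with weights in [0, 1] and no self-loops, and the example values at the end are illustrative rather than fitted to the brain data.

```python
import numpy as np

def artificial_network(n1, no, mu_1, mu_o, mu_1o, delta, p_eps=0.05, u_eps=0.1, seed=None):
    """Generate a symmetric weighted network with clusters V_1 (size n1) and V_o (size no)."""
    rng = np.random.default_rng(seed)
    n = n1 + no

    # Expected weight for each type of connection.
    mu = np.empty((n, n))
    mu[:n1, :n1] = mu_1                      # within the first cluster
    mu[n1:, n1:] = mu_o                      # within the other cluster
    mu[:n1, n1:] = mu[n1:, :n1] = mu_1o      # inter-cluster connections

    W = rng.uniform(mu - delta, mu + delta)           # weights around the cluster means
    W += rng.uniform(-u_eps / 2, u_eps / 2, (n, n))   # per-subject perturbation w_u
    noise = rng.uniform(0.0, 1.0, (n, n))             # location-independent weights
    W = np.where(rng.random((n, n)) < p_eps, noise, W)

    W = np.clip((W + W.T) / 2, 0.0, 1.0)     # symmetrize and keep weights in [0, 1]
    np.fill_diagonal(W, 0.0)                 # no self-loops
    return W

# Example with illustrative (not fitted) parameters for a 112-region network:
# W = artificial_network(n1=20, no=92, mu_1=0.6, mu_o=0.3, mu_1o=0.4, delta=0.1, seed=1)
```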
B. Spectral Properties of Brain Networks

In this section, we analyze the graph spectral properties of brain networks. To that end, given the graph Laplacian of an underlying graph structure, we examine the level of fluctuation of its eigenvectors on different types of connections in the brain network. Different sets of brain regions in the network are defined as modules [23]. More specifically, given an eigenvector $v_k$, its variation on the visual module is defined as

$$\mathrm{TV}_{\mathrm{visual}}(v_k) = \frac{\sum_{i,j \in \mathcal{V}_v,\, i \neq j} w_{ij} \big(v_k(i) - v_k(j)\big)^2}{\sum_{i,j \in \mathcal{V}_v,\, i \neq j} w_{ij}}, \quad (18)$$

where $\mathcal{V}_v$ denotes the set of nodes belonging to the visual module. Because $w_{ij} (v_k(i) - v_k(j))^2$ is the signal difference on the connection between nodes $i$ and $j$ within the visual module weighted by the edge weight $w_{ij}$, $\mathrm{TV}_{\mathrm{visual}}(v_k)$ computes the difference of the signals on the visual module per unit of edge weight. There are $n$ eigenvectors in total, and to facilitate interpretation we only consider the three sets of eigenvectors $\{v_k\}_{k=0}^{K_L-1}$, $\{v_k\}_{k=K_L}^{K_L+K_M-1}$, and $\{v_k\}_{k=K_L+K_M}^{n-1}$, corresponding to low graph frequencies, middle frequencies, and high frequencies, respectively. We then compute the visual module total variation $\mathrm{TV}^L_{\mathrm{visual}}$ averaged over the eigenvectors $\{v_k\}_{k=0}^{K_L-1}$, and $\mathrm{TV}^M_{\mathrm{visual}}$ as well as $\mathrm{TV}^H_{\mathrm{visual}}$ similarly. Besides $\mathrm{TV}_{\mathrm{visual}}$, we also examine the level of fluctuation of eigenvectors on edges uniquely within the motor module, denoted $\mathrm{TV}_{\mathrm{motor}}$, and on connections uniquely belonging to brain modules other than the visual and motor modules, i.e., $\mathrm{TV}_{\mathrm{others}}$. Further, there are links between two separate brain modules, and to assess the variation of eigenvectors on those links, we define the total variation between the visual and motor modules as

$$\mathrm{TV}_{\mathrm{visual\text{-}motor}}(v_k) = \frac{\sum_{i \in \mathcal{V}_v,\, j \in \mathcal{V}_m} w_{ij} \big(v_k(i) - v_k(j)\big)^2}{\sum_{i \in \mathcal{V}_v,\, j \in \mathcal{V}_m} w_{ij}}, \quad (19)$$

where $\mathcal{V}_m$ denotes the set of nodes belonging to the motor module. The total variation $\mathrm{TV}_{\mathrm{visual\text{-}others}}$ between the visual and other modules as well as the total variation $\mathrm{TV}_{\mathrm{motor\text{-}others}}$ between the motor and other modules are defined analogously. We chose to study the visual and motor modules separately from other brain modules because of their well-known associations with motor learning tasks [23], [35].

Figure 5 (a) presents boxplots of the variation of eigenvectors of different graph frequencies measured over different types of connections across participants, at the start of the six week training. Despite the fact that the total variation of the eigenvectors is identical to their corresponding eigenvalues (cf. (8)), and therefore should be higher for eigenvectors associated with higher frequencies, the variation on the other modules $\mathrm{TV}^L_{\mathrm{others}}$ of eigenvectors associated with low graph frequencies is higher than $\mathrm{TV}^H_{\mathrm{others}}$, the variation on the other modules of eigenvectors associated with high frequencies. This implies that the eigenvectors of low graph frequencies exhibit highly similar graph signal values for brain regions within the visual and motor modules, and that the growth in variation of the eigenvectors as the associated frequency increases is purely due to the increase of variation in the visual or motor modules. Put differently, low graph frequency components are expected to vary little, which is the case for the visual and motor modules, and the signals fluctuating significantly are more likely to be noise, which is the scenario for the other modules. Similar observations hold true for high frequency components.
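The module-restricted measures in (18) and (19) can be computed directly from the adjacency matrix; a sketch, assuming A is the n x n weighted adjacency matrix, v is one Laplacian eigenvector, and the node index lists for the modules (the visual_nodes and motor_nodes names below are hypothetical) come from the parcellation:

```python
import numpy as np

def tv_within(A, v, nodes):
    """Variation of eigenvector v on connections within one module, cf. (18)."""
    idx = np.asarray(nodes)
    W = A[np.ix_(idx, idx)]
    diff2 = (v[idx, None] - v[None, idx]) ** 2
    return float(np.sum(W * diff2) / np.sum(W))

def tv_between(A, v, nodes_a, nodes_b):
    """Variation of eigenvector v on connections between two modules, cf. (19)."""
    ia, ib = np.asarray(nodes_a), np.asarray(nodes_b)
    W = A[np.ix_(ia, ib)]
    diff2 = (v[ia, None] - v[None, ib]) ** 2
    return float(np.sum(W * diff2) / np.sum(W))

# tv_vis = tv_within(A, V[:, k], visual_nodes)
# tv_vis_mot = tv_between(A, V[:, k], visual_nodes, motor_nodes)
```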

Fig. 5. Spectral properties of brain networks in the 6 week experiment. (a) Left: Averaged total variation of the eigenvectors $v_k$ for 6 different types of connections of the brain, averaged across all eigenvectors associated with low graph frequencies $v_k \in \{v_k\}_{k=0}^{K_L-1}$ and across all participants and scan sessions. Middle: Across all eigenvectors with middle graph frequencies $v_k \in \{v_k\}_{k=K_L}^{K_L+K_M-1}$. Right: Across all eigenvectors with high graph frequencies $v_k \in \{v_k\}_{k=K_L+K_M}^{n-1}$. (b) Median total variations of brain networks across participants for different scanning sessions and different sequence types, plotted with respect to the level of exposure of participants to the sequence type at the scanning session. The relationship between training duration, intensity, and depth is summarized in Figure 2. The minimum value on the x-axis refers to the minimum exposure to sequences (all 3 sequence types of Session 1), and the maximum value on the x-axis denotes the maximum exposure to sequences (EXT sequences of Session 4). An association between the spectral properties of brain networks and the level of exposure is clearly observed. (c) Median total variations evaluated upon artificial networks. Spectral properties of actual brain networks can be closely simulated using a few parameters.

To better interpret the meaning of variations for specific types of connections, we construct artificial networks as described in Section IV-A with visual and motor modules as regions of interest, and consider the other modules to be brain regions other than the visual and motor modules. We observe that there are three contributing factors that cause the variation within a specific module to become higher for eigenvectors of higher frequencies and lower for eigenvectors of lower frequencies, i.e., to make the module more dominant in the variation of eigenvectors associated with high graph frequencies on the entire brain: (i) increases in the average edge weight for connections within the module, (ii) increases in the average edge weight for links between this module and other modules, and (iii) increases in the average edge weight for associations within the other modules. This can also be observed by analyzing closely the definition of total variation. If a module is highly connected, in order for an eigenvector associated with a low graph frequency to be smooth on the entire network, it has to be smooth on the specific module, resulting in a low value of the variation of an eigenvector associated with a low graph frequency with respect to the module of interest. Similarly, the increase in the variation on connections between two modules, e.g., between the visual and other modules, results from: (i) the growth in the average edge weight for connections between the visual and other modules, or (ii) the increase of the average weight for links within the other modules. We notice that the increase in the average weight for relations within the visual module does not contribute significantly, largely due to the fact that the number of nodes in the other modules is much larger than that in the visual module. Therefore, given a specific brain module, the level of variation of the eigenvectors associated with certain graph frequencies (how high for high graph frequencies and how low for low graph frequencies) can be interpreted as a measure of how strongly the module of interest is associated with modules that are connected with high edge weights: either the module of interest is highly connected itself, or it is connected with another brain module that is strongly associated.

The graph spectral properties in Figure 5 (a) are observed because (i) the visual and motor modules are themselves highly connected, and (ii) the visual module is also strongly linked with the motor module.

Next we study how the graph spectral properties of brain networks evolve as participants become more exposed to and familiar with the tasks. Figure 5 (b) illustrates the median of the variation of eigenvectors of different graph frequencies measured over different types of connections across subjects, at 10 different levels of exposure in the process of six week training. As participants become more acquainted with the assignment, their brain networks display lower variation in the visual and motor modules and higher variation in the other modules for low and middle graph frequencies, and the exact opposite is true for high graph frequencies. The association with training intensity is statistically significant (average Pearson correlation coefficient r = .8164, average p = .38). Following the interpretations based on the artificial network analysis, this evolution in the graph spectral properties of brain networks is mainly caused by the decrease in the values of connections within the visual and motor modules and between the visual and motor modules. An interesting observation is that the variation of eigenvectors associated with high frequencies declines with respect to the visual module much faster than with respect to the motor module, even though the visual module is more strongly connected throughout training compared to the motor module. A deeper analysis using artificial networks shows that this results from the following three factors: (i) though more strongly connected compared to the motor module, connections within the visual module weaken very quickly, (ii) the motor module is more closely connected with the other modules than the visual module is, and (iii) association levels within the other modules stay relatively constant. Therefore, as participants become more exposed to the tasks, compared to the visual module, the motor module becomes more strongly connected with modules that are themselves more associated.

The graph spectral properties of actual brain networks and their evolution can be closely imitated using artificial networks, as plotted in Figure 5 (c). We also used artificial networks with edge weights randomly generated from Gaussian distributions; the graph spectral properties of actual brain networks cannot be resembled as well, even with more parameters to tune in the construction. The analysis for the three day training dataset is highly similar and for this reason we do not present and analyze it separately here.

V. FREQUENCY DECOMPOSITION OF BRAIN SIGNALS

The previous sections focus on the study of brain networks and their graph spectral properties. In this section, we investigate brain signals from a graph signal processing perspective, and analyze the brain signals by examining the decomposed graph signals $x_L$, $x_M$, and $x_H$ with respect to the underlying brain networks. We note that all the decomposed signals are signals on the same underlying network and therefore can be visualized on the brain.
To facilitate interpretation, we compute the absolute magnitude of the decomposed signal $x_L$ for each brain region, averaged across all sample signals for each individual during a scan session and then averaged across all participants. A similar aggregation is applied for $x_M$ and $x_H$. Figure 6 presents the distribution of the decomposed signals corresponding to different levels of spatial variation on the brain for the first scan session (top row) and the last scan session (bottom row) in the 6 week experiment. Figure 7 exhibits how the decomposed signals are distributed across brain regions in the 3 day experiment. Brain regions with absolute magnitudes lower than a fixed threshold are not colored.

We note the difference between Figure 4 and Figures 6 and 7. In Figure 4, colors on brain regions denote magnitudes of the associated eigenvectors; there we purely examine properties of the brain networks. In Figures 6 and 7, colors on brain regions represent magnitudes of the respective decomposed signals; here our interest is in signals defined on the brain network. The two are related since the networks are constructed using the correlation or coherence of brain activities; however, they are conceptually different.

A deeper analysis yields many interesting aspects of graph frequency decomposition. First, for $x_L$, the magnitudes on adjacent brain regions tend to possess highly similar values, resulting in a more evenly spread brain signal distribution, whereas for $x_H$, neighboring signals can exhibit highly dissimilar values; this corroborates the motivation to use graph frequency decomposition to segment brain signals into pieces corresponding to different levels of spatial fluctuation. Second, decomposed signals of a specific level of variation, e.g., $x_L$, are highly similar across different scan sessions in an experiment as well as across the two experiments with different sets of participants. This reflects the fact that the frequency decomposition is formed by applying graph filters with different pass bands to the signals and therefore should express some consistent aspects of brain signals. Third, recall that we normalize the brain signals at every sample point for all subjects, and for this reason the signals $x_L$, $x_M$ and $x_H$ would be similarly distributed across the brain if nothing interesting happened in the decomposition. However, in both Figures 6 and 7, it is observed that many brain regions possess magnitudes higher than a threshold in $x_L$ and $x_H$, while not many brain regions pass the thresholding with respect to $x_M$. It has long been understood that the brain is a complex system combining some degree of disorganized behavior with some degree of regularity, and that the complexity of a system is high when order and disorder coexist [36], [37]. $x_L$ varies smoothly across the brain network and therefore can be regarded as regularity (order), whereas $x_H$ fluctuates vibrantly and consequently can be considered as randomness (disorder). This evokes the anticipation that graph frequency decomposition segments a brain signal $x$ into pieces $x_L$ and $x_H$, which reflect order and disorder and are therefore more interesting, as well as the remaining piece $x_M$. In the next two sections, we follow this anticipation and demonstrate that $x_L$ and $x_H$ display more adaptability and a more obvious association with learning rate across training.
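For reference, the per-region summaries shown in Figures 6 and 7 amount to averaging the absolute magnitudes of each decomposed signal over time samples and then over participants. The sketch below assumes a single array `signals` of shape (participants, samples, n) of normalized BOLD signals and, for simplicity, a single eigenvector matrix V together with the `frequency_decompose` helper sketched in Section II-B; in the paper each participant has their own network, so V would be participant-specific.

```python
import numpy as np

def mean_abs_magnitude(signals, V, K_L, K_M):
    """Average |x_L|, |x_M|, |x_H| per region over samples and then over participants."""
    acc = {"low": [], "mid": [], "high": []}
    for subject in signals:                       # each subject: shape (samples, n)
        per_subject = {k: [] for k in acc}
        for x in subject:
            x_L, x_M, x_H = frequency_decompose(V, x, K_L, K_M)
            per_subject["low"].append(np.abs(x_L))
            per_subject["mid"].append(np.abs(x_M))
            per_subject["high"].append(np.abs(x_H))
        for k in acc:
            acc[k].append(np.mean(per_subject[k], axis=0))   # average over samples
    return {k: np.mean(acc[k], axis=0) for k in acc}         # average over participants
```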
A. More Active Adaptation of Low and High Graph Frequency Components

We start by analyzing the adaptation of the decomposed signals with respect to different levels of spatial variation. To that end, we evaluate the variance of the decomposed signals over multiple temporal scales, over days and over minutes, for the two experiments [38], [39]. We describe the method specifically for $x_L$ for simplicity; similar computations were conducted for $x_M$ and $x_H$.

Fig. 6. Distribution of decomposed signals for the 6 week experiment. (a) Absolute magnitudes for all brain regions with respect to $x_L$, the brain signals varying smoothly across the network, averaged across all sample points for each individual and across all participants at the first scan session of the 6 week dataset. (b) With respect to $x_M$ and (c) with respect to $x_H$, the signals fluctuating vibrantly across the brain. (d), (e), and (f) are the averaged $x_L$, $x_M$ and $x_H$ at the last scan session of the 6 week dataset, respectively. Only regions with absolute magnitudes higher than a fixed threshold are colored.

Fig. 7. Distribution of decomposed signals for the 3 day experiment. (a), (b), and (c) are $x_L$, $x_M$ and $x_H$ averaged across all sample points for each subject and across participants in the 3 day experiment, respectively. Regions with absolute value less than a threshold are not colored.

At the macro (large) timescale, we average the decomposed signals $x_L$ over all sample points within each scanning session with a given sequence type, and evaluate the variance of the magnitudes of the averaged signals across all scanning sessions and sequence types [40], [41]. For the 6 week experiment, there are 4 scanning sessions and 3 different sequence types, so the variance is with respect to 12 points. For the 3 day experiment, there are 3 scanning sessions and only 1 sequence type, and therefore the variance is over 3 points. As for the micro (minute) scale, we average the decomposed signals $x_L$ over all sample points within each minute, and evaluate the variance of the magnitudes of the averaged signals across all minute windows for each scanning session with a given sequence type. The evaluated variance is then averaged across all participants of the experiment of interest. Figure 8 displays the variance of the decomposed signals $x_L$, $x_M$ and $x_H$ at the two different temporal scales for the two experiments. For the 6 week dataset, 3 session-sequence combinations, numbered proportionally to the level of exposure of participants to the sequence (1-MIN refers to the MIN sequence at session 1, 5 denotes the MIN sequence at session 4, and 9 entails the EXT sequence at session 3), are selected out of the 12 combinations in total for a cleaner illustration, but all the other session-sequence combinations exhibit similar properties. Following the definition of frequency decomposition as in (14), it is expected for the low


More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Collaborative Filtering via Graph Signal Processing

Collaborative Filtering via Graph Signal Processing Collaborative Filtering via Graph Signal Processing Weiyu Huang, Antonio G. Marques, and Alejandro Ribeiro Abstract This paper develops new designs for recommender systems inspired by recent advances in

More information

Introduction to Machine Learning

Introduction to Machine Learning 10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what

More information

Spectral Perturbation of Small-World Networks with Application to Brain Disease Detection

Spectral Perturbation of Small-World Networks with Application to Brain Disease Detection Spectral Perturbation of Small-World Networks with Application to Brain Disease Detection Chenhui Hu May 4, 22 Introduction Many real life systems can be described by complex networks, which usually consist

More information

Spectral Clustering on Handwritten Digits Database

Spectral Clustering on Handwritten Digits Database University of Maryland-College Park Advance Scientific Computing I,II Spectral Clustering on Handwritten Digits Database Author: Danielle Middlebrooks Dmiddle1@math.umd.edu Second year AMSC Student Advisor:

More information

Sound Recognition in Mixtures

Sound Recognition in Mixtures Sound Recognition in Mixtures Juhan Nam, Gautham J. Mysore 2, and Paris Smaragdis 2,3 Center for Computer Research in Music and Acoustics, Stanford University, 2 Advanced Technology Labs, Adobe Systems

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning Christoph Lampert Spring Semester 2015/2016 // Lecture 12 1 / 36 Unsupervised Learning Dimensionality Reduction 2 / 36 Dimensionality Reduction Given: data X = {x 1,..., x

More information

Wavelet Transform And Principal Component Analysis Based Feature Extraction

Wavelet Transform And Principal Component Analysis Based Feature Extraction Wavelet Transform And Principal Component Analysis Based Feature Extraction Keyun Tong June 3, 2010 As the amount of information grows rapidly and widely, feature extraction become an indispensable technique

More information

Stationary Graph Processes: Nonparametric Spectral Estimation

Stationary Graph Processes: Nonparametric Spectral Estimation Stationary Graph Processes: Nonparametric Spectral Estimation Santiago Segarra, Antonio G. Marques, Geert Leus, and Alejandro Ribeiro Dept. of Signal Theory and Communications King Juan Carlos University

More information

Definition A finite Markov chain is a memoryless homogeneous discrete stochastic process with a finite number of states.

Definition A finite Markov chain is a memoryless homogeneous discrete stochastic process with a finite number of states. Chapter 8 Finite Markov Chains A discrete system is characterized by a set V of states and transitions between the states. V is referred to as the state space. We think of the transitions as occurring

More information

Discrete Signal Processing on Graphs: Sampling Theory

Discrete Signal Processing on Graphs: Sampling Theory IEEE TRANS. SIGNAL PROCESS. TO APPEAR. 1 Discrete Signal Processing on Graphs: Sampling Theory Siheng Chen, Rohan Varma, Aliaksei Sandryhaila, Jelena Kovačević arxiv:153.543v [cs.it] 8 Aug 15 Abstract

More information

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works CS68: The Modern Algorithmic Toolbox Lecture #8: How PCA Works Tim Roughgarden & Gregory Valiant April 20, 206 Introduction Last lecture introduced the idea of principal components analysis (PCA). The

More information

GAUSSIAN PROCESS TRANSFORMS

GAUSSIAN PROCESS TRANSFORMS GAUSSIAN PROCESS TRANSFORMS Philip A. Chou Ricardo L. de Queiroz Microsoft Research, Redmond, WA, USA pachou@microsoft.com) Computer Science Department, Universidade de Brasilia, Brasilia, Brazil queiroz@ieee.org)

More information

Principal Components Analysis (PCA)

Principal Components Analysis (PCA) Principal Components Analysis (PCA) Principal Components Analysis (PCA) a technique for finding patterns in data of high dimension Outline:. Eigenvectors and eigenvalues. PCA: a) Getting the data b) Centering

More information

The Generalized Haar-Walsh Transform (GHWT) for Data Analysis on Graphs and Networks

The Generalized Haar-Walsh Transform (GHWT) for Data Analysis on Graphs and Networks The Generalized Haar-Walsh Transform (GHWT) for Data Analysis on Graphs and Networks Jeff Irion & Naoki Saito Department of Mathematics University of California, Davis SIAM Annual Meeting 2014 Chicago,

More information

Properties of Matrices and Operations on Matrices

Properties of Matrices and Operations on Matrices Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,

More information

ORIE 4741: Learning with Big Messy Data. Spectral Graph Theory

ORIE 4741: Learning with Big Messy Data. Spectral Graph Theory ORIE 4741: Learning with Big Messy Data Spectral Graph Theory Mika Sumida Operations Research and Information Engineering Cornell September 15, 2017 1 / 32 Outline Graph Theory Spectral Graph Theory Laplacian

More information

Machine Learning (Spring 2012) Principal Component Analysis

Machine Learning (Spring 2012) Principal Component Analysis 1-71 Machine Learning (Spring 1) Principal Component Analysis Yang Xu This note is partly based on Chapter 1.1 in Chris Bishop s book on PRML and the lecture slides on PCA written by Carlos Guestrin in

More information

Functional Connectivity and Network Methods

Functional Connectivity and Network Methods 18/Sep/2013" Functional Connectivity and Network Methods with functional magnetic resonance imaging" Enrico Glerean (MSc), Brain & Mind Lab, BECS, Aalto University" www.glerean.com @eglerean becs.aalto.fi/bml

More information

Chapter 2. Linear Algebra. rather simple and learning them will eventually allow us to explain the strange results of

Chapter 2. Linear Algebra. rather simple and learning them will eventually allow us to explain the strange results of Chapter 2 Linear Algebra In this chapter, we study the formal structure that provides the background for quantum mechanics. The basic ideas of the mathematical machinery, linear algebra, are rather simple

More information

Spectral Generative Models for Graphs

Spectral Generative Models for Graphs Spectral Generative Models for Graphs David White and Richard C. Wilson Department of Computer Science University of York Heslington, York, UK wilson@cs.york.ac.uk Abstract Generative models are well known

More information

Walks, Springs, and Resistor Networks

Walks, Springs, and Resistor Networks Spectral Graph Theory Lecture 12 Walks, Springs, and Resistor Networks Daniel A. Spielman October 8, 2018 12.1 Overview In this lecture we will see how the analysis of random walks, spring networks, and

More information

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin 1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)

More information

Analysis of Spectral Kernel Design based Semi-supervised Learning

Analysis of Spectral Kernel Design based Semi-supervised Learning Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,

More information

The weighted spectral distribution; A graph metric with applications. Dr. Damien Fay. SRG group, Computer Lab, University of Cambridge.

The weighted spectral distribution; A graph metric with applications. Dr. Damien Fay. SRG group, Computer Lab, University of Cambridge. The weighted spectral distribution; A graph metric with applications. Dr. Damien Fay. SRG group, Computer Lab, University of Cambridge. A graph metric: motivation. Graph theory statistical modelling Data

More information

EEG- Signal Processing

EEG- Signal Processing Fatemeh Hadaeghi EEG- Signal Processing Lecture Notes for BSP, Chapter 5 Master Program Data Engineering 1 5 Introduction The complex patterns of neural activity, both in presence and absence of external

More information

15 Singular Value Decomposition

15 Singular Value Decomposition 15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

INTERPOLATION OF GRAPH SIGNALS USING SHIFT-INVARIANT GRAPH FILTERS

INTERPOLATION OF GRAPH SIGNALS USING SHIFT-INVARIANT GRAPH FILTERS INTERPOLATION OF GRAPH SIGNALS USING SHIFT-INVARIANT GRAPH FILTERS Santiago Segarra, Antonio G. Marques, Geert Leus, Alejandro Ribeiro University of Pennsylvania, Dept. of Electrical and Systems Eng.,

More information

Data Mining and Analysis: Fundamental Concepts and Algorithms

Data Mining and Analysis: Fundamental Concepts and Algorithms : Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA 2 Department of Computer

More information

Chapter Two Elements of Linear Algebra

Chapter Two Elements of Linear Algebra Chapter Two Elements of Linear Algebra Previously, in chapter one, we have considered single first order differential equations involving a single unknown function. In the next chapter we will begin to

More information

This section is an introduction to the basic themes of the course.

This section is an introduction to the basic themes of the course. Chapter 1 Matrices and Graphs 1.1 The Adjacency Matrix This section is an introduction to the basic themes of the course. Definition 1.1.1. A simple undirected graph G = (V, E) consists of a non-empty

More information

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR

More information

Relational Nonlinear FIR Filters. Ronald K. Pearson

Relational Nonlinear FIR Filters. Ronald K. Pearson Relational Nonlinear FIR Filters Ronald K. Pearson Daniel Baugh Institute for Functional Genomics and Computational Biology Thomas Jefferson University Philadelphia, PA Moncef Gabbouj Institute of Signal

More information

A Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay

A Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay A Statistical Look at Spectral Graph Analysis Deep Mukhopadhyay Department of Statistics, Temple University Office: Speakman 335 deep@temple.edu http://sites.temple.edu/deepstat/ Graph Signal Processing

More information

Superiorized Inversion of the Radon Transform

Superiorized Inversion of the Radon Transform Superiorized Inversion of the Radon Transform Gabor T. Herman Graduate Center, City University of New York March 28, 2017 The Radon Transform in 2D For a function f of two real variables, a real number

More information

EL-GY 6813/BE-GY 6203 Medical Imaging, Fall 2016 Final Exam

EL-GY 6813/BE-GY 6203 Medical Imaging, Fall 2016 Final Exam EL-GY 6813/BE-GY 6203 Medical Imaging, Fall 2016 Final Exam (closed book, 1 sheets of notes double sided allowed, no calculator or other electronic devices allowed) 1. Ultrasound Physics (15 pt) A) (9

More information

Recipes for the Linear Analysis of EEG and applications

Recipes for the Linear Analysis of EEG and applications Recipes for the Linear Analysis of EEG and applications Paul Sajda Department of Biomedical Engineering Columbia University Can we read the brain non-invasively and in real-time? decoder 1001110 if YES

More information

PCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani

PCA & ICA. CE-717: Machine Learning Sharif University of Technology Spring Soleymani PCA & ICA CE-717: Machine Learning Sharif University of Technology Spring 2015 Soleymani Dimensionality Reduction: Feature Selection vs. Feature Extraction Feature selection Select a subset of a given

More information

Lecture Notes 5: Multiresolution Analysis

Lecture Notes 5: Multiresolution Analysis Optimization-based data analysis Fall 2017 Lecture Notes 5: Multiresolution Analysis 1 Frames A frame is a generalization of an orthonormal basis. The inner products between the vectors in a frame and

More information

Spectral Clustering. Zitao Liu

Spectral Clustering. Zitao Liu Spectral Clustering Zitao Liu Agenda Brief Clustering Review Similarity Graph Graph Laplacian Spectral Clustering Algorithm Graph Cut Point of View Random Walk Point of View Perturbation Theory Point of

More information

Dimensionality Reduction: PCA. Nicholas Ruozzi University of Texas at Dallas

Dimensionality Reduction: PCA. Nicholas Ruozzi University of Texas at Dallas Dimensionality Reduction: PCA Nicholas Ruozzi University of Texas at Dallas Eigenvalues λ is an eigenvalue of a matrix A R n n if the linear system Ax = λx has at least one non-zero solution If Ax = λx

More information

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x = Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.

More information

Spectral Methods for Subgraph Detection

Spectral Methods for Subgraph Detection Spectral Methods for Subgraph Detection Nadya T. Bliss & Benjamin A. Miller Embedded and High Performance Computing Patrick J. Wolfe Statistics and Information Laboratory Harvard University 12 July 2010

More information

The Principal Component Analysis

The Principal Component Analysis The Principal Component Analysis Philippe B. Laval KSU Fall 2017 Philippe B. Laval (KSU) PCA Fall 2017 1 / 27 Introduction Every 80 minutes, the two Landsat satellites go around the world, recording images

More information

Robustness of Principal Components

Robustness of Principal Components PCA for Clustering An objective of principal components analysis is to identify linear combinations of the original variables that are useful in accounting for the variation in those original variables.

More information

Statistical Analysis Aspects of Resting State Functional Connectivity

Statistical Analysis Aspects of Resting State Functional Connectivity Statistical Analysis Aspects of Resting State Functional Connectivity Biswal s result (1995) Correlations between RS Fluctuations of left and right motor areas Why studying resting state? Human Brain =

More information

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395 Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection

More information

Dominant Feature Vectors Based Audio Similarity Measure

Dominant Feature Vectors Based Audio Similarity Measure Dominant Feature Vectors Based Audio Similarity Measure Jing Gu 1, Lie Lu 2, Rui Cai 3, Hong-Jiang Zhang 2, and Jian Yang 1 1 Dept. of Electronic Engineering, Tsinghua Univ., Beijing, 100084, China 2 Microsoft

More information

Deep Learning Book Notes Chapter 2: Linear Algebra

Deep Learning Book Notes Chapter 2: Linear Algebra Deep Learning Book Notes Chapter 2: Linear Algebra Compiled By: Abhinaba Bala, Dakshit Agrawal, Mohit Jain Section 2.1: Scalars, Vectors, Matrices and Tensors Scalar Single Number Lowercase names in italic

More information

Statistical Learning Notes III-Section 2.4.3

Statistical Learning Notes III-Section 2.4.3 Statistical Learning Notes III-Section 2.4.3 "Graphical" Spectral Features Stephen Vardeman Analytics Iowa LLC January 2019 Stephen Vardeman (Analytics Iowa LLC) Statistical Learning Notes III-Section

More information

Learning Graphs from Data: A Signal Representation Perspective

Learning Graphs from Data: A Signal Representation Perspective 1 Learning Graphs from Data: A Signal Representation Perspective Xiaowen Dong*, Dorina Thanou*, Michael Rabbat, and Pascal Frossard arxiv:1806.00848v1 [cs.lg] 3 Jun 2018 The construction of a meaningful

More information

Neuroscience Introduction

Neuroscience Introduction Neuroscience Introduction The brain As humans, we can identify galaxies light years away, we can study particles smaller than an atom. But we still haven t unlocked the mystery of the three pounds of matter

More information

The homogeneous Poisson process

The homogeneous Poisson process The homogeneous Poisson process during very short time interval Δt there is a fixed probability of an event (spike) occurring independent of what happened previously if r is the rate of the Poisson process,

More information

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti

More information

A Graph Signal Processing Perspective on Functional Brain Imaging

A Graph Signal Processing Perspective on Functional Brain Imaging A Graph Signal Processing Perspective on Functional Brain Imaging This article addresses how the signal processing view on brain graphs can provide additional insights into brain network analysis. By Weiyu

More information

Spectral Processing. Misha Kazhdan

Spectral Processing. Misha Kazhdan Spectral Processing Misha Kazhdan [Taubin, 1995] A Signal Processing Approach to Fair Surface Design [Desbrun, et al., 1999] Implicit Fairing of Arbitrary Meshes [Vallet and Levy, 2008] Spectral Geometry

More information

Markov Chains and Spectral Clustering

Markov Chains and Spectral Clustering Markov Chains and Spectral Clustering Ning Liu 1,2 and William J. Stewart 1,3 1 Department of Computer Science North Carolina State University, Raleigh, NC 27695-8206, USA. 2 nliu@ncsu.edu, 3 billy@ncsu.edu

More information

MULTICHANNEL SIGNAL PROCESSING USING SPATIAL RANK COVARIANCE MATRICES

MULTICHANNEL SIGNAL PROCESSING USING SPATIAL RANK COVARIANCE MATRICES MULTICHANNEL SIGNAL PROCESSING USING SPATIAL RANK COVARIANCE MATRICES S. Visuri 1 H. Oja V. Koivunen 1 1 Signal Processing Lab. Dept. of Statistics Tampere Univ. of Technology University of Jyväskylä P.O.

More information

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26 Principal Component Analysis Brett Bernstein CDS at NYU April 25, 2017 Brett Bernstein (CDS at NYU) Lecture 13 April 25, 2017 1 / 26 Initial Question Intro Question Question Let S R n n be symmetric. 1

More information

Designing Information Devices and Systems II Fall 2018 Elad Alon and Miki Lustig Homework 9

Designing Information Devices and Systems II Fall 2018 Elad Alon and Miki Lustig Homework 9 EECS 16B Designing Information Devices and Systems II Fall 18 Elad Alon and Miki Lustig Homework 9 This homework is due Wednesday, October 31, 18, at 11:59pm. Self grades are due Monday, November 5, 18,

More information

Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices

Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices Applications of Randomized Methods for Decomposing and Simulating from Large Covariance Matrices Vahid Dehdari and Clayton V. Deutsch Geostatistical modeling involves many variables and many locations.

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices)

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Chapter 14 SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Today we continue the topic of low-dimensional approximation to datasets and matrices. Last time we saw the singular

More information

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 1 Introduction to Linear Algebra

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 1 Introduction to Linear Algebra SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 1 Introduction to 1.1. Introduction Linear algebra is a specific branch of mathematics dealing with the study of vectors, vector spaces with functions that

More information

Effective Resistance and Schur Complements

Effective Resistance and Schur Complements Spectral Graph Theory Lecture 8 Effective Resistance and Schur Complements Daniel A Spielman September 28, 2015 Disclaimer These notes are not necessarily an accurate representation of what happened in

More information

A Riemannian Framework for Denoising Diffusion Tensor Images

A Riemannian Framework for Denoising Diffusion Tensor Images A Riemannian Framework for Denoising Diffusion Tensor Images Manasi Datar No Institute Given Abstract. Diffusion Tensor Imaging (DTI) is a relatively new imaging modality that has been extensively used

More information

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis. Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar

More information

A Random Dot Product Model for Weighted Networks arxiv: v1 [stat.ap] 8 Nov 2016

A Random Dot Product Model for Weighted Networks arxiv: v1 [stat.ap] 8 Nov 2016 A Random Dot Product Model for Weighted Networks arxiv:1611.02530v1 [stat.ap] 8 Nov 2016 Daryl R. DeFord 1 Daniel N. Rockmore 1,2,3 1 Department of Mathematics, Dartmouth College, Hanover, NH, USA 03755

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis Laurenz Wiskott Institute for Theoretical Biology Humboldt-University Berlin Invalidenstraße 43 D-10115 Berlin, Germany 11 March 2004 1 Intuition Problem Statement Experimental

More information

Designing Information Devices and Systems II Fall 2015 Note 5

Designing Information Devices and Systems II Fall 2015 Note 5 EE 16B Designing Information Devices and Systems II Fall 01 Note Lecture given by Babak Ayazifar (9/10) Notes by: Ankit Mathur Spectral Leakage Example Compute the length-8 and length-6 DFT for the following

More information

4 Bias-Variance for Ridge Regression (24 points)

4 Bias-Variance for Ridge Regression (24 points) Implement Ridge Regression with λ = 0.00001. Plot the Squared Euclidean test error for the following values of k (the dimensions you reduce to): k = {0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500,

More information

Signal and Information Processing. Alejandro Ribeiro

Signal and Information Processing. Alejandro Ribeiro Signal and Information Processing Alejandro Ribeiro April 26, 16 Contents 1 Principal Component Analysis 3 11 The DFT and idft as Hermitian Matrices 3 111 Properties of the DFT Matrix 4 112 The Product

More information

Pre-sessional Mathematics for Big Data MSc Class 2: Linear Algebra

Pre-sessional Mathematics for Big Data MSc Class 2: Linear Algebra Pre-sessional Mathematics for Big Data MSc Class 2: Linear Algebra Yuri Kalnishkan September 22, 2018 Linear algebra is vitally important for applied mathematics We will approach linear algebra from a

More information

Statistical Analysis of fmrl Data

Statistical Analysis of fmrl Data Statistical Analysis of fmrl Data F. Gregory Ashby The MIT Press Cambridge, Massachusetts London, England Preface xi Acronyms xv 1 Introduction 1 What Is fmri? 2 The Scanning Session 4 Experimental Design

More information

4 Introduction to modeling longitudinal data

4 Introduction to modeling longitudinal data 4 Introduction to modeling longitudinal data We are now in a position to introduce a basic statistical model for longitudinal data. The models and methods we discuss in subsequent chapters may be viewed

More information

GI07/COMPM012: Mathematical Programming and Research Methods (Part 2) 2. Least Squares and Principal Components Analysis. Massimiliano Pontil

GI07/COMPM012: Mathematical Programming and Research Methods (Part 2) 2. Least Squares and Principal Components Analysis. Massimiliano Pontil GI07/COMPM012: Mathematical Programming and Research Methods (Part 2) 2. Least Squares and Principal Components Analysis Massimiliano Pontil 1 Today s plan SVD and principal component analysis (PCA) Connection

More information

Spectral Graph Theory Lecture 2. The Laplacian. Daniel A. Spielman September 4, x T M x. ψ i = arg min

Spectral Graph Theory Lecture 2. The Laplacian. Daniel A. Spielman September 4, x T M x. ψ i = arg min Spectral Graph Theory Lecture 2 The Laplacian Daniel A. Spielman September 4, 2015 Disclaimer These notes are not necessarily an accurate representation of what happened in class. The notes written before

More information

A Dimensionality Reduction Framework for Detection of Multiscale Structure in Heterogeneous Networks

A Dimensionality Reduction Framework for Detection of Multiscale Structure in Heterogeneous Networks Shen HW, Cheng XQ, Wang YZ et al. A dimensionality reduction framework for detection of multiscale structure in heterogeneous networks. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 27(2): 341 357 Mar. 2012.

More information

B553 Lecture 5: Matrix Algebra Review

B553 Lecture 5: Matrix Algebra Review B553 Lecture 5: Matrix Algebra Review Kris Hauser January 19, 2012 We have seen in prior lectures how vectors represent points in R n and gradients of functions. Matrices represent linear transformations

More information

Stochastic Structural Dynamics Prof. Dr. C. S. Manohar Department of Civil Engineering Indian Institute of Science, Bangalore

Stochastic Structural Dynamics Prof. Dr. C. S. Manohar Department of Civil Engineering Indian Institute of Science, Bangalore Stochastic Structural Dynamics Prof. Dr. C. S. Manohar Department of Civil Engineering Indian Institute of Science, Bangalore Lecture No. # 33 Probabilistic methods in earthquake engineering-2 So, we have

More information

20 Unsupervised Learning and Principal Components Analysis (PCA)

20 Unsupervised Learning and Principal Components Analysis (PCA) 116 Jonathan Richard Shewchuk 20 Unsupervised Learning and Principal Components Analysis (PCA) UNSUPERVISED LEARNING We have sample points, but no labels! No classes, no y-values, nothing to predict. Goal:

More information

1 Adjacency matrix and eigenvalues

1 Adjacency matrix and eigenvalues CSC 5170: Theory of Computational Complexity Lecture 7 The Chinese University of Hong Kong 1 March 2010 Our objective of study today is the random walk algorithm for deciding if two vertices in an undirected

More information

A Tutorial on Data Reduction. Principal Component Analysis Theoretical Discussion. By Shireen Elhabian and Aly Farag

A Tutorial on Data Reduction. Principal Component Analysis Theoretical Discussion. By Shireen Elhabian and Aly Farag A Tutorial on Data Reduction Principal Component Analysis Theoretical Discussion By Shireen Elhabian and Aly Farag University of Louisville, CVIP Lab November 2008 PCA PCA is A backbone of modern data

More information

Outlines: (June 11, 1996) Instructor:

Outlines: (June 11, 1996) Instructor: Magnetic Resonance Imaging (June 11, 1996) Instructor: Tai-huang Huang Institute of Biomedical Sciences Academia Sinica Tel. (02) 2652-3036; Fax. (02) 2788-7641 E. mail: bmthh@ibms.sinica.edu.tw Reference:

More information

Principal Component Analysis

Principal Component Analysis Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used

More information

Principal Components Analysis

Principal Components Analysis Principal Components Analysis Santiago Paternain, Aryan Mokhtari and Alejandro Ribeiro March 29, 2018 At this point we have already seen how the Discrete Fourier Transform and the Discrete Cosine Transform

More information

Machine Learning - MT & 14. PCA and MDS

Machine Learning - MT & 14. PCA and MDS Machine Learning - MT 2016 13 & 14. PCA and MDS Varun Kanade University of Oxford November 21 & 23, 2016 Announcements Sheet 4 due this Friday by noon Practical 3 this week (continue next week if necessary)

More information