Bayesian inference of structural brain networks

Max Hinne a,b, Tom Heskes a, Christian F. Beckmann b, Marcel A. J. van Gerven b

a Radboud University Nijmegen, Institute for Computing and Information Sciences, Nijmegen, The Netherlands
b Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands

Corresponding author: Max Hinne, Radboud University Nijmegen, Faculty of Science, Institute for Computing and Information Sciences, Postbus 9010, 6500 GL Nijmegen, The Netherlands. E-mail: mhinne@cs.ru.nl.

Abstract

Structural brain networks are used to model white-matter connectivity between spatially segregated brain regions. The presence, location and orientation of these white-matter tracts can be derived using diffusion-weighted magnetic resonance imaging in combination with probabilistic tractography. Unfortunately, as of yet, none of the existing approaches provide an undisputed way of inferring brain networks from the streamline distributions which tractography produces. State-of-the-art methods rely on an arbitrary threshold or, alternatively, yield weighted results that are difficult to interpret. In this paper, we provide a generative model that explicitly describes how structural brain networks lead to observed streamline distributions. This allows us to draw principled conclusions about brain networks, which we validate using simultaneously acquired resting-state functional MRI data. Inference may be further informed by a prior which combines connectivity estimates from multiple subjects. Based on this prior, we obtain networks that significantly improve on the conventional approach.

Keywords: structural connectivity, probabilistic tractography, hierarchical Bayesian model

1. Introduction

Human behavior ultimately arises through the interactions between multiple brain regions that together form networks, which can be characterized in terms of structural, functional and effective connectivity (Penny et al., 2006). Structural connectivity presupposes the existence of white-matter tracts that connect spatially segregated brain regions and thereby constrain the functional and effective connectivity between these regions. Hence, structural connectivity provides the scaffolding that is required to shape neuronal dynamics. Changes in structural brain networks have been related to various neurological disorders. For this reason, optimal inference of structural brain networks is of major importance in clinical neuroscience (Catani, 2007).

Inference of these networks entails two steps. The first is the estimation of the white-matter tracts. The second consists of obtaining the network that captures which regions are connected, based on the previously identified fibre tracts. In this paper, we focus on the latter step. For the first step, we use diffusion-weighted imaging (DWI), which is a prominent way to estimate structural connectivity of whole-brain networks in vivo. It is a variant of magnetic resonance imaging (MRI) which measures the restricted diffusion of water molecules, thereby providing an indirect measure of the presence and orientation of white-matter tracts. By following the principal diffusion direction in individual voxels, streamlines can be drawn that represent the structure of fibre bundles, connecting separate regions of grey matter. This process is known as deterministic tractography (Conturo et al., 1999; Chung et al., 2010; Shu et al., 2011).
Alternatively, fibres may be estimated using probabilistic tractography (Behrens et al., 2003, 2007; Friman et al., 2006; Jbabdi et al., 2007). This comprises a model of the principal diffusion direction that is then used to sample distributions of streamlines. Ultimately, the procedure results in a measure of uncertainty about where a hypothesized connection will terminate. A benefit of the probabilistic approach is that it explicitly takes uncertainty in the streamlining process into account.

Apart from studies focusing on particular tracts, much research has been devoted to the derivation of macroscopic connectivity properties, that is, whole-brain structural connectivity. Several approaches have been suggested to extract whole-brain networks from probabilistic tractography results (Robinson et al., 2008; Hagmann et al., 2007; Gong et al., 2009). Unfortunately, inference of whole-brain networks from probabilistic tractography estimates remains somewhat ad hoc. Typically, the underlying brain network is derived by thresholding the streamline distribution such that counts above or below threshold are taken to reflect the presence or absence of tracts, respectively. This approach is easy to implement, but it has a number of issues. First, the threshold is arbitrarily chosen to have a particular value. In a substantial part of the literature, the threshold that is used to transform the streamline distribution into a network is actually set to zero (Hagmann et al., 2007, 2008; Zalesky et al., 2010; Vaessen et al., 2010; Chung et al., 2011). However, probabilistic streamlining depends on the arbitrary number of samples that are drawn per voxel. This implies that, as more samples are drawn, more brain regions are likely to eventually become connected given a threshold at zero. Alternatively, the number of streamlines can be interpreted as a connection weight (Bassett et al., 2011; Zalesky et al., 2010; Robinson et al., 2010), or a relative threshold can be applied (Kaden et al., 2007). This way, the relative differences between connections remain respected. Unfortunately, the connection weights do not have a straightforward (probabilistic) interpretation. Simply normalizing these weights does not yield a true notion of connection probability. At most, it can be regarded as the conditional probability that a streamline ends in a particular voxel given the starting point of the streamline. In the case of a streamline distribution with, say, half of the streamlines starting at A ending in B and the other half ending in C, normalized streamline counts cannot distinguish between one edge with an uncertain end point and two edges with definite end points. Finally, several graph-theoretical measures, such as characteristic path length and clustering coefficient, are ill-defined for non-binary networks.

In general, it is problematic to use thresholding since it ignores the relative differences between streamline counts. Intuitively, one would expect that if, say, ninety percent of the streamlines starting at voxel A connect to voxel B, and ten percent connect voxel A to voxel C, then at the least the former has a higher probability of having a corresponding edge in the network than the latter, although both edges are possible as well. This is related to the burstiness phenomenon of words in document retrieval, where the occurrence of a rare word in a document makes its repeated occurrence more likely (Xu and Akella, 2010). Summarizing, the issue with thresholding approaches is that they consider each tract in isolation. This ignores the information that can be gained from the possible symmetry in streamline counts, as well as from the relative differences within a streamline distribution.

Another important observation is that the mentioned approaches do not easily support the integration of probabilistic streamlining data with other sources of information. Data is often not collected in isolation but rather acquired for multiple subjects, potentially using a multitude of imaging techniques. Multi-modal data fusion is needed in order to provide a coherent picture of brain function (Horwitz and Poeppel, 2002; Groves et al., 2011). The integration of multi-subject data is required for group-level inference, where the interest is in estimating a network that characterizes a particular population, for example, when comparing patients with controls in a clinical setting (Simpson et al., 2011).

In the following, we provide a Bayesian framework for the inference of whole-brain networks from streamline distributions. In our approach, we consider the distribution of (binary) networks that are supported by our data, instead of generating a single network based on an arbitrary threshold. Our approach relies on defining a generative model for whole-brain networks, which extends recent work on network inference in systems biology (Mukherjee and Speed, 2008) and consists of two ingredients. First, a network prior is defined in terms of the classical Erdős-Rényi model (Erdős and Rényi, 1960).
This prior is later extended to handle multi-subject data, capturing the notion that different subjects' brains tend to be similar. Second, we propose a forward model based on a Dirichlet compound multinomial distribution, which views the streamline distributions produced by probabilistic tractography as noisy data, thus completing the generative model. In order to validate our Bayesian framework, we make use of the often-reported observation that resting-state functional connectivity reflects structural connectivity (Koch et al., 2002; Greicius et al., 2009; Honey et al., 2009; Lv et al., 2010; Skudlarski et al., 2008; Park et al., 2008; Damoiseaux and Greicius, 2009). We show that structural networks derived from our generative model, informed by the connectivity of other subjects, provide a better fit to the (in)dependencies in resting-state functional MRI (rs-fMRI) data than the standard thresholding approach.

2. Material and methods

2.1. Data acquisition

Twenty healthy volunteers were scanned after giving informed written consent in accordance with the guidelines of the local ethics committee. A T1 structural scan, resting-state functional data and diffusion-weighted images were obtained using a Siemens Magnetom Trio 3T system at the Donders Centre for Cognitive Neuroimaging, Radboud University Nijmegen, The Netherlands. The rs-fMRI data were acquired using a multi-echo echo planar imaging (ME-EPI) sequence (voxel size 3.5 mm isotropic, matrix size 64 x 64, TEs = 6.9, 16.2, 25, 35 and 45 ms, 39 slices, GRAPPA factor 3, 6/8 partial Fourier). A total of 1030 volumes were obtained. An optimized acquisition order described by Cook et al. (2006) was used in the DWI protocol (voxel size 2.0 mm isotropic, TE = 101 ms, 70 slices, 256 directions at b = 1500 s/mm2 and 24 directions at b = 0).

2.2. Preprocessing of resting-state data

The multi-echo images obtained using the rs-fMRI acquisition protocol were combined using a custom Matlab script (MATLAB 7.7, The MathWorks Inc., Natick, MA, USA), which implements the procedure described by Poser et al. (2006) and also incorporates motion correction using functions from the SPM5 software package (Wellcome Department of Imaging Neuroscience, University College London, UK). Of the 1030 combined volumes, the first six were discarded to allow the system to reach a steady state.

Figure 1: a) Covariance matrices of the resting-state data for three randomly selected subjects. b) Axial view of RGB-FA maps of the diffusion-weighted images, again for three randomly selected subjects. The regions in the matrices are shown in the order in which they appear in the AAL atlas.

Tools from the Oxford FMRIB Software Library (FSL, FMRIB, Oxford, UK) were used for further processing. Brain extraction was performed using FSL BET (Smith, 2002). For each subject, probabilistic brain tissue maps were obtained using FSL FAST (Zhang et al., 2001). A zero-lag 6th-order Butterworth bandpass filter was applied to the functional data to retain only frequencies between 0.01 and 0.08 Hz. After preprocessing, the fMRI data were parcellated according to the Automated Anatomical Labeling (AAL) atlas (Tzourio-Mazoyer et al., 2002). Regions without voxels with gray-matter probability above 0.5 were discarded; the resulting region count was nearly constant across subjects (standard deviation 0.1). For these regions the functional data were summed and then standardized to have zero mean and unit standard deviation. The resulting data were used to compute the empirical covariance matrix $\hat{\Sigma}$. Example covariance matrices are shown in Fig. 1a.
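The filtering and covariance computation can be summarized in a few lines. The sketch below is a minimal illustration, not the authors' Matlab pipeline; the repetition time (2.0 s) and the array `ts` of region-averaged time series are hypothetical stand-ins, and `filtfilt` provides the zero-lag application of the Butterworth filter.

```python
# Minimal sketch of the band-pass + standardize + covariance step, under the
# assumptions stated above (region-averaged time series, hypothetical TR).
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_covariance(ts, tr=2.0, low=0.01, high=0.08):
    """ts: (T, K) array of T time points for K parcellated regions."""
    fs = 1.0 / tr                                   # sampling frequency in Hz
    b, a = butter(6, [low, high], btype='bandpass', fs=fs)
    filtered = filtfilt(b, a, ts, axis=0)           # filtfilt gives zero lag
    z = (filtered - filtered.mean(0)) / filtered.std(0)  # standardize regions
    return (z.T @ z) / len(z)                       # empirical covariance
```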
2.3. Preprocessing of diffusion imaging data

The preprocessing steps for the diffusion data were conducted using FSL FDT (Behrens et al., 2003) and consisted of correction for eddy currents and estimation of the diffusion parameters. Raw color-coded fractional anisotropy maps are shown in Fig. 1b. To obtain a measure of white-matter connectivity, we used FDT Probtrackx 2.0 (Behrens et al., 2003, 2007). As seed voxels for tractography we used those voxels that lie on the boundary between white matter and gray matter. For each of these voxels 5000 streamlines were drawn, with a fixed maximum number of steps per streamline. The streamlines were restricted by the fractional anisotropy to prevent them from wandering around in gray matter. Streamlines in which a sharp angle occurred, or that had a length of less than 2 mm, were discarded. The output thus obtained is a matrix N with $n_{ij}$ the number of streamlines drawn from voxel $i$ to voxel $j$. To transform this into the parcellated scheme dictated by the AAL atlas, the streamlines were summed over all voxels per region, resulting in an aggregated connectivity matrix that ranges over regions instead of voxels. Regions that had been removed after preprocessing the fMRI data were removed from the aggregated connectivity matrix as well.

2.4. Framework for structural connectivity estimation

In this section we derive our Bayesian approach to the inference of whole-brain structural networks. The quantity of interest in our framework is the posterior over structural networks, represented by the adjacency matrix $A$, given observed probabilistic streamlining data $N$ and hyperparameters $\xi$. An element $a_{ij} \in \{0, 1\}$ represents the absence or presence of an edge between brain regions $i$ and $j$. $A$ is taken to be a simple graph, such that $a_{ij} = a_{ji}$ and $a_{ii} = 0$. A brain region can either be interpreted as a voxel or as an aggregation of voxels as defined by a gray-matter parcellation. The posterior expresses our knowledge of structural connectivity given the data and background knowledge and is given by

$$P(A \mid N, \xi) \propto P(N \mid A, a^+, a^-)\, P(A \mid p) \quad (1)$$

with hyperparameters $\xi = (a^+, a^-, p)$, for which an interpretation will be given later on. In the following, for convenience, we will sometimes suppress the dependence on the hyperparameters.

To infer the posterior distribution, we must specify a prior $P(A)$ and a forward model $P(N \mid A)$, which together define a generative model of probabilistic streamlining data. Given these components, the posterior can be approximated using a Markov chain Monte Carlo algorithm, as described in detail in Section 2.5. We now proceed to formally define the components of the generative model, as shown in Fig. 2.

Figure 2: The generative model that describes how the observed streamline distribution N depends on the (hidden) connectivity probabilities X. These in turn depend on the hyperparameters $a^+$ and $a^-$ as well as the connectivity A, which is determined by hyperparameter p of the prior.

2.4.1. Forward model

We begin with a specification of the forward model $P(N \mid A)$. Here, we describe how the observed streamline distributions $N$ depend on the underlying network $A$ through latent streamline probabilities $X$. Assume there are $K$ brain regions for which we want to estimate the structural connectivity. We start by considering one region $i$ and the possible targets in which a postulated tract may terminate. Let $n_{ik}$ denote the number of streamlines which start in region $i$ and terminate in region $k$. We assume that $n_{ii} = 0$. Probabilistic tractography produces a distribution over target vertices $\mathbf{n}_i = (n_{i1}, \ldots, n_{iK})^\top$ by drawing $S$ streamlines, $N_i = \sum_{k=1}^K n_{ik} \leq S$ of them ending up in a target region (streamlines may also end in voxels outside any region of the parcellation, hence the inequality). A particular distribution $\mathbf{n}_i$ depends on the streamline probabilities. That is, we expect many streamlines between two regions when there is a high streamline probability and vice versa. This is captured by expressing the probability of a distribution $\mathbf{n}_i$ in terms of a multinomial distribution

$$P(\mathbf{n}_i \mid \mathbf{x}_i) \propto \prod_{j=1}^K x_{ij}^{n_{ij}},$$

in which $\mathbf{x}_i = (x_{i1}, \ldots, x_{iK})$ is a probability vector with $\sum_j x_{ij} = 1$. Each $x_{ij}$ represents the probability of drawing a streamline from region $i$ to region $j$. This streamlining probability itself depends on whether or not there actually exists a physical tract between region $i$ and region $j$. Let $\mathbf{a}_i$ denote the $i$-th row of $A$, indicating the connectivity between region $i$ and all other regions. Intuitively, we expect a high streamline probability when there is an edge in the network. Conversely, we expect a low probability when two regions are disconnected. Thus, the streamline probabilities depend on the actual white-matter connectivity as modeled by $A$. This is captured by modeling the distribution of streamline probabilities using a Dirichlet distribution

$$P(\mathbf{x}_i \mid \mathbf{a}_i, a^+, a^-) \propto \prod_{j=1}^K x_{ij}^{b_{ij} - 1},$$

where the shorthand notation $b_{ij} \equiv a_{ij} a^+ + (1 - a_{ij}) a^-$ is used. The $b_{ij}$ can be interpreted as the parameters that determine the probability of streamlining from region $i$ to region $j$ when an edge $a_{ij}$ is either present ($a^+$) or absent ($a^-$).

To obtain a single expression for the likelihood of an adjacency matrix, let $N = (\mathbf{n}_1; \ldots; \mathbf{n}_K)$ represent the combined probabilistic tractography data, i.e. for each of the $K$ regions a distribution of streamlines to all other regions. Similarly, let $X = (\mathbf{x}_1; \ldots; \mathbf{x}_K)$ denote the combined hidden connection probabilities and $A = (\mathbf{a}_1; \ldots; \mathbf{a}_K)$ the adjacency matrix for all brain regions. The likelihood of the network $A$ is expressed as

$$P(N \mid A, a^+, a^-) = \int P(N \mid X)\, P(X \mid A, a^+, a^-)\, dX. \quad (2)$$

By recognizing that the Dirichlet distribution is the conjugate prior of the multinomial distribution, it follows that Eq. (2) is a product of Dirichlet compound multinomial (DCM) distributions (Madsen et al., 2005; Xu and Akella, 2010; Minka, 2000). The DCM distribution assumes that, given a network, a probability vector can be drawn with large values where the network has edges and small values where the network is disconnected. This probability vector, in turn, can be used to sample from a multinomial distribution that reflects the probabilistic tractography outcome. For sufficiently small choices of the hyperparameters of the DCM, sampling from this multinomial reflects the burstiness behavior we observe in the streamline distributions, where some pairs of regions are connected by many streamlines, while most pairs have few or even zero streamlines.

2.4.2. Network prior

In order to define a prior on adjacency matrices, we adopt the Erdős-Rényi model, which states that the probability of an edge between regions $i$ and $j$ is given by parameter $p$ (Erdős and Rényi, 1960). This allows the prior to be expressed in terms of a product of binomial distributions:

$$P(A \mid p) = \prod_{i<j} p^{a_{ij}} (1 - p)^{1 - a_{ij}}.$$

Recall that $a_{ji} \equiv a_{ij}$ by definition, such that choosing $p = 0.5$ gives rise to a flat prior on simple graphs. A short simulation sketch of this generative process is given below.
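The following minimal sketch samples from the prior and the DCM forward model. The hyperparameter values and function names are illustrative assumptions, not the authors' implementation; it only makes the two-stage Dirichlet-multinomial construction concrete.

```python
# Minimal sketch of the generative model (Erdos-Renyi prior + DCM forward
# model), assuming illustrative hyperparameter values.
import numpy as np

rng = np.random.default_rng(0)

def sample_network(K, p):
    """Erdos-Renyi prior: simple graph with edge probability p."""
    A = np.triu(rng.random((K, K)) < p, 1)
    return (A | A.T).astype(int)               # symmetric, zero diagonal

def sample_streamlines(A, S=5000, a_plus=1.0, a_minus=0.1):
    """DCM forward model: Dirichlet streamline probabilities, then counts."""
    K = len(A)
    B = np.where(A == 1, a_plus, a_minus)      # b_ij = a_ij*a+ + (1-a_ij)*a-
    np.fill_diagonal(B, 1e-10)                 # effectively forbid i -> i
    N = np.zeros((K, K), dtype=int)
    for i in range(K):
        x = rng.dirichlet(B[i].astype(float))  # latent probabilities x_i
        N[i] = rng.multinomial(S, x)           # streamline counts n_i
    np.fill_diagonal(N, 0)
    return N

A = sample_network(K=10, p=0.5)
N = sample_streamlines(A)   # bursty: many streamlines on a few pairs,
                            # near-zero counts elsewhere
```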
2.4.3. Hierarchical model

So far, we assumed that data for each subject are analyzed independently. However, in practice, data for multiple subjects may be available, and data for one subject might inform the inference for another subject. The intuition is that brain connectivity will, to a certain extent, be similar across subjects. Therefore, borrowing statistical strength from other subjects should lower the susceptibility to noise and artifacts in a single subject. This can be achieved by formulating a hierarchical model, where subject-dependent parameters at the first level are tied by subject-independent parameters at the second level. Figure 3 depicts this hierarchical model.

Suppose streamline data $N = (N^{(1)}, \ldots, N^{(M)})$ are acquired for $M$ subjects. Let $A = (A^{(1)}, \ldots, A^{(M)})$ denote a vector whose element $A^{(m)}$ refers to the connectivity matrix for subject $m$. In the hierarchical model, we assume that the different subjects are related through a parent connectivity $\bar{A}$. The different $A^{(m)}$ are conditionally independent given $\bar{A}$.

Figure 3: The hierarchical model describes how the connectivity for a subject depends on its streamline distribution, but also on the connectivity in other subjects, as mediated through the parent network $\bar{A}$.

For a new subject $M+1$, the quantity of interest is the posterior marginal

$$P\big(A^{(M+1)} \mid N, N^{(M+1)}, \xi\big) \propto P\big(N^{(M+1)} \mid A^{(M+1)}, a^+, a^-\big)\, P\big(A^{(M+1)} \mid N, \xi\big).$$

We could approximate this marginal by sampling from the hierarchical model. However, this is a computationally demanding task, as it requires the simultaneous estimation of all of the adjacency matrices belonging to each of the subjects, as well as the parent network $\bar{A}$. Instead, we specify a prior based on the connectivity obtained for other subjects. This improvement over the Erdős-Rényi model defines a separate connection probability for each individual edge, instead of using a single parameter $p$ to specify the connection probability for complete networks. This multi-subject prior is derived from the hierarchical model in Appendix A and is equal to

$$P\big(A^{(M+1)} \mid N, \xi\big) = \prod_{i<j} p_{ij}^{a^{(M+1)}_{ij}} \big(1 - p_{ij}\big)^{1 - a^{(M+1)}_{ij}}, \quad (3)$$

where $p_{ij} \equiv \big(\sum_{m=1}^M \hat{a}^{(m)}_{ij} + 1\big) / (M + 2)$, with $\hat{a}^{(m)}_{ij}$ the maximum likelihood (ML) estimate for subject $m$. Hence, we derive a prior for subject $M+1$ from the ML estimates for subjects $1, \ldots, M$. These estimates can be obtained by running the single-subject models together with a flat prior. The multi-subject prior can subsequently be plugged into Eq. (1) to produce the posterior for subject $M+1$.

2.5. Approximate inference

Since the posterior (1) cannot be calculated analytically, we resort to an MCMC scheme to sample from this distribution (Mukherjee and Speed, 2008). We always start the sampling chain with a random symmetric adjacency matrix without self-loops. A new sample is proposed based on a previous network $A$ by flipping an edge, resulting in a network $A'$ which, because of the symmetry of $A$, implies $a'_{ij} = 1 - a_{ij}$ and $a'_{ji} = 1 - a_{ji}$. The acceptance of the proposed sample is determined by the ratio

$$\gamma = \frac{P(A' \mid N, \xi)}{P(A \mid N, \xi)}.$$

A proposed network becomes a new sample with probability $\min(1, \gamma)$, with $\log \gamma = \Delta L_{kl} + \Delta P_{kl}$. Here, $\Delta L_{kl}$ and $\Delta P_{kl}$ denote the change in log-likelihood and log-prior, respectively, after flipping edge $a_{kl}$. A complete derivation of these terms is given in Appendix B. The sample distributions were obtained for each subject by drawing ten parallel chains of 300,000 samples, discarding an initial burn-in phase and thinning the chains to reduce dependence between samples. The collection of $T$ accepted samples $\{A^{(1)}, \ldots, A^{(T)}\}$ forms an approximation of the posterior $P(A \mid N, \xi)$. The samples can be used to estimate posterior probabilities of network features, such as the probability of a specific connection. Assuming the Markov chain has converged, the posterior probability of a single connection is given by $\mathbb{E}[a_{ij} \mid N] = \frac{1}{T} \sum_{t=1}^T a^{(t)}_{ij}$. Other summary statistics of the distribution may be estimated in a similar manner. A compact sketch of this sampler is given below.
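The sketch below combines the multi-subject prior of Eq. (3) with the edge-flip Metropolis scheme of this section. Function names, the hyperparameter defaults and the chain length are illustrative assumptions; for clarity it recomputes the full log-likelihood at every proposal instead of using the incremental update of Appendix B.

```python
# Sketch of the edge-flip Metropolis sampler, under the assumptions above.
import numpy as np
from scipy.special import gammaln

def dcm_loglik(A, N, a_plus=1.0, a_minus=0.1):
    """DCM log-likelihood of Eq. (2), dropping terms independent of A.
    Assumes diag(N) = 0."""
    K = len(A)
    B = np.where(A == 1, a_plus, a_minus).astype(float)
    np.fill_diagonal(B, 0.0)                       # no self-connections
    off = ~np.eye(K, dtype=bool)
    row = gammaln(B.sum(1)) - gammaln(B.sum(1) + N.sum(1))
    return row.sum() + (gammaln(B[off] + N[off]) - gammaln(B[off])).sum()

def log_prior(A, P_edge):
    """Log of Eq. (3); a flat prior corresponds to P_edge = 0.5 everywhere."""
    i, j = np.triu_indices_from(A, k=1)
    a, p = A[i, j], P_edge[i, j]
    return np.sum(a * np.log(p) + (1 - a) * np.log1p(-p))

def multi_subject_prior(A_ml_list):
    """Laplace's rule: p_ij = (sum_m a_ij^(m) + 1) / (M + 2), as in Eq. (3)."""
    M = len(A_ml_list)
    return (np.sum(A_ml_list, axis=0) + 1.0) / (M + 2.0)

def sample_posterior(N, P_edge, n_samples=1000, seed=0):
    """Metropolis sampling over simple graphs using single edge flips."""
    rng = np.random.default_rng(seed)
    K = len(N)
    A = np.triu((rng.random((K, K)) < 0.5).astype(int), 1)
    A = A + A.T                                    # random symmetric start
    logp = dcm_loglik(A, N) + log_prior(A, P_edge)
    samples = []
    for _ in range(n_samples):
        k, l = rng.choice(K, size=2, replace=False)
        A2 = A.copy()
        A2[k, l] = A2[l, k] = 1 - A[k, l]          # propose one symmetric flip
        logp2 = dcm_loglik(A2, N) + log_prior(A2, P_edge)
        if np.log(rng.random()) < logp2 - logp:    # accept with prob min(1, gamma)
            A, logp = A2, logp2
        samples.append(A.copy())
    return samples   # E[a_ij | N] is approximated by np.mean(samples, axis=0)
```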
2.6. Validation of structural connectivity estimates

Functional connectivity is constrained by structural connectivity (Honey et al., 2010; Cabral et al., 2012). In other words, when there is functional connectivity, there is often structural connectivity, although structural connectivity is not a necessary requirement for functional connectivity (Honey et al., 2009). We exploit this relationship in the validation of structural connectivity estimates. This is achieved by constraining the conditional independence structure of functional activity by the structural connectivity (Marrelec et al., 2006; Smith et al., 2010; Varoquaux et al., 2010; Deligianni et al., 2011). Assume that a $K \times 1$ vector of BOLD responses $\mathbf{y}$ can be modeled by a zero-mean Gaussian density with inverse covariance matrix $Q$. That is,

$$P(\mathbf{y} \mid Q) = (2\pi)^{-K/2} |Q|^{1/2} \exp\Big\{-\tfrac{1}{2} \mathbf{y}^\top Q \mathbf{y}\Big\}. \quad (4)$$

Then, given acquired resting-state data $D = (\mathbf{y}_1; \ldots; \mathbf{y}_T)$ for $T$ time points, model estimation reduces to finding the maximum likelihood solution $\hat{Q} = \arg\max_Q \prod_t P(\mathbf{y}_t \mid Q)$. However, in general for fMRI data, $K > T$, which implies that the covariance matrix is not of full rank. Hence, finding its inverse requires suboptimal solutions such as the generalized inverse or pseudo-inverse (Ryali et al., 2011). As a solution to this problem, regularization approaches have been suggested to find sparse approximations of the inverse covariance matrix (Friedman et al., 2008; Huang et al., 2010). In our setup, the sparsity structure is readily available in the form of the structural connectivity $A$. In order to use $A$ as a constraint when estimating $\hat{Q}$, we can make use of the fact that variables $y_i$ and $y_j$ are conditionally independent if and only if $q_{ij} = 0$ (Dempster, 1972). That is, we can interpret Eq. (4) as a Gaussian Markov random field with respect to network $A$, such that $a_{ij} = 0 \Rightarrow q_{ij} = 0$ for all $i \neq j$ (Whittaker, 1990). We will use the notation $Q \in \mathcal{A}$ to denote that the independence structure in $Q$ is dictated by $A$.

Let $\hat{\Sigma} = \frac{1}{T} \sum_{t=1}^T \mathbf{y}_t \mathbf{y}_t^\top$ denote the empirical covariance matrix. As shown by Dahl et al. (2008), the ML estimate can be formulated as the following convex optimization problem:

$$\hat{Q} = \arg\max_{Q \in \mathcal{A}} \ell(Q) \quad \text{s.t.} \quad \{q_{ij} = 0 \mid a_{ij} = 0\},$$

where $\ell(Q) = (T/2)\big(\log \det Q - \mathrm{trace}(Q \hat{\Sigma})\big)$ is, up to a constant, the log-likelihood function of $Q$. We made use of a standard convex solver to find this constrained maximum likelihood estimate (Schmidt et al., 2007) and use it to define the score of a particular matrix $A$: $S(A) \equiv \ell(\hat{Q})$. By comparing scores for different structural connectivity estimates, we are able to quantify the performance of a structural network in terms of how well it fits the functional data.

Since we compare different models, we have to take model complexity into account. We could opt for the use of a penalty term such as the Bayesian information criterion. Here, however, we use a more stringent approach, in which we enforce constant model complexity. This is implemented by constraining the number of edges for all networks from one subject to be equal to that of the maximum likelihood (ML) solution $A_{\mathrm{ML}} = \arg\max_A P(N \mid A, a^+, a^-)$ of that particular subject. Recall that this maximum likelihood solution is equivalent to the solution obtained when using a flat prior in our generative model. For the multi-subject prior, the constraint on edge count is achieved by starting out with the converged ML solution and, subsequently, drawing new samples by simultaneously adding and removing an edge. For the thresholded networks, we choose a threshold such that the resulting number of edges is the same as that of the ML solution. Note that this approach is only a way to obtain a fair comparison between different structural networks and not a requirement of the model itself. The threshold was applied to the asymmetric streamline data, normalized according to the number of streamlines emanating from each region. Note that all added edges were symmetric.
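A minimal sketch of the constrained estimate and the resulting score is given below. It uses cvxpy as a generic convex solver, which is a stand-in for the solver of Schmidt et al. (2007) used in the paper; the function name and arguments are our own.

```python
# Sketch of the A-constrained ML precision estimate and the score S(A),
# using cvxpy as an assumed stand-in for the paper's convex solver.
import cvxpy as cp
import numpy as np

def score(A, Sigma, T):
    """S(A) = l(Q-hat) with q_ij forced to zero wherever a_ij = 0 (i != j)."""
    K = Sigma.shape[0]
    Q = cp.Variable((K, K), symmetric=True)
    constraints = [Q[i, j] == 0
                   for i in range(K) for j in range(K)
                   if i != j and A[i, j] == 0]
    # Maximize log det Q - trace(Q Sigma), the Gaussian log-likelihood shape
    problem = cp.Problem(cp.Maximize(cp.log_det(Q) - cp.trace(Sigma @ Q)),
                         constraints)
    problem.solve()
    Q_hat = Q.value
    _, logdet = np.linalg.slogdet(Q_hat)
    return (T / 2.0) * (logdet - np.trace(Q_hat @ Sigma))
```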
3. Results

In order to validate our framework, we made use of resting-state functional data acquired in conjunction with the diffusion imaging data. Specifically, we compared the fit to the functional data of structural networks obtained either by the standard thresholding approach or by the developed generative model. The fit to the functional data is quantified in terms of the score $S(A)$. We performed a comparison using either a flat prior (by choosing $p = 0.5$) or the multi-subject prior. For simplicity, the hyperparameters $a^+$ and $a^-$ were manually set to 1 and 0.1, respectively, as small values for the hyperparameters capture the burstiness phenomenon described in Section 1. For the thresholded approaches, we have one structural network estimate, denoted by $A_T$. In contrast, for our generative model, we have a posterior over structural networks, which gives rise to a distribution of scores $S(A^{(t)})$, where $t$ denotes the sample index.

3.1. Comparing ML estimates with thresholded networks

The sparsity of the maximum likelihood estimates $A_{\mathrm{ML}}$, as obtained with the flat prior, was fairly constant across subjects (standard deviation of 39.4 out of 6670 possible edges). As an example, Fig. 4 shows connectivity results for one subject. Although thresholding of streamline distributions is common practice, how exactly the threshold is applied varies between studies. To have a fair comparison, we investigated the impact of different thresholding approaches. We considered applying the threshold to the maximum, the mean and the minimum of $n_{ij}$ and $n_{ji}$, respectively. To compare our generative model with these approaches, we computed for each subject the fraction of samples from the posterior network distributions that scored higher than thresholding. Let $f_{\text{F-T}}$ be the fraction of samples for which the generative model with a flat prior scored higher than the thresholded network. The results for the distribution of $f_{\text{F-T}}$ over subjects, given the different threshold methods, are shown in Table 1. When the threshold is applied to the maximum of $n_{ij}$ and $n_{ji}$, the generative model outperforms thresholding. However, when either the mean or the minimum of $n_{ij}$ and $n_{ji}$ is used, samples obtained from the posterior with a flat prior score the same as thresholded networks, on average.

To explain this behavior, it is instructive to consider Eq. (B.3) in Appendix B. Given hyperparameters $a^+$ and $a^-$ that are very small compared to the elements of $N$, the change in log-likelihood after flipping edge $a_{ij}$ from absent to present boils down to

$$\Delta L_{ij} \approx (a^+ - a^-) \left[ \log\left(\frac{n_{ij}}{\sum_k n_{ik}}\right) + \log\left(\frac{n_{ji}}{\sum_k n_{jk}}\right) \right].$$

This expression nicely summarizes the ramifications of our model. When sampling over networks, the generative model takes symmetry between streamlines into account (which follows from the sum) and it considers the relative distribution of streamlines (which follows from the fractions). Note that the latter is equivalent to normalizing the streamlines, a required step for thresholding. Thresholding approaches can imitate this behavior of the DCM by thresholding on either the mean or the minimum of $n_{ij}$ and $n_{ji}$, and by normalizing the streamline distribution by the number of outgoing streamlines, as illustrated in the sketch below.
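To make the two terms concrete, here is a small numeric illustration of the approximation above, using the ninety/ten percent example from the introduction; the counts and hyperparameter values are invented for illustration.

```python
# Numeric illustration of the approximate change in log-likelihood;
# counts are hypothetical, not data from the study.
import numpy as np

def delta_L(n_ij, n_i_total, n_ji, n_j_total, a_plus=1.0, a_minus=0.1):
    """Approximate gain from turning edge (i, j) on."""
    return (a_plus - a_minus) * (np.log(n_ij / n_i_total) +
                                 np.log(n_ji / n_j_total))

# 90% of A's streamlines reach B, 10% reach C (and symmetrically back):
print(delta_L(4500, 5000, 4500, 5000))   # edge A-B: ~ -0.19
print(delta_L(500, 5000, 500, 5000))     # edge A-C: ~ -4.14
# The A-B edge incurs a far smaller penalty, so it is much more likely
# to be switched on during sampling, yet both edges remain possible.
```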

3.2. Multi-subject prior

Figure 4: a-e) Connectivity results for the subject for which sampling in conjunction with the multi-subject prior showed the largest improvement. Shown are a) the network obtained through the thresholding approach, b) the posterior connection probabilities according to the flat model, c) the posterior connection probabilities according to the multi-subject model, d) the streamline distribution on a log scale and e) the multi-subject prior based on the other subjects, as used in the multi-subject model. Panel f) shows the most salient differences in connectivity between the maximum a posteriori networks and the thresholding approach, across all subjects. The edges are color-coded. White edges indicate connections that were present in at least 6 subjects while not being part of the thresholded network. Orange edges show converse findings. All matrices are ordered according to the order of the AAL atlas.

Table 1: The fraction of samples that have a higher score than the thresholded networks. The fractions of samples from the distributions with a flat and a multi-subject prior are denoted $f_{\text{F-T}}$ and $f_{\text{M-T}}$, respectively. The different threshold approaches are max, mean and min. The p-values were obtained using a one-sample t-test with $\mu_0 = 0.5$.

  threshold | f_F-T  | p | f_M-T  | p
  max       |        |   | ± 0.07 | < 0.001
  mean      | 0.50 ± |   |        |
  min       | 0.49 ± |   |        |

With optimal threshold settings, it is possible to obtain thresholded networks that perform similarly to the networks we infer through the posterior distribution with a flat prior. However, our model is capable of incorporating additional constraints, such as the multi-subject prior. Let $f_{\text{M-T}}$ be the fraction of samples for which the DCM with the multi-subject prior scored higher than the thresholded network. The results for the distribution of $f_{\text{M-T}}$ over subjects, given the different threshold methods, are shown in Table 1. In addition, we compared the fraction of samples obtained with the multi-subject prior that scored higher than samples with the flat prior, $f_{\text{M-F}}$. We found that this distribution had a mean of 0.64 ± 0.04 ($p < 10^{-3}$).

The likelihood scores estimated for the distributions over samples, obtained using our approach in the presence of either the flat prior or the multi-subject prior, are shown in Fig. 5. In addition, the figure shows the score for the thresholded network, with the threshold applied to the minimum of $n_{ij}$ and $n_{ji}$. The distributions obtained using the multi-subject prior are narrower and therefore more consistent than those obtained with the flat prior. Moreover, likelihood scores obtained using the multi-subject prior tend to be of higher magnitude than those obtained using the flat prior. From these results we conclude that our model is on par with the most optimal thresholding approaches, but that it is capable of surpassing thresholded networks by using informative priors.

Lastly, Fig. 6 shows the connections for which our multi-subject approach differs most from those of thresholding, across all subjects. The edges correspond to those shown in Fig. 4f). The figure shows edges that are present in the maximum a posteriori networks while being absent in the corresponding thresholded networks for at least 6 subjects, and vice versa. The edges consistently and exclusively included by either of the approaches do not differ much in length. In fact, the mean edge lengths are very close: 17.6 ± 1.6 mm for threshold-favored edges and 17.0 ± 1.4 mm for DCM-favored edges. However, we do observe that when using the multi-subject prior, consistency for cerebellar and anterior cortical tracts is increased.

Figure 5: The scores $S(A)$ (horizontal axis) for the thresholded network $A_T$, for samples from the generative model given a flat prior (lower histogram, red), and given the multi-subject prior (upper histogram, green). The fraction of each bin that has a bright color corresponds to the fraction of the other distribution that is outperformed by this bin. These fractions are also shown in the table to the right; $f_{\text{F-T}}$ is the fraction of samples with a flat prior that outperform thresholding, $f_{\text{M-T}}$ is the fraction of samples with the multi-subject prior that outperform thresholding, and $f_{\text{M-F}}$ is the fraction of samples with the multi-subject prior that outperform samples with the flat prior. The subjects are ordered according to the performance of the multi-subject prior approach relative to the thresholded network.

4. Discussion

Standard thresholding approaches to the inference of whole-brain structural networks suffer from the fact that they rely on arbitrary thresholds, while assuming independence between tracts and ignoring prior knowledge. In order to overcome these problems, we have put forward a Bayesian framework for inference of structural brain networks from diffusion-weighted imaging. Our approach makes use of a Dirichlet compound multinomial distribution to model the streamline distribution obtained by probabilistic tractography. In addition, we defined a simple Erdős-Rényi prior as well as a multi-subject prior that uses connectivity estimates from other subjects as an additional source of information. The proposed methodology was validated using simultaneously acquired resting-state functional MRI data. The outcome of our experiments revealed that the generative model combined with a flat prior performs as well as the most optimal thresholded network. The use of an informative multi-subject prior instead created networks that significantly outperformed the thresholding approach. A comparison between the networks obtained with the multi-subject or flat prior showed that the former improved on the latter, thereby motivating the use of the multi-subject prior.

In our setup, the hyperparameters $a^+$ and $a^-$ were set by hand and the edge probability $p$ was chosen to result in a flat prior. Instead, these parameters could have been estimated from the streamline data using empirical Bayes, they could have been integrated out entirely in a fully Bayesian sampling approach, or they could be optimized according to the resting-state functional data. Note further that a fair comparison between networks required model complexity to be controlled. This was achieved via the constraint that networks obtained with either the multi-subject prior or with the thresholding approach had the same number of edges as the most probable network with a flat prior. While this is to the advantage of the thresholding approach, since no arbitrary threshold needs to be chosen, it can only impede networks obtained using the multi-subject prior, since that prior might support a different number of tracts.

Even given optimal settings for the thresholding approach, our approach shows clear benefits. Foremost, the DCM model intuitively assigns probabilities to the existence of edges in the inferred networks, providing a mechanism to cope with the uncertainty in the data. Moreover, the proposed generative model allows for intuitive and well-founded priors, such as the described multi-subject prior. The hierarchical model in Fig. 3 also allows for group-level inference (Robinson et al., 2010).

Figure 6: The most salient differences in connectivity between the maximum a posteriori networks with the multi-subject prior and the thresholded networks, across all subjects. The edges are color-coded. Blue, thick edges indicate connections that were present in at least 6 subjects while not being part of the thresholded network. Red, thin edges show converse findings. Nodes that are not adjacent to any of these edges are omitted.

This means that, given streamline data for multiple subjects, the generative model can be used to infer individual subject connectivity $A$ as well as the group-level parent network $\bar{A}$. This allows one to get a handle on group differences, for instance in the context of clinical neuroscience.

The current work focused mainly on the empirical validation of our theoretical framework using functional data. In future work, we will focus more on the interpretation of the obtained structural connectivity estimates. In this paper we used resting-state fMRI data as a means to validate whole-brain structural networks derived from diffusion-weighted imaging. A logical extension of our work is to derive connectivity based on the integration of these two imaging modalities. This example of Bayesian data fusion requires that we extend the generative model to take functional data into account as well (Rykhlevskaia et al., 2008; Sui et al., 2011). We can then use structural networks as an informed prior for the inference of functional connectivity, or infer structural connectivity from both modalities simultaneously.

An additional benefit of our framework is that the network sparsity follows directly from optimizing Eq. (1). In the thresholding approach, the network sparsity is a consequence of the specific threshold setting. As a byproduct of our study, we have observed that thresholding of streamlines benefits from considering the mean or minimum of the number of streamlines connecting A to B and vice versa. This in itself may lead to improvements in the analysis of structural connectivity.

Summarizing, we proposed a Bayesian framework which lays the foundations for a theoretically sound approach to the inference of whole-brain structural networks. This framework does not suffer from the issues which plague current thresholding approaches to structural connectivity estimation, and it has been shown to give rise to substantially improved structural connectivity estimates. The proposed generative model is easily modified to incorporate other sources of information, thereby further facilitating the estimation of whole-brain structural networks in vivo.

Acknowledgements

The authors gratefully acknowledge the support of the BrainGain Smart Mix Programme of the Netherlands Ministry of Economic Affairs and the Netherlands Ministry of Education, Culture and Science. The authors thank Erik van Oort and David Norris for the acquisition of the rs-fMRI and DWI data, and the anonymous reviewers for their valuable suggestions and comments to improve the quality of the paper.

Appendix A. Derivation of the multi-subject prior

We describe here the derivation of the multi-subject prior based on the maximum likelihood estimates for previously seen subjects, as described in Section 2.4.3.

The prior on $A \equiv A^{(M+1)}$ is given by

$$P(A \mid N, \xi) = \sum_{\bar{A}} P(A \mid \bar{A})\, P(\bar{A} \mid N, \xi) \propto \sum_{\bar{A}} P(A \mid \bar{A})\, P(\bar{A} \mid p) \prod_{m=1}^M \sum_{A^{(m)}} P\big(N^{(m)} \mid A^{(m)}, a^+, a^-\big)\, P\big(A^{(m)} \mid \bar{A}\big).$$

We approximate this quantity by assuming that the main contribution to the sum over $A^{(m)}$ is due to the ML solution

$$\hat{A}^{(m)} = \arg\max_{A^{(m)}} P\big(N^{(m)} \mid A^{(m)}, a^+, a^-\big).$$

Following the Erdős-Rényi model with $p = 0.5$ for $P(\bar{A} \mid p)$ gives a flat prior on simple graphs. Up to irrelevant constants, and keeping in mind that $\hat{A}^{(m)}$ depends on $N^{(m)}$, the prior is rewritten as

$$P(A \mid N, \xi) \propto \sum_{\bar{A}} P(A \mid \bar{A}) \prod_{m=1}^M P\big(\hat{A}^{(m)} \mid \bar{A}\big),$$

with $\hat{A} = \{\hat{A}^{(1)}, \ldots, \hat{A}^{(M)}\}$ the different ML solutions. We assume that the prior factorizes into

$$P\big(A^{(M+1)} \mid N, \xi\big) = \prod_{i<j} P\big(a^{(M+1)}_{ij} \mid N, \xi\big). \quad \text{(A.1)}$$

Next, we define the probability that $a^{(m)}_{ij}$, $m = 1, \ldots, M+1$, inherits the connectivity of the parent network $\bar{a}_{ij}$ by

$$P\big(a^{(m)}_{ij} = 1 \mid \bar{a}_{ij} = 1\big) = P\big(a^{(m)}_{ij} = 0 \mid \bar{a}_{ij} = 0\big) \equiv q_{ij},$$

with $q_{ij}$ close to 1. That is, each $a^{(m)}_{ij}$ is a copy of $\bar{a}_{ij}$ with unknown probability $q_{ij}$. The copying probabilities are assumed to be independent and have a flat prior. Estimating the prior probability for each edge is then nothing but an instance of Laplace's rule of succession. This says that, if we repeat an experiment that we know can result in a success (presence of an edge) or failure (absence of an edge) $M$ times independently, and get $\sum_{m=1}^M \hat{a}^{(m)}_{ij}$ successes, then our best estimate of the probability that the next repetition $a^{(M+1)}_{ij}$ will be a success is

$$P\big(a^{(M+1)}_{ij} = 1 \mid \hat{a}^{(1)}_{ij}, \ldots, \hat{a}^{(M)}_{ij}\big) = \frac{\sum_{m=1}^M \hat{a}^{(m)}_{ij} + 1}{M + 2} \equiv p_{ij}.$$

Plugging this into Eq. (A.1), we obtain the prior

$$P\big(A^{(M+1)} \mid N, \xi\big) = \prod_{i<j} p_{ij}^{a^{(M+1)}_{ij}} \big(1 - p_{ij}\big)^{1 - a^{(M+1)}_{ij}}.$$

Appendix B. MCMC sampling

We derive here the acceptance ratio $\gamma$ of a sample $A'$ in the sampling chain as a function of one edge flip in $A$ (see Section 2.5). Note that each of the $2^{K(K-1)/2}$ possible networks $A$ has a probability greater than zero of being constructed, which guarantees that the Markov chain is irreducible. The log acceptance ratio of a proposed sample can be calculated as $\log \gamma = \Delta L_{kl} + \Delta P_{kl}$, with $\Delta L_{kl}$ and $\Delta P_{kl}$ the change in log-likelihood and log-prior, respectively, after flipping edge $a_{kl}$. The sampling approach requires that we can efficiently update both the likelihood and the prior for new samples in the Markov chain. The log-likelihood contribution of region $i$ is given by

$$L_i = \log \frac{N_i!}{\prod_j n_{ij}!} + \log \frac{\Gamma\big(\sum_j b_{ij}\big)}{\Gamma\big(\sum_j b_{ij} + N_i\big)} + \sum_j \log \frac{\Gamma(b_{ij} + n_{ij})}{\Gamma(b_{ij})} \quad \text{(B.1)}$$

with $b_{ij} \equiv a_{ij} a^+ + (1 - a_{ij}) a^-$. The change in log-likelihood as a consequence of flipping edge $a_{kl}$ is defined as

$$\Delta L_{kl} = \log P(N \mid A', a^+, a^-) - \log P(N \mid A, a^+, a^-), \quad \text{(B.2)}$$

where $A'$ differs from $A$ only in that $a'_{kl} = a'_{lk} = 1 - a_{kl}$. Plugging (B.1) into (B.2) yields

$$\begin{aligned} \Delta L_{kl} = {} & \log\frac{\Gamma(b'_{kl} + n_{kl})}{\Gamma(b_{kl} + n_{kl})} + \log\frac{\Gamma(b'_{lk} + n_{lk})}{\Gamma(b_{lk} + n_{lk})} - 2\log\frac{\Gamma(b'_{kl})}{\Gamma(b_{kl})} \\ & + \log\frac{\Gamma\big(\sum_j b'_{kj}\big)}{\Gamma\big(\sum_j b_{kj}\big)} + \log\frac{\Gamma\big(\sum_j b'_{lj}\big)}{\Gamma\big(\sum_j b_{lj}\big)} \\ & - \log\frac{\Gamma\big(\sum_j (b'_{kj} + n_{kj})\big)}{\Gamma\big(\sum_j (b_{kj} + n_{kj})\big)} - \log\frac{\Gamma\big(\sum_j (b'_{lj} + n_{lj})\big)}{\Gamma\big(\sum_j (b_{lj} + n_{lj})\big)}, \end{aligned} \quad \text{(B.3)}$$

where $b'_{ij}$ denotes the value of $b_{ij}$ computed from $A'$ (only $b'_{kl} = b'_{lk}$ differ from their unprimed counterparts). The change in the log-prior as a consequence of flipping $a_{kl}$ to $1 - a_{kl}$ follows from the definition of the prior in Eq. (3):

$$\Delta P_{kl} = \log P(A' \mid N, \xi) - \log P(A \mid N, \xi) = \big(4 a'_{kl} - 2\big) \left[ \log\frac{p_{kl}}{1 - p_{kl}} + \log\frac{p_{lk}}{1 - p_{lk}} \right].$$

Here the edge probability $p_{kl}$ is the same for all edges in the case of the Erdős-Rényi model, and is estimated separately per edge in the case of the multi-subject prior.
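A direct transcription of the incremental update (B.3) follows. This is a sketch written for clarity rather than speed (an efficient implementation would cache the row sums); the function name and conventions are ours, and it assumes diag(N) = 0 as in the main text.

```python
# Sketch of the incremental DCM log-likelihood update of Eq. (B.3).
import numpy as np
from scipy.special import gammaln

def delta_loglik(A, N, k, l, a_plus=1.0, a_minus=0.1):
    """Change in DCM log-likelihood after flipping edge (k, l) of A."""
    def b_row(a_row, i):
        b = np.where(a_row == 1, a_plus, a_minus).astype(float)
        b[i] = 0.0                      # no self-connection parameter
        return b

    def row_delta(i, j):
        a_new = A[i].copy()
        a_new[j] = 1 - A[i, j]          # the flipped row of A'
        b, b2 = b_row(A[i], i), b_row(a_new, i)
        Ni = N[i].sum()
        return (gammaln(b2[j] + N[i, j]) - gammaln(b[j] + N[i, j])
                - gammaln(b2[j]) + gammaln(b[j])
                + gammaln(b2.sum()) - gammaln(b.sum())
                - gammaln(b2.sum() + Ni) + gammaln(b.sum() + Ni))

    # only rows k and l of the likelihood change under a symmetric flip
    return row_delta(k, l) + row_delta(l, k)
```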

References

Bassett, D.S., Brown, J.A., Deshpande, V., Carlson, J.M., Grafton, S.T., 2011. Conserved and variable architecture of human white matter connectivity. NeuroImage 54.
Behrens, T.E.J., Johansen-Berg, H., Jbabdi, S., Rushworth, M.F.S., Woolrich, M.W., 2007. Probabilistic diffusion tractography with multiple fibre orientations: What can we gain? NeuroImage 34.
Behrens, T.E.J., Woolrich, M.W., Jenkinson, M., Johansen-Berg, H., Nunes, R.G., Clare, S., Matthews, P.M., Brady, J.M., Smith, S.M., 2003. Characterization and propagation of uncertainty in diffusion-weighted MR imaging. Magnet. Reson. Med. 50.
Cabral, J., Hugues, E., Kringelbach, M.L., Deco, G., 2012. Modeling the outcome of structural disconnection on resting-state functional connectivity. NeuroImage 62.
Catani, M., 2007. From hodology to function. Brain 130.
Chung, H.W., Chou, M.C., Chen, C.Y., 2010. Principles and limitations of computational algorithms in clinical diffusion tensor MR tractography. Am. J. Neuroradiol. 32.
Chung, M.K., Adluru, N., Dalton, K.M., Alexander, A.L., Davidson, R.J., 2011. Scalable brain network construction on white matter fibers, in: SPIE Medical Imaging.
Conturo, T.E., Lori, N.F., Cull, T.S., Akbudak, E., Snyder, A.Z., Shimony, J.S., McKinstry, R.C., Burton, H., Raichle, M.E., 1999. Tracking neuronal fiber pathways in the living human brain. Proc. Natl. Acad. Sci. USA 96.
Cook, P.A., Bai, Y., Seunarine, K.K., Hall, M.G., Parker, G.J., Alexander, D.C., 2006. Camino: open-source diffusion-MRI reconstruction and processing, in: 14th Scientific Meeting of the International Society for Magnetic Resonance in Medicine, Seattle, WA, USA.
Dahl, J., Vandenberghe, L., Roychowdhury, V., 2008. Covariance selection for non-chordal graphs via chordal embedding. Optim. Method. Softw. 23.
Damoiseaux, J.S., Greicius, M.D., 2009. Greater than the sum of its parts: a review of studies combining structural connectivity and resting-state functional connectivity. Brain Struct. Funct. 213.
Deligianni, F., Varoquaux, G., Thirion, B., Robinson, E., Sharp, D.J., Edwards, A.D., Rueckert, D., 2011. A probabilistic framework to infer brain functional connectivity from anatomical connections, in: Information Processing in Medical Imaging, Springer.
Dempster, A.P., 1972. Covariance selection. Biometrics 28.
Erdős, P., Rényi, A., 1960. On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5.
Friedman, J., Hastie, T., Tibshirani, R., 2008. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9.
Friman, O., Farnebäck, G., Westin, C.F., 2006. A Bayesian approach for stochastic white matter tractography. IEEE Trans. Med. Imag. 25.
Gong, G., Rosa-Neto, P., Carbonell, F., Chen, Z.J., He, Y., Evans, A., 2009. Age- and gender-related differences in the cortical anatomical network. J. Neurosci. 29.
Greicius, M.D., Supekar, K., Menon, V., Dougherty, R.F., 2009. Resting-state functional connectivity reflects structural connectivity in the default mode network. Cereb. Cortex 19.
Groves, A.R., Beckmann, C.F., Smith, S.M., Woolrich, M.W., 2011. Linked independent component analysis for multimodal data fusion. NeuroImage 54.
Hagmann, P., Cammoun, L., Gigandet, X., Meuli, R., Honey, C.J., Wedeen, V., Sporns, O., 2008. Mapping the structural core of human cerebral cortex. PLoS Biol. 6, e159.
Hagmann, P., Kurant, M., Gigandet, X., Thiran, P., Wedeen, V., Meuli, R., Thiran, J.P., 2007. Mapping human whole-brain structural networks with diffusion MRI. PLoS ONE 2, e597.
Honey, C.J., Thivierge, J.-P., Sporns, O., 2010. Can structure predict function in the human brain? NeuroImage 52.
Honey, C.J., Sporns, O., Cammoun, L., Gigandet, X., Thiran, J.P., Meuli, R., Hagmann, P., 2009. Predicting human resting-state functional connectivity from structural connectivity. Proc. Natl. Acad. Sci. USA 106.
Horwitz, B., Poeppel, D., 2002. How can EEG/MEG and fMRI/PET data be combined? Hum. Brain Mapp. 17, 1-3.
Huang, S., Li, J., Sun, L., Ye, J., Fleisher, A., Wu, T., Chen, K., Reiman, E., 2010. Learning brain connectivity of Alzheimer's disease by sparse inverse covariance estimation. NeuroImage 50.
Jbabdi, S., Woolrich, M.W., Andersson, J.L.R., Behrens, T.E.J., 2007. A Bayesian framework for global tractography. NeuroImage 37.
Kaden, E., Knösche, T.R., Anwander, A., 2007. Parametric spherical deconvolution: Inferring anatomical connectivity using diffusion MR imaging. NeuroImage 37.
Koch, M., Norris, D.G., Hund-Georgiadis, M., 2002. An investigation of functional and anatomical connectivity using magnetic resonance imaging. NeuroImage 16.
Lv, J., Guo, L., Hu, X., Zhang, T., Li, K., Zhang, D., Yang, J., Liu, T., 2010. Fiber-centered analysis of brain connectivities using DTI and resting state fMRI data. Med. Image Comput. Comput. Assist. Interv. 13.
Madsen, R.E., Kauchak, D., Elkan, C., 2005. Modeling word burstiness using the Dirichlet distribution, in: Proceedings of the 22nd International Conference on Machine Learning, ACM, New York, NY, USA.
Marrelec, G., Krainik, A., Duffau, H., Pélégrini-Issac, M., Lehéricy, S., Doyon, J., Benali, H., 2006. Partial correlation for functional brain interactivity investigation in functional MRI. NeuroImage 32.
Minka, T.P., 2000. Estimating a Dirichlet distribution. Technical Report, MIT.
Mukherjee, S., Speed, T.P., 2008. Network inference using informative priors. Proc. Natl. Acad. Sci. USA 105.
Park, C., Kim, S., Kim, Y., Kim, K., 2008. Comparison of the small-world topology between anatomical and functional connectivity in the human brain. Physica A 387.
Penny, W.D., Friston, K.J., Ashburner, J.T., Kiebel, S.J., Nichols, T.E. (Eds.), 2006. Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press Inc., 1st edition.
Robinson, E.C., Hammers, A., Ericsson, A., Edwards, A.D., Rueckert, D., 2010. Identifying population differences in whole-brain structural networks: a machine learning approach. NeuroImage 50.
Robinson, E.C., Valstar, M., Hammers, A., Ericsson, A., Edwards, A.D., Rueckert, D., 2008. Multivariate statistical analysis of whole brain structural networks obtained using probabilistic tractography. Med. Image Comput. Comput. Assist. Interv. 11.
Ryali, S., Chen, T., Supekar, K., Menon, V., 2011. Estimation of functional connectivity in fMRI data using stability selection-based sparse partial correlation with elastic net penalty. NeuroImage 59.
Rykhlevskaia, E., Gratton, G., Fabiani, M., 2008. Combining structural and functional neuroimaging data for studying brain connectivity: A review. Psychophysiology 45.
Schmidt, M., Fung, G., Rosales, R., 2007. Fast optimization methods for L1 regularization: A comparative study and two new approaches, in: Machine Learning: ECML 2007.
Shu, N., Liu, Y., Li, K., Duan, Y., Wang, J., Yu, C., Dong, H., Ye, J., He, Y., 2011. Diffusion tensor tractography reveals disrupted topological efficiency in white matter structural networks in multiple sclerosis. Cereb. Cortex 21.
Simpson, S.L., Hayasaka, S., Laurienti, P.J., 2011. Exponential Random Graph Modeling for complex brain networks. PLoS ONE 6.
Skudlarski, P., Jagannathan, K., Calhoun, V., Hampson, M., Skudlarska, B.A., Pearlson, G., 2008. Measuring brain connectivity: diffusion tensor imaging validates resting state temporal correlations. NeuroImage 43.
Smith, S.M., 2002. Fast robust automated brain extraction. Hum. Brain Mapp. 17.


More information

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal

More information

Stochastic Dynamic Causal Modelling for resting-state fmri

Stochastic Dynamic Causal Modelling for resting-state fmri Stochastic Dynamic Causal Modelling for resting-state fmri Ged Ridgway, FIL, Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology, London Overview Connectivity in the brain Introduction to

More information

Network Modeling and Functional Data Methods for Brain Functional Connectivity Studies. Kehui Chen

Network Modeling and Functional Data Methods for Brain Functional Connectivity Studies. Kehui Chen Network Modeling and Functional Data Methods for Brain Functional Connectivity Studies Kehui Chen Department of Statistics, University of Pittsburgh Nonparametric Statistics Workshop, Ann Arbor, Oct 06,

More information

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS023) p.3938 An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Vitara Pungpapong

More information

A prior distribution over model space p(m) (or hypothesis space ) can be updated to a posterior distribution after observing data y.

A prior distribution over model space p(m) (or hypothesis space ) can be updated to a posterior distribution after observing data y. June 2nd 2011 A prior distribution over model space p(m) (or hypothesis space ) can be updated to a posterior distribution after observing data y. This is implemented using Bayes rule p(m y) = p(y m)p(m)

More information

A hierarchical group ICA model for assessing covariate effects on brain functional networks

A hierarchical group ICA model for assessing covariate effects on brain functional networks A hierarchical group ICA model for assessing covariate effects on brain functional networks Ying Guo Joint work with Ran Shi Department of Biostatistics and Bioinformatics Emory University 06/03/2013 Ying

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

On the errors introduced by the naive Bayes independence assumption

On the errors introduced by the naive Bayes independence assumption On the errors introduced by the naive Bayes independence assumption Author Matthijs de Wachter 3671100 Utrecht University Master Thesis Artificial Intelligence Supervisor Dr. Silja Renooij Department of

More information

Advanced Topics and Diffusion MRI

Advanced Topics and Diffusion MRI Advanced Topics and Diffusion MRI Slides originally by Karla Miller, FMRIB Centre Modified by Mark Chiew (mark.chiew@ndcn.ox.ac.uk) Slides available at: http://users.fmrib.ox.ac.uk/~mchiew/teaching/ MRI

More information

Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation

Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Daniel B Rowe Division of Biostatistics Medical College of Wisconsin Technical Report 40 November 00 Division of Biostatistics

More information

Bayesian time series classification

Bayesian time series classification Bayesian time series classification Peter Sykacek Department of Engineering Science University of Oxford Oxford, OX 3PJ, UK psyk@robots.ox.ac.uk Stephen Roberts Department of Engineering Science University

More information

Sparse Linear Models (10/7/13)

Sparse Linear Models (10/7/13) STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine

More information

Computing FMRI Activations: Coefficients and t-statistics by Detrending and Multiple Regression

Computing FMRI Activations: Coefficients and t-statistics by Detrending and Multiple Regression Computing FMRI Activations: Coefficients and t-statistics by Detrending and Multiple Regression Daniel B. Rowe and Steven W. Morgan Division of Biostatistics Medical College of Wisconsin Technical Report

More information

The effect of different number of diffusion gradients on SNR of diffusion tensor-derived measurement maps

The effect of different number of diffusion gradients on SNR of diffusion tensor-derived measurement maps J. Biomedical Science and Engineering, 009,, 96-101 The effect of different number of diffusion gradients on SNR of diffusion tensor-derived measurement maps Na Zhang 1, Zhen-Sheng Deng 1*, Fang Wang 1,

More information

Segmenting Thalamic Nuclei: What Can We Gain From HARDI?

Segmenting Thalamic Nuclei: What Can We Gain From HARDI? Segmenting Thalamic Nuclei: What Can We Gain From HARDI? Thomas Schultz Computation Institute, University of Chicago, Chicago IL, USA Abstract. The contrast provided by diffusion MRI has been exploited

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

The connection of dropout and Bayesian statistics

The connection of dropout and Bayesian statistics The connection of dropout and Bayesian statistics Interpretation of dropout as approximate Bayesian modelling of NN http://mlg.eng.cam.ac.uk/yarin/thesis/thesis.pdf Dropout Geoffrey Hinton Google, University

More information

Effective Connectivity & Dynamic Causal Modelling

Effective Connectivity & Dynamic Causal Modelling Effective Connectivity & Dynamic Causal Modelling Hanneke den Ouden Donders Centre for Cognitive Neuroimaging Radboud University Nijmegen Advanced SPM course Zurich, Februari 13-14, 2014 Functional Specialisation

More information

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008 MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, etworks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

9 Forward-backward algorithm, sum-product on factor graphs

9 Forward-backward algorithm, sum-product on factor graphs Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 9 Forward-backward algorithm, sum-product on factor graphs The previous

More information

10708 Graphical Models: Homework 2

10708 Graphical Models: Homework 2 10708 Graphical Models: Homework 2 Due Monday, March 18, beginning of class Feburary 27, 2013 Instructions: There are five questions (one for extra credit) on this assignment. There is a problem involves

More information

A graph contains a set of nodes (vertices) connected by links (edges or arcs)

A graph contains a set of nodes (vertices) connected by links (edges or arcs) BOLTZMANN MACHINES Generative Models Graphical Models A graph contains a set of nodes (vertices) connected by links (edges or arcs) In a probabilistic graphical model, each node represents a random variable,

More information

Regularization Parameter Selection for a Bayesian Multi-Level Group Lasso Regression Model with Application to Imaging Genomics

Regularization Parameter Selection for a Bayesian Multi-Level Group Lasso Regression Model with Application to Imaging Genomics Regularization Parameter Selection for a Bayesian Multi-Level Group Lasso Regression Model with Application to Imaging Genomics arxiv:1603.08163v1 [stat.ml] 7 Mar 016 Farouk S. Nathoo, Keelin Greenlaw,

More information

MULTISCALE MODULARITY IN BRAIN SYSTEMS

MULTISCALE MODULARITY IN BRAIN SYSTEMS MULTISCALE MODULARITY IN BRAIN SYSTEMS Danielle S. Bassett University of California Santa Barbara Department of Physics The Brain: A Multiscale System Spatial Hierarchy: Temporal-Spatial Hierarchy: http://www.idac.tohoku.ac.jp/en/frontiers/column_070327/figi-i.gif

More information

Chris Bishop s PRML Ch. 8: Graphical Models

Chris Bishop s PRML Ch. 8: Graphical Models Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular

More information

25 : Graphical induced structured input/output models

25 : Graphical induced structured input/output models 10-708: Probabilistic Graphical Models 10-708, Spring 2016 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Raied Aljadaany, Shi Zong, Chenchen Zhu Disclaimer: A large

More information

Variational Principal Components

Variational Principal Components Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings

More information

Directed acyclic graphs and the use of linear mixed models

Directed acyclic graphs and the use of linear mixed models Directed acyclic graphs and the use of linear mixed models Siem H. Heisterkamp 1,2 1 Groningen Bioinformatics Centre, University of Groningen 2 Biostatistics and Research Decision Sciences (BARDS), MSD,

More information

Extracting the Core Structural Connectivity Network: Guaranteeing Network Connectedness Through a Graph-Theoretical Approach

Extracting the Core Structural Connectivity Network: Guaranteeing Network Connectedness Through a Graph-Theoretical Approach Extracting the Core Structural Connectivity Network: Guaranteeing Network Connectedness Through a Graph-Theoretical Approach Demian Wassermann, Dorian Mazauric, Guillermo Gallardo-Diez, Rachid Deriche

More information

Bayesian Learning in Undirected Graphical Models

Bayesian Learning in Undirected Graphical Models Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ and Center for Automated Learning and

More information

13: Variational inference II

13: Variational inference II 10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational

More information

Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries

Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries Anonymous Author(s) Affiliation Address email Abstract 1 2 3 4 5 6 7 8 9 10 11 12 Probabilistic

More information

Model Comparison. Course on Bayesian Inference, WTCN, UCL, February Model Comparison. Bayes rule for models. Linear Models. AIC and BIC.

Model Comparison. Course on Bayesian Inference, WTCN, UCL, February Model Comparison. Bayes rule for models. Linear Models. AIC and BIC. Course on Bayesian Inference, WTCN, UCL, February 2013 A prior distribution over model space p(m) (or hypothesis space ) can be updated to a posterior distribution after observing data y. This is implemented

More information

Human Brain Networks. Aivoaakkoset BECS-C3001"

Human Brain Networks. Aivoaakkoset BECS-C3001 Human Brain Networks Aivoaakkoset BECS-C3001" Enrico Glerean (MSc), Brain & Mind Lab, BECS, Aalto University" www.glerean.com @eglerean becs.aalto.fi/bml enrico.glerean@aalto.fi" Why?" 1. WHY BRAIN NETWORKS?"

More information

bound on the likelihood through the use of a simpler variational approximating distribution. A lower bound is particularly useful since maximization o

bound on the likelihood through the use of a simpler variational approximating distribution. A lower bound is particularly useful since maximization o Category: Algorithms and Architectures. Address correspondence to rst author. Preferred Presentation: oral. Variational Belief Networks for Approximate Inference Wim Wiegerinck David Barber Stichting Neurale

More information

Bayesian System Identification based on Hierarchical Sparse Bayesian Learning and Gibbs Sampling with Application to Structural Damage Assessment

Bayesian System Identification based on Hierarchical Sparse Bayesian Learning and Gibbs Sampling with Application to Structural Damage Assessment Bayesian System Identification based on Hierarchical Sparse Bayesian Learning and Gibbs Sampling with Application to Structural Damage Assessment Yong Huang a,b, James L. Beck b,* and Hui Li a a Key Lab

More information

Lecture 13 : Variational Inference: Mean Field Approximation

Lecture 13 : Variational Inference: Mean Field Approximation 10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 13 : Variational Inference: Mean Field Approximation Lecturer: Willie Neiswanger Scribes: Xupeng Tong, Minxing Liu 1 Problem Setup 1.1

More information

Functional Connectivity and Network Methods

Functional Connectivity and Network Methods 18/Sep/2013" Functional Connectivity and Network Methods with functional magnetic resonance imaging" Enrico Glerean (MSc), Brain & Mind Lab, BECS, Aalto University" www.glerean.com @eglerean becs.aalto.fi/bml

More information

EEG/MEG Inverse Solution Driven by fmri

EEG/MEG Inverse Solution Driven by fmri EEG/MEG Inverse Solution Driven by fmri Yaroslav Halchenko CS @ NJIT 1 Functional Brain Imaging EEG ElectroEncephaloGram MEG MagnetoEncephaloGram fmri Functional Magnetic Resonance Imaging others 2 Functional

More information

Chapter 4 Dynamic Bayesian Networks Fall Jin Gu, Michael Zhang

Chapter 4 Dynamic Bayesian Networks Fall Jin Gu, Michael Zhang Chapter 4 Dynamic Bayesian Networks 2016 Fall Jin Gu, Michael Zhang Reviews: BN Representation Basic steps for BN representations Define variables Define the preliminary relations between variables Check

More information

Hierarchical Clustering Identifies Hub Nodes in a Model of Resting-State Brain Activity

Hierarchical Clustering Identifies Hub Nodes in a Model of Resting-State Brain Activity WCCI 22 IEEE World Congress on Computational Intelligence June, -5, 22 - Brisbane, Australia IJCNN Hierarchical Clustering Identifies Hub Nodes in a Model of Resting-State Brain Activity Mark Wildie and

More information

New Machine Learning Methods for Neuroimaging

New Machine Learning Methods for Neuroimaging New Machine Learning Methods for Neuroimaging Gatsby Computational Neuroscience Unit University College London, UK Dept of Computer Science University of Helsinki, Finland Outline Resting-state networks

More information

BMI/STAT 768 : Lecture 13 Sparse Models in Images

BMI/STAT 768 : Lecture 13 Sparse Models in Images BMI/STAT 768 : Lecture 13 Sparse Models in Images Moo K. Chung mkchung@wisc.edu March 25, 2018 1 Why sparse models are needed? If we are interested quantifying the voxel measurements in every voxel in

More information

Variational Bayesian Inference Techniques

Variational Bayesian Inference Techniques Advanced Signal Processing 2, SE Variational Bayesian Inference Techniques Johann Steiner 1 Outline Introduction Sparse Signal Reconstruction Sparsity Priors Benefits of Sparse Bayesian Inference Variational

More information

Bayesian Inference for the Multivariate Normal

Bayesian Inference for the Multivariate Normal Bayesian Inference for the Multivariate Normal Will Penny Wellcome Trust Centre for Neuroimaging, University College, London WC1N 3BG, UK. November 28, 2014 Abstract Bayesian inference for the multivariate

More information

Bayesian Models in Machine Learning

Bayesian Models in Machine Learning Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of

More information

17 : Markov Chain Monte Carlo

17 : Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo

More information

Will Penny. SPM for MEG/EEG, 15th May 2012

Will Penny. SPM for MEG/EEG, 15th May 2012 SPM for MEG/EEG, 15th May 2012 A prior distribution over model space p(m) (or hypothesis space ) can be updated to a posterior distribution after observing data y. This is implemented using Bayes rule

More information

Diffusion Tensor Imaging I. Jennifer Campbell

Diffusion Tensor Imaging I. Jennifer Campbell Diffusion Tensor Imaging I Jennifer Campbell Diffusion Imaging Molecular diffusion The diffusion tensor Diffusion weighting in MRI Alternatives to the tensor Overview of applications Diffusion Imaging

More information

Expectation propagation for signal detection in flat-fading channels

Expectation propagation for signal detection in flat-fading channels Expectation propagation for signal detection in flat-fading channels Yuan Qi MIT Media Lab Cambridge, MA, 02139 USA yuanqi@media.mit.edu Thomas Minka CMU Statistics Department Pittsburgh, PA 15213 USA

More information

Bayesian inference J. Daunizeau

Bayesian inference J. Daunizeau Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty

More information

Bayesian Networks in Educational Assessment

Bayesian Networks in Educational Assessment Bayesian Networks in Educational Assessment Estimating Parameters with MCMC Bayesian Inference: Expanding Our Context Roy Levy Arizona State University Roy.Levy@asu.edu 2017 Roy Levy MCMC 1 MCMC 2 Posterior

More information

Learning latent structure in complex networks

Learning latent structure in complex networks Learning latent structure in complex networks Lars Kai Hansen www.imm.dtu.dk/~lkh Current network research issues: Social Media Neuroinformatics Machine learning Joint work with Morten Mørup, Sune Lehmann

More information

NCNC FAU. Modeling the Network Architecture of the Human Brain

NCNC FAU. Modeling the Network Architecture of the Human Brain NCNC 2010 - FAU Modeling the Network Architecture of the Human Brain Olaf Sporns Department of Psychological and Brain Sciences Indiana University, Bloomington, IN 47405 http://www.indiana.edu/~cortex,

More information

Dynamic causal modeling for fmri

Dynamic causal modeling for fmri Dynamic causal modeling for fmri Methods and Models for fmri, HS 2015 Jakob Heinzle Structural, functional & effective connectivity Sporns 2007, Scholarpedia anatomical/structural connectivity - presence

More information

Introduction to Probabilistic Machine Learning

Introduction to Probabilistic Machine Learning Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning

More information

Part 1: Expectation Propagation

Part 1: Expectation Propagation Chalmers Machine Learning Summer School Approximate message passing and biomedicine Part 1: Expectation Propagation Tom Heskes Machine Learning Group, Institute for Computing and Information Sciences Radboud

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS Maya Gupta, Luca Cazzanti, and Santosh Srivastava University of Washington Dept. of Electrical Engineering Seattle,

More information

Discrete Mathematics and Probability Theory Fall 2015 Lecture 21

Discrete Mathematics and Probability Theory Fall 2015 Lecture 21 CS 70 Discrete Mathematics and Probability Theory Fall 205 Lecture 2 Inference In this note we revisit the problem of inference: Given some data or observations from the world, what can we infer about

More information

Basic Sampling Methods

Basic Sampling Methods Basic Sampling Methods Sargur Srihari srihari@cedar.buffalo.edu 1 1. Motivation Topics Intractability in ML How sampling can help 2. Ancestral Sampling Using BNs 3. Transforming a Uniform Distribution

More information

Learning and Memory in Neural Networks

Learning and Memory in Neural Networks Learning and Memory in Neural Networks Guy Billings, Neuroinformatics Doctoral Training Centre, The School of Informatics, The University of Edinburgh, UK. Neural networks consist of computational units

More information

RaRE: Social Rank Regulated Large-scale Network Embedding

RaRE: Social Rank Regulated Large-scale Network Embedding RaRE: Social Rank Regulated Large-scale Network Embedding Authors: Yupeng Gu 1, Yizhou Sun 1, Yanen Li 2, Yang Yang 3 04/26/2018 The Web Conference, 2018 1 University of California, Los Angeles 2 Snapchat

More information

Multiple-step Time Series Forecasting with Sparse Gaussian Processes

Multiple-step Time Series Forecasting with Sparse Gaussian Processes Multiple-step Time Series Forecasting with Sparse Gaussian Processes Perry Groot ab Peter Lucas a Paul van den Bosch b a Radboud University, Model-Based Systems Development, Heyendaalseweg 135, 6525 AJ

More information

Chapter 35 Bayesian STAI Anxiety Index Predictions Based on Prefrontal Cortex NIRS Data for the Resting State

Chapter 35 Bayesian STAI Anxiety Index Predictions Based on Prefrontal Cortex NIRS Data for the Resting State Chapter 35 Bayesian STAI Anxiety Index Predictions Based on Prefrontal Cortex NIRS Data for the Resting State Masakaze Sato, Wakana Ishikawa, Tomohiko Suzuki, Takashi Matsumoto, Takeo Tsujii, and Kaoru

More information

Based on slides by Richard Zemel

Based on slides by Richard Zemel CSC 412/2506 Winter 2018 Probabilistic Learning and Reasoning Lecture 3: Directed Graphical Models and Latent Variables Based on slides by Richard Zemel Learning outcomes What aspects of a model can we

More information

Graphical Models and Kernel Methods

Graphical Models and Kernel Methods Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.

More information

Introduction to Probabilistic Graphical Models

Introduction to Probabilistic Graphical Models Introduction to Probabilistic Graphical Models Sargur Srihari srihari@cedar.buffalo.edu 1 Topics 1. What are probabilistic graphical models (PGMs) 2. Use of PGMs Engineering and AI 3. Directionality in

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College

More information

Model Selection and Estimation of Multi-Compartment Models in Diffusion MRI with a Rician Noise Model

Model Selection and Estimation of Multi-Compartment Models in Diffusion MRI with a Rician Noise Model Model Selection and Estimation of Multi-Compartment Models in Diffusion MRI with a Rician Noise Model Xinghua Zhu, Yaniv Gur, Wenping Wang, P. Thomas Fletcher The University of Hong Kong, Department of

More information

Algorithmisches Lernen/Machine Learning

Algorithmisches Lernen/Machine Learning Algorithmisches Lernen/Machine Learning Part 1: Stefan Wermter Introduction Connectionist Learning (e.g. Neural Networks) Decision-Trees, Genetic Algorithms Part 2: Norman Hendrich Support-Vector Machines

More information

Decoding conceptual representations

Decoding conceptual representations Decoding conceptual representations!!!! Marcel van Gerven! Computational Cognitive Neuroscience Lab (www.ccnlab.net) Artificial Intelligence Department Donders Centre for Cognition Donders Institute for

More information

From Pixels to Brain Networks: Modeling Brain Connectivity and Its Changes in Disease. Polina Golland

From Pixels to Brain Networks: Modeling Brain Connectivity and Its Changes in Disease. Polina Golland From Pixels to Brain Networks: Modeling Brain Connectivity and Its Changes in Disease Polina Golland MIT Computer Science and Artificial Intelligence Laboratory Joint work with Archana Venkataraman C.-F.

More information