Topological Data Analysis for Brain Networks


Topological Data Analysis for Brain Networks: Relating Functional Brain Network Topology to Clinical Measures of Behavior. Bei Wang Phillips (1,2), University of Utah. Joint work with Eleanor Wong (1,2), Sourabh Palande (1,2), Brandon Zielinski (3), Jeffrey Anderson (4), P. Thomas Fletcher (1,2). (1) Scientific Computing and Imaging (SCI) Institute, (2) School of Computing, (3) Pediatrics and Neurology, (4) Radiology. October 2, 2016

Big Picture Goal: Quantify the relationship between brain functional networks and behavioral measures. Our Contribution: Use topological features based on persistent homology. Result: Combining correlations with topological features gives better prediction of autism severity than using correlations alone.

Motivation. About Autism Spectrum Disorders (ASD): no cure, causes unknown. Diagnosis: no systematic method; ADOS (Autism Diagnostic Observation Schedule) is the standard instrument. Goal: correlate the functional brain network with ADOS scores, toward early diagnosis and treatment tracking.

What is a Brain Network? It represents brain regions and their pairwise associations. Computation of correlation matrices: resting-state functional MRI (R-fMRI), preprocessing, definition of regions of interest (ROIs), estimation of time series signals, and computation of pairwise associations (Pearson correlation).
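As an illustrative sketch of the last step of this pipeline (not the authors' code), the correlation matrix can be computed with NumPy from a hypothetical preprocessed ROI time-series array ts:

import numpy as np

# ts: hypothetical R-fMRI time series after preprocessing,
# shape (time points, number of ROIs), one column per region.
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 264))      # placeholder data

corr = np.corrcoef(ts, rowvar=False)      # (264, 264) Pearson correlations

# The distinct pairwise associations are the strict upper triangle.
iu = np.triu_indices_from(corr, k=1)
features = corr[iu]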

Why Topology? How should we use this data? Graphs and graph-theoretic measures (e.g., small-worldness) require binary associations (thresholding). Correlations as features: high dimensionality, not enough samples. Dimensionality reduction (PCA, random projections) may lose structures in higher dimensions.

Why Topology? Projection may lose structures in higher dimensions. Topology captures structure in higher dimensions and across all continuous thresholds.

Persistent Homology. What are topological features? Homological features: Dim 0 - connected components; Dim 1 - tunnels/loops; Dim 2 - voids. How to compute them (in a nutshell): begin with a point cloud, grow balls of diameter t around each point, and track features of the union of balls as t increases.
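A minimal sketch of this computation, assuming the ripser Python package (any Vietoris-Rips implementation, e.g. GUDHI, would serve equally well):

import numpy as np
from ripser import ripser

# Hypothetical point cloud; a precomputed distance matrix also works
# (pass distance_matrix=True).
rng = np.random.default_rng(0)
points = rng.standard_normal((50, 3))

# Vietoris-Rips filtration: track the homology of the union of balls
# around the points as the scale parameter t grows.
dgms = ripser(points, maxdim=1)["dgms"]
dgm0, dgm1 = dgms[0], dgms[1]   # Dim 0 (components) and Dim 1 (loops)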

Persistent Homology [sequence of figures: the union of balls at increasing values of t, with features appearing and disappearing]

Persistence Diagrams. Persistent homological features are encoded as barcodes or persistence diagrams. Figure: Barcode. Figure: Persistence Diagram

Interpretation of Connected Components: Dim 0 features correspond to hierarchical clustering.

Computing Topological Features for Brain Networks

Partial Least Squares (PLS) Regression. A dimensionality reduction technique that finds two sets of latent dimensions from datasets X and Y such that their projections onto the latent dimensions are maximally co-varying. X - features from brain imaging: correlations, topological features (zero mean). Y - clinical measure of behavior: ADOS scores (zero mean). PLS models the relations between X and Y by means of score vectors.

PLS Regression. n - number of data points; X - predictor/regressor (n × N); Y - response (n × M). PLS decomposes X and Y such that: X = T Pᵀ + E, Y = U Qᵀ + F, where T, U (n × p) are the latent variables/score vectors (factor matrices), P (N × p) and Q (M × p) are orthogonal loading matrices, and E (n × N), F (n × M) are residuals/errors. T and U are chosen such that the projections of X and Y, that is, T and U, are maximally co-varying.
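For illustration (not the authors' code), scikit-learn's PLSRegression produces exactly these score and loading matrices; the shapes below are hypothetical stand-ins:

import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((87, 100))   # hypothetical predictors (n x N)
Y = rng.standard_normal((87, 1))     # hypothetical responses (n x M)

pls = PLSRegression(n_components=2)  # p = 2 latent dimensions
pls.fit(X, Y)
T, U = pls.x_scores_, pls.y_scores_        # score vectors (n x p)
P, Q = pls.x_loadings_, pls.y_loadings_    # loading matrices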

PLS Regression: the Algorithm. Iterative NIPALS algorithm (Nonlinear Iterative Partial Least Squares; [Wold 1975]): find the first latent dimension, i.e., find vectors w, c such that t = Xw and u = Yc have maximal covariance; then deflate the previous latent dimension from X and Y, and repeat.
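A bare-bones NumPy sketch of one NIPALS pass (illustrative only; X and Y are assumed column-centered, and the variable names follow the slide):

import numpy as np

def nipals_step(X, Y, n_iter=500, tol=1e-10):
    # Find the first latent dimension: vectors w, c such that
    # t = Xw and u = Yc have maximal covariance.
    u = Y[:, [0]]                    # initialize from a response column
    for _ in range(n_iter):
        w = X.T @ u
        w /= np.linalg.norm(w)
        t = X @ w
        c = Y.T @ t
        c /= np.linalg.norm(c)
        u_new = Y @ c
        if np.linalg.norm(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    return t, u, w, c

# Deflation: remove this dimension from X (rank-one downdate), then repeat:
# p = X.T @ t / (t.T @ t); X = X - t @ p.T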

Kernel PLS. Kernel form of the NIPALS algorithm (kPLS): 1. Initialize a random vector u. 2. Repeat until convergence: (a) t = Ku / ‖Ku‖; (b) c = Yᵀt; (c) u = Yc / ‖Yc‖. 3. Deflate: K = (I − ttᵀ) K (I − ttᵀ). 4. Repeat to compute subsequent latent dimensions.
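The steps above translate almost line for line into NumPy. This is a hedged sketch: the Y-deflation step follows the kernel PLS of Rosipal and Trejo, which the slide leaves implicit.

import numpy as np

def kernel_pls_scores(K, Y, n_components, n_iter=500, tol=1e-10):
    # K (n x n): centered kernel matrix; Y (n x M): centered responses.
    n = K.shape[0]
    I = np.eye(n)
    T = np.zeros((n, n_components))
    rng = np.random.default_rng(0)
    for a in range(n_components):
        u = rng.standard_normal((n, 1))        # 1. random initial u
        for _ in range(n_iter):                 # 2. until convergence
            t = K @ u
            t /= np.linalg.norm(t)              # (a) t = Ku / ||Ku||
            c = Y.T @ t                         # (b) c = Y^T t
            u_new = Y @ c
            u_new /= np.linalg.norm(u_new)      # (c) u = Yc / ||Yc||
            if np.linalg.norm(u_new - u) < tol:
                u = u_new
                break
            u = u_new
        T[:, [a]] = t
        K = (I - t @ t.T) @ K @ (I - t @ t.T)   # 3. deflate K
        Y = Y - t @ (t.T @ Y)                   # deflate Y likewise
    return T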

Data. 87 subjects: 30 control, 57 ASD. ADOS scores: 0 to 21. 264 ROIs (Power regions), giving a 264 × 264 correlation matrix and 34,716 distinct pairwise correlations per subject.

Experiments. Given: correlation matrices. Map to a metric space: d(x, y) = 1 − Cor(x, y). Compute persistence diagrams. Define an inner product of persistence diagrams (i.e., a kernel) [Reininghaus, Huber, Bauer, Kwitt 2015]: given two persistence diagrams F and G,

k_σ(F, G) = (1 / (8πσ)) Σ_{p ∈ F} Σ_{q ∈ G} ( e^{−‖p − q‖² / (8σ)} − e^{−‖p − q̄‖² / (8σ)} ),

where for every q = (x, y) ∈ G, q̄ = (y, x).
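A direct transcription of this kernel into NumPy (a sketch; diagrams are assumed to be finite (birth, death) arrays, with points at infinity removed or truncated):

import numpy as np

def pss_kernel(F, G, sigma):
    # Persistence scale-space kernel of Reininghaus et al. 2015.
    # F, G: persistence diagrams as (k, 2) arrays of (birth, death) points.
    total = 0.0
    for p in F:
        for q in G:
            q_bar = q[::-1]   # mirror q = (x, y) across the diagonal to (y, x)
            total += np.exp(-np.sum((p - q) ** 2) / (8 * sigma)) \
                   - np.exp(-np.sum((p - q_bar) ** 2) / (8 * sigma))
    return total / (8 * np.pi * sigma)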

Experiments. Performed experiments with 3 kernels: 1. K_Cor - Euclidean dot product of vectorized correlations. 2. K_TDA = w₀ K_TDA0 + (1 − w₀) K_TDA1, where K_TDA0 uses only Dim 0 features and K_TDA1 uses only Dim 1 features. 3. K_TDA+Cor = w₀ K_TDA0 + w₁ K_TDA1 + (1 − w₀ − w₁) K_Cor. Baseline predictor: mean ADOS score.
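Since a nonnegative-weighted sum of positive-definite kernels is again a valid kernel (given w₀ + w₁ ≤ 1 here), the combined kernel is just a convex combination of the precomputed Gram matrices; a trivial sketch:

def combined_kernel(K_tda0, K_tda1, K_cor, w0, w1):
    # K_TDA+Cor = w0*K_TDA0 + w1*K_TDA1 + (1 - w0 - w1)*K_Cor
    assert 0.0 <= w0 and 0.0 <= w1 and w0 + w1 <= 1.0
    return w0 * K_tda0 + w1 * K_tda1 + (1.0 - w0 - w1) * K_cor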

Experiments. Leave-one-out cross validation over the parameters: σ₀, σ₁ with log₁₀(σ) from −8.0 to 6.0 in steps of 0.2; w₀, w₁ from 0.0 to 1.0 in steps of 0.05. Selected k_TDA parameters: σ₀ = 6.6, σ₁ = 1.8, w₁ = 0.95. Selected k_TDA+Cor parameters: σ₀ = 7.8, σ₁ = 2.8, w₀ = 0.1, w₁ = 0.4. Compute RMSE; permutation test for significance.
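A sketch of the leave-one-out loop over a precomputed kernel matrix; kernel ridge regression stands in for kernel PLS here, purely to keep the example self-contained:

import numpy as np
from sklearn.kernel_ridge import KernelRidge

def loo_rmse(K, y, alpha=1.0):
    # K (n x n): precomputed kernel over all subjects; y: ADOS scores.
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        tr = np.delete(np.arange(n), i)
        model = KernelRidge(alpha=alpha, kernel="precomputed")
        model.fit(K[np.ix_(tr, tr)], y[tr])
        pred = model.predict(K[np.ix_([i], tr)])   # kernel row vs. training set
        errs[i] = pred[0] - y[i]
    return np.sqrt(np.mean(errs ** 2))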

Results. Highlights: baseline RMSE: 6.4302. K_TDA+Cor: RMSE 6.0156, the only method statistically significant over the baseline (permutation test p-value: 0.048).

Conclusion. Augmenting correlations with topological features gives a better prediction of autism severity than using correlations alone. Topological features derived from R-fMRI have the potential to explain the connection between functional brain networks and autism severity.

Future Work. Many things to try: alternatives to correlation, different distance metrics, different kernels, multi-site data.

Publication. Kernel Partial Least Squares Regression for Relating Functional Brain Network Topology to Clinical Measures of Behavior. Authors: Eleanor Wong, Sourabh Palande, Bei Wang, Brandon Zielinski, Jeffrey Anderson and P. Thomas Fletcher. IEEE International Symposium on Biomedical Imaging (ISBI), 2016.

Acknowledgements. This work was partially supported by NSF grants IIS-1513616 and IIS-1251049. Attending ACM-BCB is partially supported by NIH-1R01EB022876-01.

Thank you! Bei Wang Phillips beiwang@sci.utah.edu