Fast algorithms for dimensionality reduction and data visualization


1 Fast algorithms for dimensionality reduction and data visualization
Manas Rachh, Yale University

2 Acknowledgements
George Linderman (Yale), Jeremy Hoskins (Yale), Stefan Steinerberger (Yale), Yuval Kluger (Yale), Vladimir Rokhlin (Yale), Mark Tygert (Facebook)

3 Introduction
Applications: single-cell RNA-sequencing (scRNA-seq), latent representations in deep learning, astronomy... and much more.
- t-SNE implementations scale poorly to large datasets (e.g. 8 hours for a dataset of 1 million points in 500-dimensional space).
- FFT-accelerated Interpolation-based t-SNE (FIt-SNE): faster t-SNE (30 min for the same dataset).
- Out-of-Core PCA (oocPCA): for datasets that don't fit in memory.

4 Applications: scRNA-seq
Bulk RNA-seq averages expression across all cells; single-cell RNA-seq measures expression in individual cells. Results are tabulated as an expression matrix: columns are genes (~30,000), rows are cells (~10^3 to 10^6).
[Figure: number of cells per study vs. year (Islam et al., Tang et al., Jaitin et al., Macosko et al., Dixit et al., 10X); the number of cells is growing rapidly.]

5 Applications: scRNA-seq
For example, t-SNE of 1.3 million brain cells (10X Genomics, 2016).

6 t-SNE Optimization
Input: d-dimensional dataset $X = \{x_1, x_2, \ldots, x_N\} \subset \mathbb{R}^d$.
Output: s-dimensional embedding $Y = \{y_1, y_2, \ldots, y_N\} \subset \mathbb{R}^s$, $s \ll d$.
Goal: $x_i$ and $x_j$ close in the input space $\implies$ $y_i$ and $y_j$ are also close.
Affinities between points $x_i$ and $x_j$ in the input space (Gaussian):
$$p_{j|i} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\|x_i - x_k\|^2 / 2\sigma_i^2\right)}, \qquad p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}.$$
Affinities between points $y_i$ and $y_j$ (Cauchy kernel):
$$q_{ij} = \frac{(1 + \|y_i - y_j\|^2)^{-1}}{\sum_{k \neq l} (1 + \|y_k - y_l\|^2)^{-1}}.$$
Minimize the Kullback-Leibler divergence
$$C(Y) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}.$$
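For concreteness, here is a direct O(N^2) NumPy sketch of these quantities (our own illustration, not the FIt-SNE code; the bandwidths sigma_i are taken as given, though in practice they are chosen per point to hit a target perplexity):

```python
import numpy as np

def tsne_kl(X, Y, sigma):
    """Direct O(N^2) evaluation of the t-SNE objective C(Y) = KL(P || Q)."""
    N = X.shape[0]
    D = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    P = np.exp(-D / (2 * sigma[:, None] ** 2))
    np.fill_diagonal(P, 0.0)
    P /= P.sum(axis=1, keepdims=True)      # conditional affinities p_{j|i}
    P = (P + P.T) / (2 * N)                # symmetrized p_{ij}
    Q = 1.0 / (1.0 + np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1))
    np.fill_diagonal(Q, 0.0)
    Q /= Q.sum()                           # Cauchy affinities q_{ij}
    mask = P > 0
    return np.sum(P[mask] * np.log(P[mask] / Q[mask]))
```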

7 Gradient Descent
Minimize $C(Y)$ via gradient descent; $Z$ is a global normalization constant:
$$\frac{\partial C}{\partial y_i} = 4Z \sum_{j \neq i} (p_{ij} - q_{ij})\, q_{ij}\, (y_i - y_j), \qquad Z = \sum_{j=1}^{N} \sum_{\substack{l=1 \\ l \neq j}}^{N} \frac{1}{1 + \|y_l - y_j\|^2}.$$
Split into two parts:
$$\frac{\partial C}{\partial y_i} = \underbrace{4Z \sum_{j \neq i} p_{ij}\, q_{ij}\, (y_i - y_j)}_{F_{\mathrm{attr},i}} - \underbrace{4Z \sum_{j \neq i} q_{ij}^2\, (y_i - y_j)}_{F_{\mathrm{rep},i}}.$$
Direct calculation: $O(N^2)$.
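The same split in code, again as a direct O(N^2) reference sketch of ours (it assumes the symmetrized affinities P from the previous slide):

```python
import numpy as np

def tsne_gradient(P, Y):
    """Direct O(N^2) gradient: F_attr - F_rep, as in the split above."""
    D2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    W = 1.0 / (1.0 + D2)                   # unnormalized Cauchy kernel q_ij * Z
    np.fill_diagonal(W, 0.0)
    Z = W.sum()
    Q = W / Z
    diff = Y[:, None, :] - Y[None, :, :]   # diff[i, j] = y_i - y_j
    F_attr = 4 * Z * np.sum((P * Q)[:, :, None] * diff, axis=1)
    F_rep = 4 * Z * np.sum((Q ** 2)[:, :, None] * diff, axis=1)
    return F_attr - F_rep
```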

8 Repulsion term - F_rep
$$F_{\mathrm{rep},k}(m) = \left. \sum_{\substack{l=1 \\ l \neq k}}^{N} \frac{y_l(m) - y_k(m)}{(1 + \|y_l - y_k\|^2)^2} \;\middle/\; \sum_{j=1}^{N} \sum_{\substack{l=1 \\ l \neq j}}^{N} \frac{1}{1 + \|y_l - y_j\|^2} \right. ,$$
i.e. combinations of
$$\sum_{j=1}^{N} K(y_i, y_j)\,\sigma_j, \qquad \text{where } K(y,z) = \frac{1}{1 + \|y - z\|^2} \text{ or } K(y,z) = \frac{1}{(1 + \|y - z\|^2)^2}.$$
Existing methods: tree codes / fast multipole methods (FMMs); L. Greengard, V. Rokhlin (1987).

9-12 FMM illustration
[Figure: FMM illustration, built up over four animation frames.]

13 FMM matrices
$$F_i = \sum_{j=1}^{N} K(y_i, z_j)\,\sigma_j$$
[Figure: the kernel matrix partitioned into blocks; off-diagonal blocks are low rank.]
$K(y,z)$ is singular when $y = z$, so the self-interaction blocks are full rank. Tree refinement strategy: $O(1)$ particles per leaf box.

14 Self-interaction - smooth kernels
t-SNE kernels are smooth even for $y = z$: even the self-interaction can be compressed!
[Figure: singular values $\sigma(k)$ of the kernel $1/(1 + \|y - z\|^2)$ sampled on the unit square, showing rapid decay.]

15 Polynomial interpolation based fast algorithms
Let $K_p(y,z)$ be the polynomial interpolant of $K(y,z)$ of order $p$, with interpolation nodes $\tilde y_l, \tilde z_m$ and Lagrange polynomials $L$; then
$$K_p(y,z) = \sum_{l=1}^{p^2} \sum_{m=1}^{p^2} K(\tilde y_l, \tilde z_m)\, L_{l,\tilde y}(y)\, L_{m,\tilde z}(z).$$
Replace
$$\phi_l = \sum_{j=1}^{N} K(y_l, z_j)\,\sigma_j \quad \text{with} \quad \tilde\phi_l = \sum_{j=1}^{N} K_p(y_l, z_j)\,\sigma_j.$$
Relative error:
$$\frac{|\phi_l - \tilde\phi_l|}{|\phi_l|} \lesssim \sup_{y,z} |K(y,z) - K_p(y,z)|.$$
For fixed tolerance $\varepsilon$, $p$ depends on the smoothness of $K$ and is independent of $N$.
Greengard, Rokhlin, Gimbutas, Ying, Darve, Zorin, Biros, Barnett, Ho, Gillman, Martinsson, ...

16
$$\tilde\phi_l = \sum_{j=1}^{N} K_p(y_l, z_j)\,\sigma_j = \sum_{m=1}^{p^2} \sum_{n=1}^{p^2} \sum_{j=1}^{N} K(\tilde y_m, \tilde z_n)\, L_{m,\tilde y}(y_l)\, L_{n,\tilde z}(z_j)\,\sigma_j = \sum_{m=1}^{p^2} L_{m,\tilde y}(y_l) \left( \sum_{n=1}^{p^2} K(\tilde y_m, \tilde z_n) \sum_{j=1}^{N} L_{n,\tilde z}(z_j)\,\sigma_j \right)$$
Step 1: $w_n = \sum_{j=1}^{N} L_{n,\tilde z}(z_j)\,\sigma_j$. Work: $O(N p^2)$.
Step 2: $v_m = \sum_{n=1}^{p^2} K(\tilde y_m, \tilde z_n)\, w_n$. Work: $O(p^4)$.
Step 3: $\tilde\phi_l = \sum_{m=1}^{p^2} L_{m,\tilde y}(y_l)\, v_m$. Work: $O(M p^2)$.
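Spelling out the three steps as code: below is a minimal 1-D analogue (our own sketch, not the FIt-SNE implementation) with p equispaced midpoint nodes on [0, 1]; in 1-D the node count is p rather than p^2, and the names lagrange_basis and interp_sum are ours.

```python
import numpy as np

def lagrange_basis(nodes, x):
    """L[n, i] = n-th Lagrange polynomial of the nodes, evaluated at x[i]."""
    L = np.ones((len(nodes), len(x)))
    for n, xn in enumerate(nodes):
        for m, xm in enumerate(nodes):
            if m != n:
                L[n] *= (x - xm) / (xn - xm)
    return L

def interp_sum(y, z, sigma, p=8):
    """phi[l] ~= sum_j K(y[l], z[j]) sigma[j] for a smooth kernel K."""
    K = lambda a, b: 1.0 / (1.0 + (a[:, None] - b[None, :]) ** 2) ** 2
    nodes = (np.arange(p) + 0.5) / p            # p equispaced midpoints
    w = lagrange_basis(nodes, z) @ sigma        # Step 1: spread,      O(N p)
    v = K(nodes, nodes) @ w                     # Step 2: node-node,   O(p^2)
    return lagrange_basis(nodes, y).T @ v       # Step 3: interpolate, O(M p)
```

For smooth K the result matches the direct O(NM) sum `K(y, z) @ sigma` to the interpolation accuracy, which is how a sketch like this can be sanity-checked.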

17 Algorithm illustration
[Figure: source points $z_j$ are spread to nodes $\tilde z$ (Step 1), node-to-node interactions $\tilde z \to \tilde y$ are computed (Step 2), and results are interpolated back to targets $y_i$ (Step 3).]

18 FFT accelerated interpolation based t-SNE (FIt-SNE)
- Subdivide the domain into $N_{int} \times N_{int}$ boxes.
- Given $\varepsilon$, determine $p$; use equispaced interpolation nodes.
- In each box $B_l$, compute effective charges at the interpolation nodes:
$$w_{n,l} = \sum_{y_j \in B_l} L_{n,\tilde y_l}(y_j)\,\sigma_j. \quad \text{Work: } O(N p^2).$$
- Interactions between equispaced nodes are computed via FFT (see the sketch after this slide):
$$v_{m,n} = \sum_{l=1}^{N_{int}^2} \sum_{j=1}^{p^2} K(\tilde y_{m,n}, \tilde y_{l,j})\, w_{l,j}. \quad \text{Work: } O\big((N_{int}\, p)^2 \log(N_{int}\, p)\big).$$
- Interpolate: for $y_i \in B_l$,
$$\phi_i = \sum_{m=1}^{p^2} L_{m,\tilde y_l}(y_i)\, v_{m,l}. \quad \text{Work: } O(N p^2).$$
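Because the nodes are equispaced, the node-to-node matrix depends only on coordinate differences (it is Toeplitz up to blocking), so it can be applied by circulant embedding and the FFT. Below is a minimal NumPy sketch of that step for a single n x n grid with spacing h; the function names are ours, and the actual FIt-SNE code additionally handles the per-box charge layout and several kernels at once.

```python
import numpy as np

def cauchy_sq_kernel(d2):
    """Squared-Cauchy kernel (1 + ||y - z||^2)^(-2) from the repulsion term."""
    return 1.0 / (1.0 + d2) ** 2

def grid_convolve_fft(w, h):
    """v[m] = sum_l K(x_m - x_l) w[l] on an n x n equispaced grid with
    spacing h, in O(n^2 log n) via embedding the Toeplitz operator in a
    (2n x 2n) circulant."""
    n = w.shape[0]
    # Signed offsets 0, h, ..., (n-1)h, *, -(n-1)h, ..., -h in FFT order
    dx = np.fft.fftfreq(2 * n, d=1.0 / (2 * n)) * h
    d2 = dx[:, None] ** 2 + dx[None, :] ** 2
    c = cauchy_sq_kernel(d2)               # kernel on the embedded torus
    wpad = np.zeros((2 * n, 2 * n))
    wpad[:n, :n] = w                       # zero-pad the charges
    v = np.fft.ifft2(np.fft.fft2(c) * np.fft.fft2(wpad)).real
    return v[:n, :n]                       # discard the padding region
```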

19 Choosing N_int and p
Large $p$ with equispaced nodes is unstable: the t-SNE kernels are archetypical examples of the Runge phenomenon. Boxes of side $L \approx 1.4$ with $p < 10$ work. For fixed accuracy, the product $N_{int}\, p$ is constant $\implies$ computational complexity $O(N p^2)$.

20 Runge phenomenon and equispaced interpolation
[Figure: interpolation error vs. $p$ for direct equispaced interpolation and for SVD-based compression, for box sizes L = 0.5, L = 1.5, ...]

21 Error estimates
1-D interpolation nodes:
$$\tilde x_j = -\frac{L}{2} + \left(j - \frac{1}{2}\right)\frac{L}{p}, \qquad j = 1, 2, \ldots, p,$$
with $f(x) = 1/(1+x^2)$ or $f(x) = 1/(1+x^2)^2$.
Interpolation error:
$$\left| f(x) - \sum_{j=1}^{p} L_{j,\{\tilde x_j\}}(x)\, f(\tilde x_j) \right| \le \frac{|f^{(p)}(\zeta)|}{p!} \underbrace{\prod_{j=1}^{p} |x - \tilde x_j|}_{\pi_p(x)}.$$
Estimates:
$$\|f^{(p)}\|_\infty \le \frac{(p+2)\, p!}{2}, \qquad \sup_x |\pi_p(x)| \le \frac{(2p)!}{2^{2p}\, p!} \left(\frac{L}{p}\right)^p.$$
Error in 1-D:
$$\left| f(x) - \sum_{j=1}^{p} L_{j,\{\tilde x_j\}}(x)\, f(\tilde x_j) \right| \le \frac{p+2}{2} \left(\frac{L}{2e}\right)^p e^{\frac{1}{24p}}.$$

22 Error estimates - II
In $d$ dimensions, $f(x) = 1/(1+\|x\|^2)$ or $f(x) = 1/(1+\|x\|^2)^2$. In $d$-dimensional interpolation, the estimates follow from the error estimates along lines. Interpolation error:
$$\left| f(x) - \sum_{j} L_{j,\{\tilde x_j\}}(x)\, f(\tilde x_j) \right| \le \frac{p+2}{2} \left(\frac{\sqrt{2d}\, L}{2e}\right)^p e^{\frac{1}{24p}}.$$
Not sharp for $d > 1$.

23 Algorithm Illustration - Step 1
$$\sum_{m=1}^{p^2} L_{m,\tilde y}(y_l) \left( \sum_{n=1}^{p^2} K(\tilde y_m, \tilde z_n) \underbrace{\left( \sum_{j=1}^{N} L_{n,\tilde z}(z_j)\,\sigma_j \right)}_{w_n} \right)$$
Spread.

24 Algorithm Illustration - Step 2
$$\sum_{m=1}^{p^2} L_{m,\tilde y}(y_l) \underbrace{\left( \sum_{n=1}^{p^2} K(\tilde y_m, \tilde z_n)\, w_n \right)}_{v_m}$$
FFT.

25 Algorithm Illustration - Step 3
$$\sum_{m=1}^{p^2} L_{m,\tilde y}(y_l)\, v_m$$
Interpolate.

26 Matrix decomposition
The matrix $K$ is block separable, and all submatrices are low rank: $K_{i,j} = U_i S_{i,j} U_j^T$, i.e.
$$K = U S U^T, \qquad U = \begin{pmatrix} U_1 & & \\ & \ddots & \\ & & U_{N_{int}^2} \end{pmatrix}, \qquad S = \begin{pmatrix} S_{1,1} & S_{1,2} & \cdots & S_{1,N_{int}^2} \\ S_{2,1} & S_{2,2} & \cdots & S_{2,N_{int}^2} \\ \vdots & & & \vdots \\ S_{N_{int}^2,1} & S_{N_{int}^2,2} & \cdots & S_{N_{int}^2,N_{int}^2} \end{pmatrix},$$
where $U_i$ is an $n_i \times p^2$ matrix and $S$ is (almost) Toeplitz.

27 Attractive forces - F_attr
$p_{ij} \propto \exp(-\|x_i - x_j\|^2/\sigma)$, so computing $p_{ij}$ is a local calculation. Attractive forces:
$$F_{\mathrm{attr},i} = \sum_{j \neq i} p_{ij}\, q_{ij}\, Z\, (y_i - y_j) \approx \sum_{j \in \mathrm{kNN}(i)} p_{ij}\, q_{ij}\, Z\, (y_i - y_j).$$
This is a one-time computation: the truncated $p_{ij}$ don't need to be recomputed at every iteration of gradient descent.
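A sketch of this sparse evaluation (our own, with names of our choosing), assuming the kNN-restricted affinities have already been assembled once into a SciPy sparse matrix P before gradient descent begins:

```python
import numpy as np
from scipy.sparse import coo_matrix

def attractive_forces(P, Y):
    """Returns sum_{j in kNN(i)} p_ij (q_ij Z) (y_i - y_j) for each i;
    multiply by 4 to get the attractive part of the gradient."""
    P = coo_matrix(P)                              # nonzeros only at kNN pairs
    diff = Y[P.row] - Y[P.col]                     # y_i - y_j per stored pair
    qz = 1.0 / (1.0 + np.sum(diff ** 2, axis=1))   # q_ij * Z (Cauchy kernel)
    contrib = (P.data * qz)[:, None] * diff
    F = np.zeros_like(Y)
    np.add.at(F, P.row, contrib)                   # accumulate over neighbors
    return F
```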

28 Nearest Neighbors
bhtsne: exact nearest neighbors, using vantage-point trees; slows down in high dimensions.
FIt-SNE: approximate nearest neighbors, using ANNOY (random projections).
Smoothing effect from using near (rather than exact nearest) neighbors? G. Linderman and S. Steinerberger (2017), arXiv.

29 oocPCA for Big Data
What if the dataset is extremely large? Computers without enough memory to load the data cannot visualize it; e.g. 1 million cells with 30,000 genes requires 240 GB!
Out-of-core implementation of randomized PCA: compute the top few (~50) principal components of a dataset without loading it entirely, so mundane computers can visualize/analyze the largest datasets.
[Table: oocPCA runtime (min) for computing the top 50 principal components under varying memory limits (GB).]
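A compact sketch of the two-pass randomized idea (our own simplification: no mean-centering or power iterations, which a production oocPCA would include; chunk_iter is a hypothetical callable that yields row blocks of X from disk):

```python
import numpy as np

def oocpca_sketch(chunk_iter, d, k=50, p=10, seed=0):
    """Approximate top-k SVD of an n x d matrix X streamed as row blocks.
    chunk_iter() must yield (row_offset, X_block) pairs; X is never held
    in memory all at once."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((d, k + p))        # random test matrix
    # Pass 1: Y = X @ omega, accumulated block by block
    Y = np.vstack([X_c @ omega for _, X_c in chunk_iter()])
    Q, _ = np.linalg.qr(Y)                         # orthonormal range basis
    # Pass 2: B = Q.T @ X, accumulated block by block ((k+p) x d)
    B = np.zeros((k + p, d))
    for off, X_c in chunk_iter():
        B += Q[off:off + X_c.shape[0]].T @ X_c
    U_b, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_b)[:, :k], s[:k], Vt[:k]         # approximate top-k factors
```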

30 MNIST data
$10^6$ digit images from the Infinite MNIST data set. Late exaggeration is used to separate clusters more effectively. G. Linderman and S. Steinerberger (2017).

31 Retinal cells and t-SNE heatmaps
[Figure: 1-D t-SNE heatmaps (left) vs. 2-D t-SNE (right) for retinal cells, with per-cluster expression of VSX1, OPN1MW, and PECAM1.] Data from Macosko et al. (2016).

32 Numerical results - FIt-SNE
[Figure: runtime (hours) vs. number of points (10k to 1M) for 1-dimensional and 2-dimensional embeddings, comparing Barnes-Hut (BH) and FFT-accelerated interpolation (FI).]

33 Numerical results - Fast nearest neighbors
[Figure: runtime (minutes) vs. number of points (10k to 1M), across several input dimensions, comparing exact nearest neighbors (vptree) with approximate nearest neighbors.]

34 Summary
- We developed fast algorithms for data visualization and dimensionality reduction using t-SNE, roughly 15 times faster than the state of the art.
- We presented interpolation-based fast algorithms for N-body interactions with smooth kernels.
- Late exaggeration for better separation of clusters.
- Out-of-core PCA for visualizing extremely large datasets on laptops.
Github:

35 Future work
- Better convergence estimates and theoretical framework
- Different affinities for input and target spaces
- Fast multipole style multi-level schemes

36 Questions?

37 Questions? Thank you
