Breaking Computational Barriers: Multi-GPU High-Order RBF Kernel Problems with Millions of Points

1 Breaking Computational Barriers: Multi-GPU High-Order RBF Kernel Problems with Millions of Points
Michael Griebel, Christian Rieger, Peter Zaspel
Institute for Numerical Simulation, Rheinische Friedrich-Wilhelms-Universität Bonn
GPU Technology Conference 2014, March 24-27, 2014, San José, CA, USA

2 Outline
1 Motivation
2 Radial basis function interpolation
3 Preconditioning for large-scale kernel interpolation problems
4 Summary

3 Motivation
Meshfree interpolation
- interpolation: reconstruction of a continuous function from point evaluations
- meshfree: evaluation points at arbitrary locations, no mesh
Fields of application
- classical applications: computer graphics, signal processing, computer vision, ...
- large-scale data analysis: Big Data, data mining, machine learning, ...
- solving PDEs by collocation methods: computational fluid dynamics, stochastic collocation, ...

4 My personal motivation: Uncertainty quantification in CFD (1)
Current standard in computational fluid dynamics
- simulations for fixed and known input parameters
Problem
- natural phenomena: input data not known exactly
- engineering: constructions / measurements always subject to perturbations

5 My personal motivation: Uncertainty quantification in CFD (2)
Algorithmic idea
1 sampling of stochastic input parameters according to some distribution
2 computation of hundreds or thousands of stochastic realizations (high-resolution simulations)
3 extraction of averaged data (expectation value), variance information, ... as post-processing step
[Figure: workflow diagram. Sampling of stochastic space, config. file generator, several parallel CFD solver runs, stochastics tool with outputs E[u], Var[u], Cov[u], K.-L.]
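The three steps of the algorithmic idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_solver` is a hypothetical stand-in for a single high-resolution CFD run, the uniform input distribution is made up, and plain Monte Carlo sampling is used here although the slides do not fix a particular sampling scheme.

```python
import random
import statistics


def toy_solver(viscosity):
    """Hypothetical stand-in for one CFD run: maps one input parameter to one scalar output."""
    return 1.0 / (1.0 + viscosity)


def monte_carlo_statistics(solver, sample_input, num_samples, seed=0):
    """Sample inputs, run the solver once per sample, then post-process the outputs."""
    rng = random.Random(seed)
    # Steps 1 and 2: draw stochastic inputs and compute one realization per draw.
    outputs = [solver(sample_input(rng)) for _ in range(num_samples)]
    # Step 3: post-processing, here the sample mean and the unbiased sample variance.
    mean = statistics.fmean(outputs)
    variance = statistics.variance(outputs, mean)
    return mean, variance


def uniform_viscosity(rng):
    """Assumed input distribution: viscosity uniform in [0.5, 1.5]."""
    return rng.uniform(0.5, 1.5)


mean_u, var_u = monte_carlo_statistics(toy_solver, uniform_viscosity, 1000)
```

In a real setting each call to the solver is an expensive simulation, which is why the realizations are computed independently and in parallel.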


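7 Basic facts
Interpolation problem
- given: function $f : \Omega \to \mathbb{R}$, sampling points $X := \{y_1, \ldots, y_N\} \subset \Omega$, $\Omega \subset \mathbb{R}^d$
- target: function $s_{f,X} : \Omega \to \mathbb{R}$ such that $s_{f,X}(y_j) = f(y_j)$ for all $j = 1, \ldots, N$
Radial basis functions
- Gaussian: $\varphi_j(y) := e^{-\epsilon^2 \|y - y_j\|^2}$
- Matérn function: $\varphi_j(y) := \dfrac{K_{\beta - d/2}(\|y - y_j\|)\, \|y - y_j\|^{\beta - d/2}}{2^{\beta - 1}\, \Gamma(\beta)}$, $\beta > d/2$
Kernel functions
- functions $k$ of type $k : \Omega \times \Omega \to \mathbb{R}$
- radial basis function case: $k(y, y_j) := \psi(\|y - y_j\|)$, e.g. $\psi(r) = e^{-\epsilon^2 r^2}$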

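8 Interpolation with kernel functions (1)
Kernel-based interpolation problem
- $\mathcal{F}$ Hilbert function space, $f : \Omega \to \mathbb{R}$
- points with function evaluations: $X := \{y_1, \ldots, y_N\} \subset \Omega$, $f_j := f(y_j)$, $j = 1, \ldots, N$
- for kernel function $k : \Omega \times \Omega \to \mathbb{R}$, looking for $s_{f,X} \in \mathcal{F}$ with
$$s_{f,X}(y) := \sum_{j=1}^{N} \alpha_j\, k(y, y_j), \qquad y \in \Omega$$
Solution of interpolation problem
$$s_{f,X}(y_j) = f_j, \quad 1 \le j \le N \qquad \Longleftrightarrow \qquad A_{k,X}\, \alpha = f$$
$$A_{k,X} = \begin{pmatrix} k(y_1, y_1) & \cdots & k(y_1, y_N) \\ \vdots & \ddots & \vdots \\ k(y_N, y_1) & \cdots & k(y_N, y_N) \end{pmatrix}, \qquad f = \begin{pmatrix} f(y_1) \\ \vdots \\ f(y_N) \end{pmatrix}$$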
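The interpolation problem on this slide can be written out directly for a handful of points. The following is a minimal sketch, assuming a Gaussian kernel with shape parameter `eps = 2.0`, five made-up 1D points and a textbook Gaussian elimination (`solve_dense`) in place of a tuned library factorization; none of these choices come from the slides.

```python
import math


def gaussian_kernel(y, yj, eps=2.0):
    """Gaussian RBF kernel exp(-eps^2 * ||y - yj||^2) for points given as tuples."""
    return math.exp(-(eps ** 2) * math.dist(y, yj) ** 2)


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting, O(N^3) for an N x N system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_interpolant(points, values, kernel):
    """Assemble the kernel matrix, solve for the coefficients alpha, return s(y)."""
    A = [[kernel(yi, yj) for yj in points] for yi in points]
    alpha = solve_dense(A, values)
    return lambda y: sum(a * kernel(y, yj) for a, yj in zip(alpha, points))


points = [(0.0,), (0.5,), (1.0,), (1.5,), (2.0,)]
values = [math.sin(p[0]) for p in points]
s = fit_interpolant(points, values, gaussian_kernel)
```

Evaluating `s` at any of the five points reproduces the prescribed value up to rounding, which is exactly the interpolation condition. The cubic cost of `solve_dense` is what makes this direct route impractical for very large point sets.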

9 Interpolation with kernel functions (2)
Interpolation by Lagrange basis
$$s_{f,X}(y) = \sum_{i=1}^{N} f(y_i)\, L_i(y), \qquad L_i(y) = \sum_{j=1}^{N} \alpha^i_j\, k(y, y_j)$$
$$\{L_i\}_{i=1}^{N}, \quad L_i : \Gamma \to \mathbb{R}, \quad \text{with} \quad L_i(y_j) = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$
Construction of Lagrange basis
- $A_L$ matrix of coefficients: $A_L := (\alpha^i_j)_{j,i=1}^{N}$
- $A_L = A_{k,X}^{-1}$

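10 Error estimates (in native spaces)
Requirement
- $f \in \mathcal{N}_{k_\epsilon}(\Omega)$, $\Omega$ cube in $\mathbb{R}^s$
Definition (Fill distance)
$$h_{X,\Omega} := \sup_{y \in \Omega}\, \min_{y_j \in X} \|y - y_j\|_2$$
Theorem (Gaussian kernel $k_\epsilon(y_i, y_j) = e^{-\epsilon^2 \|y_i - y_j\|^2}$)
$$\|f - s_{f,X}\|_{L_\infty(\Omega)} \le e^{c \log(h_{X,\Omega}) / h_{X,\Omega}}\, \|f\|_{\mathcal{N}_{k_\epsilon}(\Omega)}$$
Theorem (Matérn kernels)
$$|D^\alpha f(y) - D^\alpha s_{f,X}(y)| \le C\, h_{X,\Omega}^{\beta - d/2 - |\alpha|}\, \|f\|_{\mathcal{N}_{k_{d,k}}(\Omega)} \qquad (1)$$
function $f$ interpolated by Lagrange interpolation with collocation points $X$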
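Both error bounds are stated in terms of the fill distance, so it can help to see how that quantity is estimated in practice. The sketch below replaces the supremum over the domain by a maximum over a finite set of test points, which is an assumption of this example and yields a lower bound on the true value; the point set and the test grid are made up.

```python
import math


def fill_distance(points, test_points):
    """Approximate sup_y min_j ||y - y_j||_2 by taking the max over finitely many test points."""
    return max(min(math.dist(y, yj) for yj in points) for y in test_points)


# Three sampling points on the unit interval and a fine grid of test points.
points = [(0.0,), (0.5,), (1.0,)]
test_points = [(i / 1000,) for i in range(1001)]
h = fill_distance(points, test_points)
```

For this point set the largest gap between neighbouring points is 0.5, so the worst-placed test point sits 0.25 away from its nearest sampling point. Adding points to shrink that gap is what drives both error bounds down.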


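12 Preconditioning motivation (1)
Objective
- solution of special dense linear systems with [...] unknowns
$$A_{k,X}\, \alpha = f$$
$$A_{k,X} = \begin{pmatrix} k(y_1, y_1) & \cdots & k(y_1, y_N) \\ \vdots & \ddots & \vdots \\ k(y_N, y_1) & \cdots & k(y_N, y_N) \end{pmatrix}, \qquad f = \begin{pmatrix} f(y_1) \\ \vdots \\ f(y_N) \end{pmatrix}$$
Standard approach
- solution of dense linear system by direct factorization
- complexity $O(N^3)$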

13 Preconditioning motivation (2)
Iterative approach
- Krylov iterative linear solver such as CG or BiCG for dense matrices
- still complexity $O(N^3)$, but...
Preconditioned iterative approach
- use of Krylov iterative solver for dense linear system
- preconditioner based on localized Lagrange basis functions
- often few or even constant number of iterations possible
- optional: fast multipole method (FMM) for dense MatVec product
- final complexity in optimal case: with FMM $O(N \log N)$, without FMM $O(N^2)$
[Beatson, Cherrie, Mouat, 1999], [Faul, Powell, 2000], [Gumerov, Duraiswami, 2007], [Fuselier, Hangelbroek, Narcowich, Ward, Wright, 2012]
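To make the cost argument concrete, here is a minimal sketch of an unpreconditioned conjugate gradient solver that only needs a matrix-vector product. Each iteration costs one dense MatVec, so the total cost is the iteration count times $O(N^2)$; this is why a preconditioner that keeps the iteration count small matters. The stopping rule, the tolerance, the Gaussian test matrix and the names `conjugate_gradient` and `dense_matvec` are assumptions of this example, not details taken from the slides.

```python
import math


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Unpreconditioned CG for a symmetric positive definite operator given as matvec(v)."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the start vector x = 0
    p = list(r)
    rr = sum(ri * ri for ri in r)
    stop = (tol ** 2) * rr  # relative stopping criterion on the squared residual norm
    for iteration in range(max_iter):
        if rr <= stop:
            return x, iteration
        Ap = matvec(p)  # the only place the matrix is touched
        step = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


# Small symmetric positive definite test problem: Gaussian kernel matrix on 20 points.
points = [0.5 * i for i in range(20)]
A = [[math.exp(-4.0 * (yi - yj) ** 2) for yj in points] for yi in points]
f = [math.sin(y) for y in points]


def dense_matvec(v):
    """Stored-matrix product; an FMM or on-the-fly product could be swapped in here."""
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]


alpha, iterations = conjugate_gradient(dense_matvec, f)
```

Passing the product as a function keeps the solver independent of how the matrix is represented, which mirrors the black-box MatVec idea mentioned later in the deck.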

14 Preconditioning idea
Algorithm
- for each point: find local point neighborhood / subset
- solve interpolation problem on local neighborhood
- construct local Lagrange basis
- use local solutions as preconditioner for full system
Properties
- many small problems
- very local
source: [Fuselier et al., 2012]

15 Choice of local subsets
Subsets by radius
- $X_i$: points within given radius of $y_i$
- non-fixed size of subset
- clear geometric view
Subsets by nearest neighbors
- $X_i$: the $n$ nearest neighbors of $y_i$
- fixed size of subset
- geometric view unclear
- $n = \kappa \log(N)^2$
source: [Fuselier et al., 2012]
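The two subset strategies can be sketched as brute-force searches over all points. This is only an illustration: the function names are hypothetical, the logarithm is assumed to be the natural one, and truncating the neighbor count to an integer is an assumption of this example.

```python
import math


def neighbor_count(num_points, kappa):
    """Subset size n = kappa * log(N)^2, truncated to an integer (assumed rounding)."""
    return int(kappa * math.log(num_points) ** 2)


def radius_subset(points, i, radius):
    """Indices of all points within the given radius of points[i]; size varies with i."""
    return [j for j, yj in enumerate(points) if math.dist(points[i], yj) <= radius]


def nearest_subset(points, i, n):
    """Indices of the n points closest to points[i]; points[i] itself comes first."""
    order = sorted(range(len(points)), key=lambda j: math.dist(points[i], points[j]))
    return order[:n]


points = [(0.0,), (1.0,), (2.0,), (5.0,)]
by_radius = radius_subset(points, 1, 1.0)   # neighbors of the point at 1.0
by_count = nearest_subset(points, 3, 2)     # two nearest points to the point at 5.0
```

With these assumptions, `neighbor_count(10**6, 2)` evaluates to 381, which agrees with the subset size reported for the 1M-point run on the results slide. The brute-force search costs $O(N)$ per point and would be replaced by a dedicated kNN routine at scale.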

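16 Approximated / localized Lagrange basis
$$s_{f,X}(y) = \sum_{i=1}^{N} f(y_i)\, L_i(y), \qquad L_i(y) = \sum_{j=1}^{N} \alpha^i_j\, k(y, y_j)$$
$$\{L_i\}_{i=1}^{N}, \quad L_i : \Gamma \to \mathbb{R}, \quad \text{with} \quad L_i(y_j) = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$
Local subsets $X_i$
- $X_i \subset X$, $X_i := \{y_{i_1}, \ldots, y_{i_{N_i}}\}$, such that $y_i \in X_i$, $i \in \{1, \ldots, N\}$
Approximate / localized Lagrange basis $\{\tilde{L}_i\}_{i=1}^{N}$
$$\tilde{L}_i(y) = \begin{cases} 1 & \text{if } y = y_i \\ 0 & \text{if } y \in X_i \setminus \{y_i\} \end{cases}, \qquad \tilde{L}_i(y) := \sum_{j=1}^{N_i} \tilde{\alpha}^i_j\, k(y, y_{i_j}), \qquad \tilde{L}_i \in \mathcal{N}_k(\Omega)$$
$$A_{k,X_i}\, \tilde{\alpha}^i = e_i, \qquad i = 1, \ldots, N$$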

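17 Preconditioning by localized Lagrange basis
$$\tilde{L}_i(y) = \begin{cases} 1 & \text{if } y = y_i \\ 0 & \text{if } y \in X_i \setminus \{y_i\} \end{cases}, \qquad \tilde{L}_i(y) := \sum_{j=1}^{N_i} \tilde{\alpha}^i_j\, k(y, y_{i_j})$$
Coefficient matrix describing localized Lagrange basis
$$\tilde{A}_L := (a_{i,k})_{i,k=1}^{N}, \quad \text{with} \quad a_{i,k} := \begin{cases} \tilde{\alpha}^i_j & \text{if } k = i_j \\ 0 & \text{otherwise} \end{cases} \qquad \text{or} \qquad \tilde{A}_L := \begin{pmatrix} (\tilde{\alpha}^1) \\ \vdots \\ (\tilde{\alpha}^N) \end{pmatrix}$$
- maximum of $|X_i|$ non-zero entries in row $i$
Preconditioned linear system
$$\tilde{A}_L\, A_{k,X}\, \alpha = \tilde{A}_L\, f \qquad \left( A_L = A_{k,X}^{-1}, \quad \tilde{A}_L \approx A_{k,X}^{-1} \right)$$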
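The construction of the sparse coefficient matrix can be sketched as one small dense solve per point. The example below is a toy version under stated assumptions: a Gaussian kernel with a made-up shape parameter, nearest-neighbor subsets of fixed size, a textbook Gaussian elimination for the local systems, and a dictionary per row as sparse storage. All function names are hypothetical.

```python
import math


def gaussian_kernel(y, yj, eps=2.0):
    """Gaussian RBF kernel exp(-eps^2 * ||y - yj||^2) for points given as tuples."""
    return math.exp(-(eps ** 2) * math.dist(y, yj) ** 2)


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for one small local system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def local_lagrange_rows(points, kernel, n_local):
    """Row i maps column index -> coefficient of the localized Lagrange function at y_i."""
    rows = []
    for yi in points:
        # Local subset: the n_local nearest neighbours of y_i (y_i itself comes first).
        idx = sorted(range(len(points)), key=lambda j: math.dist(yi, points[j]))[:n_local]
        A_loc = [[kernel(points[a], points[b]) for b in idx] for a in idx]
        rhs = [1.0] + [0.0] * (len(idx) - 1)  # unit vector selecting y_i
        rows.append(dict(zip(idx, solve_dense(A_loc, rhs))))
    return rows


def apply_preconditioner(rows, v):
    """Sparse product of the coefficient matrix with a vector; each row is purely local."""
    return [sum(c * v[j] for j, c in row.items()) for row in rows]


def kernel_column(points, kernel, m):
    """Column m of the full kernel matrix."""
    return [kernel(y, points[m]) for y in points]


points = [(0.5 * i,) for i in range(12)]
rows = local_lagrange_rows(points, gaussian_kernel, n_local=5)
# Entry i of this vector is the value of the i-th localized Lagrange function at y_3.
values_at_y3 = apply_preconditioner(rows, kernel_column(points, gaussian_kernel, 3))
```

Applying the preconditioner to a column of the kernel matrix returns the localized Lagrange functions evaluated at one point: 1 for the function centered there, 0 for every function whose subset contains the point, and small nonzero values otherwise. That is the sense in which the product with the kernel matrix is close to the identity. The local solves are independent of each other, so they map naturally onto batched GPU work.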

18 Properties for Exascale systems
Locality of preconditioner
- local point set used for each localized Lagrange basis
- application of preconditioner local per construction
- no energy-wasting global transfer operations
Parallelism of preconditioner construction
- many similar / equally sized small problems
- solved in parallel without communication
- optimal for fine-grained parallelism, deep memory hierarchies
Error resilience of overall method
- iterative methods error resilient by construction

19 Multi-GPU implementation (1)
Iterative solver for dense linear systems
- parla: in-house multi-GPU parallel library
- CUDA-aware MPI and overlapping of computation and communication for optimal scaling
- iterative solvers implemented with variable preconditioner and black-box or memory-based matrix-vector product
- currently available solvers: CG, CGS, Lanczos (EV problems)
- on-the-fly application of matrix-vector product in CUDA kernel with no additional memory consumption (besides the points)
- libraries: CUB, CUDA 5.0, CUBLAS 5.0, OpenMPI
Domain decomposition of points
- currently preprocessing step with clustering algorithm in external tool and ghost point layer
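The on-the-fly matrix-vector product can be illustrated on the CPU in Python. This sketch is not the parla implementation, which is not shown in the slides; it only demonstrates the idea of recomputing each kernel entry when it is needed so that the $N \times N$ matrix is never stored. The Gaussian kernel, the sample points and the function names are assumptions of this example.

```python
import math


def gaussian_kernel(y, yj, eps=1.0):
    """Gaussian RBF kernel exp(-eps^2 * ||y - yj||^2) for points given as tuples."""
    return math.exp(-(eps ** 2) * math.dist(y, yj) ** 2)


def kernel_matvec_on_the_fly(points, kernel, v):
    """Compute A v with A[i][j] = kernel(points[i], points[j]) without storing A."""
    result = []
    for yi in points:  # on a GPU, one thread or block would typically own one row
        acc = 0.0
        for yj, vj in zip(points, v):
            acc += kernel(yi, yj) * vj  # entry is recomputed, used once, then discarded
        result.append(acc)
    return result


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
v = [1.0, -2.0, 0.5, 3.0]
product = kernel_matvec_on_the_fly(points, gaussian_kernel, v)
```

The extra memory is one output vector of length $N$, in exchange for re-evaluating $N^2$ kernel values per product. On a GPU that trade is often attractive because arithmetic is cheap relative to memory traffic.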

20 Multi-GPU implementation (2)
Preconditioner setup (under development / optimization)
- setup within stochastic collocation tool
- brute-force GPU-based kNN search (arbitrary dim., no lib.)
- for now: small systems solved by LU decomp. from CULA
- currently developed for Fermi-type GPUs (sm_20)
- no concurrent kernel execution
- cublas<t>gemmBatched()? (performance for > ?)
- more libraries providing batched / device kernel functions? (CUB is great, but not yet feature complete / too low-level)
- libraries: Thrust, CUBLAS, CULA, OpenMPI
Multi-GPU preconditioner
- setup & application purely local: no par. comm.

21 First performance results
[Figure: strong scaling for [...] interpolation points. Axes: runtime in min vs. number of GPUs. Curves: precond., solver, small LUs, perfect scaling.]
Parameters
- $\kappa = 2$
- $k$: Matérn kernel
- solver stops at $r <$ [...]
[...] intp. points
- subset size 311
- iteration count 45
- overall time on 32 proc.: 8.73 min
1M intp. points
- subset size 381
- iteration count 67
- overall time on 32 proc.: [...] min

22 Summary
- Meshfree interpolation / collocation needs scaling solvers
- RBF kernel methods with higher-order convergence
- Strong preconditioning techniques with high potential for Exascale systems
Thank you!


More information

ITERATIVE METHODS FOR SPARSE LINEAR SYSTEMS

ITERATIVE METHODS FOR SPARSE LINEAR SYSTEMS ITERATIVE METHODS FOR SPARSE LINEAR SYSTEMS YOUSEF SAAD University of Minnesota PWS PUBLISHING COMPANY I(T)P An International Thomson Publishing Company BOSTON ALBANY BONN CINCINNATI DETROIT LONDON MADRID

More information

Fast Krylov Methods for N-Body Learning

Fast Krylov Methods for N-Body Learning Fast Krylov Methods for N-Body Learning Nando de Freitas Department of Computer Science University of British Columbia nando@cs.ubc.ca Maryam Mahdaviani Department of Computer Science University of British

More information

Efficient domain decomposition methods for the time-harmonic Maxwell equations

Efficient domain decomposition methods for the time-harmonic Maxwell equations Efficient domain decomposition methods for the time-harmonic Maxwell equations Marcella Bonazzoli 1, Victorita Dolean 2, Ivan G. Graham 3, Euan A. Spence 3, Pierre-Henri Tournier 4 1 Inria Saclay (Defi

More information

Round-off error propagation and non-determinism in parallel applications

Round-off error propagation and non-determinism in parallel applications Round-off error propagation and non-determinism in parallel applications Vincent Baudoui (Argonne/Total SA) vincent.baudoui@gmail.com Franck Cappello (Argonne/INRIA/UIUC-NCSA) Georges Oppenheim (Paris-Sud

More information

M.A. Botchev. September 5, 2014

M.A. Botchev. September 5, 2014 Rome-Moscow school of Matrix Methods and Applied Linear Algebra 2014 A short introduction to Krylov subspaces for linear systems, matrix functions and inexact Newton methods. Plan and exercises. M.A. Botchev

More information

arxiv: v1 [math.na] 26 Oct 2018

arxiv: v1 [math.na] 26 Oct 2018 KERNEL BASED STOCHASTIC COLLOCATION FOR THE RANDOM TWO PHASE NAVIER-STOKES EQUATIONS arxiv:80.70v [math.na] 6 Oct 08 M. Griebel, C. Rieger, & P. Zaspel 3, Institute for Numerical Simulation, Bonn University,

More information

PRECONDITIONING IN THE PARALLEL BLOCK-JACOBI SVD ALGORITHM

PRECONDITIONING IN THE PARALLEL BLOCK-JACOBI SVD ALGORITHM Proceedings of ALGORITMY 25 pp. 22 211 PRECONDITIONING IN THE PARALLEL BLOCK-JACOBI SVD ALGORITHM GABRIEL OKŠA AND MARIÁN VAJTERŠIC Abstract. One way, how to speed up the computation of the singular value

More information

Solution to Laplace Equation using Preconditioned Conjugate Gradient Method with Compressed Row Storage using MPI

Solution to Laplace Equation using Preconditioned Conjugate Gradient Method with Compressed Row Storage using MPI Solution to Laplace Equation using Preconditioned Conjugate Gradient Method with Compressed Row Storage using MPI Sagar Bhatt Person Number: 50170651 Department of Mechanical and Aerospace Engineering,

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

PCA, Kernel PCA, ICA

PCA, Kernel PCA, ICA PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per

More information

ROBUST ESTIMATOR FOR MULTIPLE INLIER STRUCTURES

ROBUST ESTIMATOR FOR MULTIPLE INLIER STRUCTURES ROBUST ESTIMATOR FOR MULTIPLE INLIER STRUCTURES Xiang Yang (1) and Peter Meer (2) (1) Dept. of Mechanical and Aerospace Engineering (2) Dept. of Electrical and Computer Engineering Rutgers University,

More information

Mini-project in scientific computing

Mini-project in scientific computing Mini-project in scientific computing Eran Treister Computer Science Department, Ben-Gurion University of the Negev, Israel. March 7, 2018 1 / 30 Scientific computing Involves the solution of large computational

More information

Enhancing Performance of Tall-Skinny QR Factorization using FPGAs

Enhancing Performance of Tall-Skinny QR Factorization using FPGAs Enhancing Performance of Tall-Skinny QR Factorization using FPGAs Abid Rafique Imperial College London August 31, 212 Enhancing Performance of Tall-Skinny QR Factorization using FPGAs 1/18 Our Claim Common

More information

A robust multilevel approximate inverse preconditioner for symmetric positive definite matrices

A robust multilevel approximate inverse preconditioner for symmetric positive definite matrices DICEA DEPARTMENT OF CIVIL, ENVIRONMENTAL AND ARCHITECTURAL ENGINEERING PhD SCHOOL CIVIL AND ENVIRONMENTAL ENGINEERING SCIENCES XXX CYCLE A robust multilevel approximate inverse preconditioner for symmetric

More information

Preconditioned Parallel Block Jacobi SVD Algorithm

Preconditioned Parallel Block Jacobi SVD Algorithm Parallel Numerics 5, 15-24 M. Vajteršic, R. Trobec, P. Zinterhof, A. Uhl (Eds.) Chapter 2: Matrix Algebra ISBN 961-633-67-8 Preconditioned Parallel Block Jacobi SVD Algorithm Gabriel Okša 1, Marián Vajteršic

More information

Uniform Convergence of a Multilevel Energy-based Quantization Scheme

Uniform Convergence of a Multilevel Energy-based Quantization Scheme Uniform Convergence of a Multilevel Energy-based Quantization Scheme Maria Emelianenko 1 and Qiang Du 1 Pennsylvania State University, University Park, PA 16803 emeliane@math.psu.edu and qdu@math.psu.edu

More information

Preconditioning techniques to accelerate the convergence of the iterative solution methods

Preconditioning techniques to accelerate the convergence of the iterative solution methods Note Preconditioning techniques to accelerate the convergence of the iterative solution methods Many issues related to iterative solution of linear systems of equations are contradictory: numerical efficiency

More information

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 5 Topic Overview 1) Introduction/Unvariate Statistics 2) Bootstrapping/Monte Carlo Simulation/Kernel

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning Christoph Lampert Spring Semester 2015/2016 // Lecture 12 1 / 36 Unsupervised Learning Dimensionality Reduction 2 / 36 Dimensionality Reduction Given: data X = {x 1,..., x

More information

A Posteriori Adaptive Low-Rank Approximation of Probabilistic Models

A Posteriori Adaptive Low-Rank Approximation of Probabilistic Models A Posteriori Adaptive Low-Rank Approximation of Probabilistic Models Rainer Niekamp and Martin Krosche. Institute for Scientific Computing TU Braunschweig ILAS: 22.08.2011 A Posteriori Adaptive Low-Rank

More information

Fast Multipole Methods for Incompressible Flow Simulation

Fast Multipole Methods for Incompressible Flow Simulation Fast Multipole Methods for Incompressible Flow Simulation Nail A. Gumerov & Ramani Duraiswami Institute for Advanced Computer Studies University of Maryland, College Park Support of NSF awards 0086075

More information

Recent Results for Moving Least Squares Approximation

Recent Results for Moving Least Squares Approximation Recent Results for Moving Least Squares Approximation Gregory E. Fasshauer and Jack G. Zhang Abstract. We describe two experiments recently conducted with the approximate moving least squares (MLS) approximation

More information

Deep Learning: Approximation of Functions by Composition

Deep Learning: Approximation of Functions by Composition Deep Learning: Approximation of Functions by Composition Zuowei Shen Department of Mathematics National University of Singapore Outline 1 A brief introduction of approximation theory 2 Deep learning: approximation

More information

Multilevel low-rank approximation preconditioners Yousef Saad Department of Computer Science and Engineering University of Minnesota

Multilevel low-rank approximation preconditioners Yousef Saad Department of Computer Science and Engineering University of Minnesota Multilevel low-rank approximation preconditioners Yousef Saad Department of Computer Science and Engineering University of Minnesota SIAM CSE Boston - March 1, 2013 First: Joint work with Ruipeng Li Work

More information

Chapter 7 Iterative Techniques in Matrix Algebra

Chapter 7 Iterative Techniques in Matrix Algebra Chapter 7 Iterative Techniques in Matrix Algebra Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128B Numerical Analysis Vector Norms Definition

More information

Efficient Solvers for Stochastic Finite Element Saddle Point Problems

Efficient Solvers for Stochastic Finite Element Saddle Point Problems Efficient Solvers for Stochastic Finite Element Saddle Point Problems Catherine E. Powell c.powell@manchester.ac.uk School of Mathematics University of Manchester, UK Efficient Solvers for Stochastic Finite

More information

Learning Multiple Tasks with a Sparse Matrix-Normal Penalty

Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1

More information