Breaking Computational Barriers: Multi-GPU High-Order RBF Kernel Problems with Millions of Points
1 Breaking Computational Barriers: Multi-GPU High-Order RBF Kernel Problems with Millions of Points. Michael Griebel, Christian Rieger, Peter Zaspel. Institute for Numerical Simulation, Rheinische Friedrich-Wilhelms-Universität Bonn. GPU Technology Conference 2014, March 24-27, 2014, San José, CA, USA
2 Outline: 1 Motivation; 2 Radial basis function interpolation; 3 Preconditioning for large-scale kernel interpolation problems; 4 Summary
3 Motivation: Meshfree Interpolation. Interpolation: reconstruction of a continuous function from point evaluations. Meshfree: evaluation points at arbitrary locations, no mesh. Fields of application: classical applications (computer graphics, signal processing, computer vision, ...); large-scale data analysis (Big Data, data mining, machine learning, ...); solving PDEs by collocation methods (computational fluid dynamics, stochastic collocation, ...).
4 My personal motivation: Uncertainty quantification in CFD (1). Current standard in computational fluid dynamics: simulations for fixed and known input parameters. Problem: in natural phenomena, input data are not known exactly; in engineering, constructions / measurements are always subject to perturbations.
5 My personal motivation: Uncertainty quantification in CFD (2). Algorithmic idea: (1) sampling of stochastic input parameters according to some distribution; (2) computation of hundreds or thousands of stochastic realizations (high-resolution simulations); (3) extraction of averaged data (expectation value), variance information, ... as a post-processing step. [Figure: workflow diagram with the labels: sampling of stochastic space, config. file generator, CFD solver (several instances), stochastics tool, E[u], Var[u], Cov[u], K.-L.]
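The three steps of the algorithmic idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_solver` is a hypothetical stand-in for a CFD run, the standard normal input distribution is an arbitrary choice, and plain random sampling is used rather than any particular method from the talk.

```python
import numpy as np

def toy_solver(params):
    """Hypothetical stand-in for one CFD run: maps input parameters to an output field."""
    x = np.linspace(0.0, 1.0, 5)
    return params[0] * np.sin(np.pi * x) + params[1]

def sample_statistics(solver, n_samples, rng):
    # Step 1: sample the stochastic input parameters (standard normal is an assumption).
    samples = rng.standard_normal((n_samples, 2))
    # Step 2: one solver run per stochastic realization.
    runs = np.array([solver(p) for p in samples])
    # Step 3: post-processing, i.e. mean, variance and covariance of the output.
    mean = runs.mean(axis=0)
    var = runs.var(axis=0, ddof=1)
    cov = np.cov(runs, rowvar=False)
    return mean, var, cov

rng = np.random.default_rng(0)
mean, var, cov = sample_statistics(toy_solver, 2000, rng)
```

The diagonal of `cov` coincides with `var`, since both use the unbiased normalization.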
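7 Basic facts. Interpolation problem: given: function $f : \Omega \to \mathbb{R}$, sampling points $X := \{y_1, \dots, y_N\} \subset \Omega$, $\Omega \subset \mathbb{R}^d$; target: function $s_{f,X} : \Omega \to \mathbb{R}$ such that $s_{f,X}(y_j) = f(y_j)$ for all $j = 1, \dots, N$. Radial basis functions: Gaussian: $\phi_j(y) := e^{-\epsilon^2 \|y - y_j\|^2}$; Matérn function: $\phi_j(y) := \dfrac{K_{\beta - d/2}(\|y - y_j\|)\, \|y - y_j\|^{\beta - d/2}}{2^{\beta - 1}\,\Gamma(\beta)}$, $\beta > d/2$. Kernel functions: functions $k$ of type $k : \Omega \times \Omega \to \mathbb{R}$; radial basis function case: $k(y, y_j) := \psi(\|y - y_j\|)$, e.g. $\psi(r) = e^{-\epsilon^2 r^2}$.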
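As a small illustration, both radial basis functions can be evaluated directly. The sketch below is not from the talk; it assumes the special case $\beta - d/2 = 3/2$ for the Matérn function, where the Bessel factor has the closed form $K_{3/2}(r)\, r^{3/2} = \sqrt{\pi/2}\,(1 + r)\, e^{-r}$, so no special-function library is needed. The function names are hypothetical.

```python
import math
import numpy as np

def gaussian_kernel(y, yj, eps=1.0):
    """Gaussian RBF: exp(-eps^2 * ||y - y_j||^2)."""
    diff = np.atleast_1d(np.asarray(y, dtype=float)) - np.atleast_1d(np.asarray(yj, dtype=float))
    return math.exp(-(eps ** 2) * float(diff @ diff))

def matern_kernel_three_halves(y, yj):
    """Matern function for beta = d/2 + 3/2 (assumed special case).

    Uses K_{3/2}(r) * r^{3/2} = sqrt(pi/2) * (1 + r) * exp(-r).
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))
    yj = np.atleast_1d(np.asarray(yj, dtype=float))
    d = y.size
    beta = d / 2 + 1.5
    r = float(np.linalg.norm(y - yj))
    bessel_part = math.sqrt(math.pi / 2) * (1.0 + r) * math.exp(-r)
    return bessel_part / (2 ** (beta - 1) * math.gamma(beta))
```

Both functions depend on the points only through the distance $\|y - y_j\|$, which is what makes them radial.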
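8 Interpolation with kernel functions (1). Kernel-based interpolation problem: $\mathcal{F}$ Hilbert function space, $f : \Omega \to \mathbb{R}$, points with function evaluations: $X := \{y_1, \dots, y_N\} \subset \Omega$, $f_j := f(y_j)$, $j = 1, \dots, N$. For a kernel function $k : \Omega \times \Omega \to \mathbb{R}$, we look for $s_{f,X} \in \mathcal{F}$ with $s_{f,X}(y) := \sum_{j=1}^{N} \alpha_j\, k(y, y_j)$, $y \in \Omega$. Solution of the interpolation problem: $s_{f,X}(y_j) = f_j$, $1 \le j \le N$, which is equivalent to $A_{k,X}\, \alpha = f$ with $A_{k,X} = \begin{pmatrix} k(y_1, y_1) & \cdots & k(y_1, y_N) \\ \vdots & \ddots & \vdots \\ k(y_N, y_1) & \cdots & k(y_N, y_N) \end{pmatrix}$, $f = \big(f(y_1), \dots, f(y_N)\big)^T$.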
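A minimal NumPy sketch of this construction: assemble $A_{k,X}$, solve for $\alpha$, and evaluate $s_{f,X}$. The kernel $\psi(r) = (1 + \epsilon r)\, e^{-\epsilon r}$ (a Matérn-type function up to a constant), the scale `eps`, the grid of points and the test function are all assumptions chosen for illustration.

```python
import numpy as np

def matern_matrix(Y, X, eps=3.0):
    """Kernel matrix with entries psi(||y_i - x_j||), psi(r) = (1 + eps*r) * exp(-eps*r)."""
    r = np.linalg.norm(Y[:, None, :] - X[None, :, :], axis=-1)
    return (1.0 + eps * r) * np.exp(-eps * r)

# Sampling points: a 6 x 6 grid in the unit square (assumed example data).
g = np.linspace(0.0, 1.0, 6)
X = np.array([[a, b] for a in g for b in g])
f = np.sin(2 * np.pi * X[:, 0]) * np.cos(np.pi * X[:, 1])

A = matern_matrix(X, X)            # dense N x N system matrix A_{k,X}
alpha = np.linalg.solve(A, f)      # direct solve, O(N^3)

def interpolant(Y):
    """Evaluate s_{f,X} at the rows of Y."""
    return matern_matrix(np.atleast_2d(Y), X) @ alpha
```

Evaluating `interpolant(X)` reproduces `f` up to rounding, which is the interpolation condition $s_{f,X}(y_j) = f_j$.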
9 Interpolation with kernel functions (2). Interpolation by Lagrange basis: $s_{f,X}(y) = \sum_{i=1}^{N} f(y_i)\, L_i(y)$, $L_i(y) = \sum_{j=1}^{N} \alpha^i_j\, k(y, y_j)$, with $\{L_i\}_{i=1}^{N}$, $L_i : \Gamma \to \mathbb{R}$, and $L_i(y_j) = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$. Construction of the Lagrange basis: $A_L$ matrix of coefficients, $A_L := \big(\alpha^i_j\big)_{j,i=1}^{N}$, $A_L = A_{k,X}^{-1}$.
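The relation $A_L = A_{k,X}^{-1}$ can be checked numerically on a tiny example. The sketch below assumes one-dimensional points and the Matérn-type kernel $\psi(r) = (1 + \epsilon r)\, e^{-\epsilon r}$; forming the explicit inverse is only reasonable for very small $N$ and is done here purely for illustration.

```python
import numpy as np

def kernel(r, eps=5.0):
    """Matern-type radial function (assumed choice): (1 + eps*r) * exp(-eps*r)."""
    return (1.0 + eps * r) * np.exp(-eps * r)

X = np.linspace(0.0, 1.0, 8)                   # sampling points y_1, ..., y_N
A = kernel(np.abs(X[:, None] - X[None, :]))    # kernel matrix A_{k,X}
A_L = np.linalg.inv(A)                         # column i holds the coefficients of L_i

def lagrange_values(y):
    """Return the vector [L_1(y), ..., L_N(y)] for a scalar y."""
    return kernel(np.abs(y - X)) @ A_L

f = np.cos(3.0 * X)

def interpolant(y):
    """s_{f,X}(y) = sum_i f(y_i) * L_i(y)."""
    return lagrange_values(y) @ f

# Reference: the same interpolant in the kernel basis.
alpha = np.linalg.solve(A, f)
```

At the sampling points the Lagrange functions form the identity matrix, and both representations of the interpolant agree at any evaluation point.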
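10 Error estimates (in native spaces). Requirement: $f \in \mathcal{N}_{k_\epsilon}(\Omega)$, $\Omega$ cube in $\mathbb{R}^s$. Definition (Fill distance): $h_{X,\Omega} := \sup_{y \in \Omega} \min_{y_j \in X} \|y - y_j\|_2$. Theorem (Gaussian kernel $k_\epsilon(y_i, y_j) = e^{-\epsilon^2 \|y_i - y_j\|^2}$): $\|f - s_{X,f}\|_{L^\infty(\Omega)} \le e^{c \log(h_{X,\Omega}) / h_{X,\Omega}}\, \|f\|_{\mathcal{N}_{k_\epsilon}(\Omega)}$. Theorem (Matérn kernels): $|D^\alpha f(y) - D^\alpha s_{f,X}(y)| \le C\, h_{X,\Omega}^{\beta - d/2 - |\alpha|}\, \|f\|_{\mathcal{N}_{k_{d,k}}(\Omega)}$ (1). Here $s_{X,f}$ is the function $f$ interpolated by Lagrange interpolation with collocation points $X$.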
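The fill distance that drives both estimates can be approximated numerically. The sketch below replaces the supremum over $\Omega$ by a maximum over a finite set of sample locations, which is an approximation and an assumption of this example; `fill_distance` and `omega_samples` are hypothetical names.

```python
import math
import numpy as np

def fill_distance(X, omega_samples):
    """Approximate h_{X,Omega}: largest distance from a sample of Omega to its nearest point in X."""
    dist = np.linalg.norm(omega_samples[:, None, :] - X[None, :, :], axis=-1)
    return float(dist.min(axis=1).max())

# Point set: the four corners of the unit square.
corners = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

# Sample locations covering Omega = [0, 1]^2 (includes the center of the square).
g = np.linspace(0.0, 1.0, 21)
omega_samples = np.array([[a, b] for a in g for b in g])

h = fill_distance(corners, omega_samples)
```

For the four corners the worst-covered location is the center of the square, so `h` equals $\sqrt{0.5}$.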
12 Preconditioning motivation (1). Objective: solution of special dense linear systems with $N$ unknowns, $A_{k,X}\, \alpha = f$, where $A_{k,X} = \begin{pmatrix} k(y_1, y_1) & \cdots & k(y_1, y_N) \\ \vdots & \ddots & \vdots \\ k(y_N, y_1) & \cdots & k(y_N, y_N) \end{pmatrix}$, $f = \big(f(y_1), \dots, f(y_N)\big)^T$. Standard approach: solution of the dense linear system by direct factorization, complexity $O(N^3)$.
13 Preconditioning motivation (2). Iterative approach: Krylov iterative linear solver such as CG or BiCG; for dense matrices still complexity $O(N^3)$, but ... Preconditioned iterative approach: use of a Krylov iterative solver for the dense linear system; preconditioner based on localized Lagrange basis functions; often few or even a constant number of iterations possible; optional: fast multipole method (FMM) for the dense MatVec product; final complexity in the optimal case: with FMM $O(N \log N)$, without FMM $O(N^2)$. [Beatson, Cherrie, Mouat, 1999], [Faul, Powell, 2000], [Gumerov, Duraiswami, 2007], [Fuselier, Hangelbroek, Narcowich, Ward, Wright, 2012]
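To make the cost argument concrete, here is a minimal, unpreconditioned conjugate gradient sketch in NumPy. Each iteration performs one dense matrix-vector product, which costs $O(N^2)$, so the total cost is governed by the iteration count. The kernel, its scale and the point set are assumptions for illustration; this is not the solver used in the talk.

```python
import numpy as np

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Plain CG for a symmetric positive definite operator given as a matvec callback."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        Ap = matvec(p)                 # one dense matvec per iteration: O(N^2)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            return x, it
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, max_iter

# Small Matern-type kernel system on 20 equispaced points (assumed example data).
X = np.linspace(0.0, 1.0, 20)
dist = np.abs(X[:, None] - X[None, :])
A = (1.0 + 10.0 * dist) * np.exp(-10.0 * dist)
f = np.sin(2 * np.pi * X)

alpha, iterations = conjugate_gradient(lambda v: A @ v, f)
```

The returned `iterations` is the quantity a good preconditioner is meant to keep small as $N$ grows.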
14 Preconditioning idea. Algorithm: for each point, find a local point neighborhood / subset; solve the interpolation problem on the local neighborhood; construct the local Lagrange basis; use the local solutions as preconditioner for the full system. Properties: many small problems; very local. Source: [Fuselier et al., 2012]
15 Choice of local subsets. Subsets by radius: $\tilde X_i$ contains the points within a given radius of $y_i$; non-fixed size of subset; clear geometric view. Subsets by nearest neighbors: $\tilde X_i$ contains the $n$ nearest neighbors of $y_i$; fixed size of subset; geometric view unclear; $n = \kappa \log(N)^2$. Source: [Fuselier et al., 2012]
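Both subset choices can be sketched with a brute-force distance computation. The assumptions here are the natural logarithm and rounding down in $n = \kappa \log(N)^2$; with these, $\kappa = 2$ and one million points give 381, which matches the subset size reported for the 1M-point run in the performance results. The function names are hypothetical.

```python
import math
import numpy as np

def subset_size(N, kappa):
    """n = kappa * log(N)^2, assuming natural log and rounding down."""
    return int(kappa * math.log(N) ** 2)

def pairwise_sq_dists(X):
    """All squared Euclidean distances between rows of X (brute force, O(N^2) memory)."""
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=-1)

def knn_subsets(X, n):
    """Row i lists the indices of the n nearest neighbors of y_i (y_i itself comes first)."""
    return np.argsort(pairwise_sq_dists(X), axis=1)[:, :n]

def radius_subsets(X, radius):
    """Entry i lists the indices of all points within the given radius of y_i."""
    d2 = pairwise_sq_dists(X)
    return [np.flatnonzero(row <= radius ** 2) for row in d2]

rng = np.random.default_rng(0)
X = rng.random((50, 3))
n = subset_size(len(X), kappa=2.0)
knn = knn_subsets(X, n)          # fixed subset size n
balls = radius_subsets(X, 0.3)   # subset sizes vary from point to point
```

The nearest-neighbor variant yields equally sized local problems, which is convenient for batched solves; the radius variant does not.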
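16 Approximated / localized Lagrange basis. Recall: $s_{f,X}(y) = \sum_{i=1}^{N} f(y_i)\, L_i(y)$, $L_i(y) = \sum_{j=1}^{N} \alpha^i_j\, k(y, y_j)$, with $\{L_i\}_{i=1}^{N}$, $L_i : \Gamma \to \mathbb{R}$, and $L_i(y_j) = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$. Local subsets $\tilde X_i$: $\tilde X_i \subset X$, $\tilde X_i := \{y_{i_1}, \dots, y_{i_{n_i}}\}$, such that $y_i \in \tilde X_i$, $i \in \{1, \dots, N\}$. Approximate / localized Lagrange basis $\{\tilde L_i\}_{i=1}^{N}$: $\tilde L_i(y) = \begin{cases} 1 & \text{if } y = y_i \\ 0 & \text{if } y \in \tilde X_i \setminus \{y_i\} \end{cases}$, $\tilde L_i(y) := \sum_{j=1}^{n_i} \tilde\alpha^i_j\, k(y, y_{i_j})$, $\tilde L_i \in \mathcal{N}_k(\Omega)$, obtained from the local systems $A_{k,\tilde X_i}\, \tilde\alpha^i = e_i$, $i = 1, \dots, N$.
17 Preconditioning by localized Lagrange basis. $\tilde L_i(y) = \begin{cases} 1 & \text{if } y = y_i \\ 0 & \text{if } y \in \tilde X_i \setminus \{y_i\} \end{cases}$, $\tilde L_i(y) := \sum_{j=1}^{n_i} \tilde\alpha^i_j\, k(y, y_{i_j})$. Coefficient matrix describing the localized Lagrange basis: $\tilde A_L := (a_{i,k})_{i,k=1}^{N}$ with $a_{i,k} := \begin{cases} \tilde\alpha^i_j & \text{if } k = i_j \\ 0 & \text{otherwise} \end{cases}$ (maximum of $|\tilde X_i|$ non-zero entries in row $i$), or, written row-wise, $\tilde A_L := \begin{pmatrix} (\tilde\alpha^1)^T \\ \vdots \\ (\tilde\alpha^N)^T \end{pmatrix}$. Preconditioned linear system: $\tilde A_L A_{k,X}\, \alpha = \tilde A_L f$ (since $A_L = A_{k,X}^{-1}$, we have $\tilde A_L \approx A_{k,X}^{-1}$).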
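The construction of the coefficient matrix can be sketched as follows. This is a dense toy version under assumptions: nearest-neighbor subsets, a Matérn-type kernel $\psi(r) = (1 + \epsilon r)\, e^{-\epsilon r}$, and a small grid of points; a practical implementation would store only the few non-zeros per row. The function name `localized_lagrange_matrix` is hypothetical.

```python
import numpy as np

def kernel(r, eps=3.0):
    """Matern-type radial function (assumed choice): (1 + eps*r) * exp(-eps*r)."""
    return (1.0 + eps * r) * np.exp(-eps * r)

def localized_lagrange_matrix(X, n):
    """Row i holds the coefficients of the localized Lagrange function for y_i."""
    N = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    neighbors = np.argsort(dist, axis=1)[:, :n]     # local subsets, y_i included
    P = np.zeros((N, N))
    for i in range(N):
        idx = neighbors[i]
        local_A = kernel(dist[np.ix_(idx, idx)])    # small n x n kernel matrix
        e = (idx == i).astype(float)                # unit vector at the position of y_i
        P[i, idx] = np.linalg.solve(local_A, e)     # local interpolation problem
    return P, neighbors

g = np.linspace(0.0, 1.0, 8)
X = np.array([[a, b] for a in g for b in g])        # 64 points (assumed example data)
A = kernel(np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1))

P, neighbors = localized_lagrange_matrix(X, n=12)
PA = P @ A    # entry (i, m) is the localized Lagrange function of y_i evaluated at y_m
```

By construction, row $i$ of `PA` equals 1 at column $i$ and 0 at the other columns of its local subset; how close the remaining entries are to zero determines the quality of the preconditioner. Comparing `np.linalg.cond(A)` with `np.linalg.cond(PA)` gives a quick impression, although no particular ratio is claimed here.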
18 Properties for Exascale systems. Locality of the preconditioner: a local point set is used for each localized Lagrange basis function; application of the preconditioner is local by construction; no energy-wasting global transfer operations. Parallelism of the preconditioner construction: many similar / equally sized small problems solved in parallel without communication; optimal for fine-grained parallelism and deep memory hierarchies. Error resilience of the overall method: iterative methods are error resilient by construction.
19 Multi-GPU implementation (1). Iterative solver for dense linear systems: parla, an in-house multi-GPU parallel library; CUDA-aware MPI and overlapping of computation and communication for optimal scaling; iterative solvers implemented with variable preconditioner and black-box or memory-based matrix-vector product; currently available solvers: CG, CGS, Lanczos (EV problems); on-the-fly application of the matrix-vector product in a CUDA kernel with no additional memory consumption (besides the points); libraries: CUB, CUDA 5.0, CUBLAS 5.0, OpenMPI. Domain decomposition of points: currently a preprocessing step with a clustering algorithm in an external tool, plus a ghost point layer.
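The idea of the on-the-fly matrix-vector product can be illustrated on the CPU with NumPy: kernel entries are recomputed block by block from the points and never stored as a full $N \times N$ matrix. This is only a sketch of the principle, not the CUDA kernel of the in-house library; the kernel choice and the block size are assumptions.

```python
import numpy as np

def kernel(r, eps=3.0):
    """Matern-type radial function (assumed choice): (1 + eps*r) * exp(-eps*r)."""
    return (1.0 + eps * r) * np.exp(-eps * r)

def matvec_on_the_fly(X, v, block=16):
    """Compute A_{k,X} @ v without storing A_{k,X}; only a block of rows exists at a time."""
    out = np.empty(len(X))
    for start in range(0, len(X), block):
        rows = X[start:start + block]
        # Distances from this block of points to all points, built from the coordinates.
        r = np.linalg.norm(rows[:, None, :] - X[None, :, :], axis=-1)
        out[start:start + block] = kernel(r) @ v
    return out

rng = np.random.default_rng(0)
X = rng.random((100, 3))
v = rng.standard_normal(100)
w = matvec_on_the_fly(X, v)
```

Memory use is proportional to `block * N` instead of `N * N`, at the price of recomputing the kernel entries in every product.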
20 Multi-GPU implementation (2). Preconditioner setup (under development / optimization): setup within the stochastic collocation tool; brute-force GPU-based kNN search (arbitrary dim., no lib.); for now: small systems solved by LU decomp. from CULA; currently developed for Fermi-type GPUs (sm_20); no concurrent kernel execution; cublas<t>gemmBatched()? (performance for > ?); more libraries providing batched / device kernel functions? (CUB is great, but not yet feature complete / too low-level); libraries: Thrust, CUBLAS, CULA, OpenMPI. Multi-GPU preconditioner setup & application: purely local, no par. comm.
21 First performance results. [Figure: strong scaling for [...] interpolation points; runtime in min over number of GPUs; curves: precond., solver, small LUs, perfect scaling.] Parameters: $\kappa = 2$; $k$: Matérn kernel; solver stops at r < [...]. Results: [...] intp. points: subset size 311, iteration count 45, overall time on 32 proc.: 8.73 min. 1M intp. points: subset size 381, iteration count 67, overall time on 32 proc.: [...] min.
22 Summary. Meshfree interpolation / collocation needs scalable solvers. RBF kernel methods offer higher-order convergence. Strong preconditioning techniques with high potential for Exascale systems. Thank you!
More informationStatistical Machine Learning
Statistical Machine Learning Christoph Lampert Spring Semester 2015/2016 // Lecture 12 1 / 36 Unsupervised Learning Dimensionality Reduction 2 / 36 Dimensionality Reduction Given: data X = {x 1,..., x
More informationA Posteriori Adaptive Low-Rank Approximation of Probabilistic Models
A Posteriori Adaptive Low-Rank Approximation of Probabilistic Models Rainer Niekamp and Martin Krosche. Institute for Scientific Computing TU Braunschweig ILAS: 22.08.2011 A Posteriori Adaptive Low-Rank
More informationFast Multipole Methods for Incompressible Flow Simulation
Fast Multipole Methods for Incompressible Flow Simulation Nail A. Gumerov & Ramani Duraiswami Institute for Advanced Computer Studies University of Maryland, College Park Support of NSF awards 0086075
More informationRecent Results for Moving Least Squares Approximation
Recent Results for Moving Least Squares Approximation Gregory E. Fasshauer and Jack G. Zhang Abstract. We describe two experiments recently conducted with the approximate moving least squares (MLS) approximation
More informationDeep Learning: Approximation of Functions by Composition
Deep Learning: Approximation of Functions by Composition Zuowei Shen Department of Mathematics National University of Singapore Outline 1 A brief introduction of approximation theory 2 Deep learning: approximation
More informationMultilevel low-rank approximation preconditioners Yousef Saad Department of Computer Science and Engineering University of Minnesota
Multilevel low-rank approximation preconditioners Yousef Saad Department of Computer Science and Engineering University of Minnesota SIAM CSE Boston - March 1, 2013 First: Joint work with Ruipeng Li Work
More informationChapter 7 Iterative Techniques in Matrix Algebra
Chapter 7 Iterative Techniques in Matrix Algebra Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128B Numerical Analysis Vector Norms Definition
More informationEfficient Solvers for Stochastic Finite Element Saddle Point Problems
Efficient Solvers for Stochastic Finite Element Saddle Point Problems Catherine E. Powell c.powell@manchester.ac.uk School of Mathematics University of Manchester, UK Efficient Solvers for Stochastic Finite
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More information