An Efficient Solver for Sparse Linear Systems based on Rank-Structured Cholesky Factorization
Transcription
1 An Efficient Solver for Sparse Linear Systems based on Rank-Structured Cholesky Factorization David Bindel and Jeffrey Chadwick Department of Computer Science Cornell University 30 October 2015 (Department of Computer Science Cornell University) Rank-Structured Cholesky / 27
2 u = K \ f
Great for circuit simulations, 1D or 2D finite elements, etc.
Standard advice to students: Just try backslash for these problems.
Standard response: What about for the 3D case?
3 Try PCG with a good preconditioner. Maybe start with the ones in PETSc. You've taken Matrix Computations, right?
Blah blah yadda blah... (Not an actual student)
4 Direct or iterative?
[Figure: $A$ and $L$.]
CW: Gaussian elimination scales poorly. Iterate instead!
Pro: Less memory, potentially better complexity
Con: Less robust, potentially worse memory patterns
Commercial finite element codes still use (out-of-core) Cholesky. Longer compute times, but fewer tech support hours.
5 Desiderata
I want a code for sparse Cholesky ($A = LL^T$) that
Handles modest problems on a desktop (or laptop?)
Inside a loop, without trying my patience
$\Rightarrow$ Does not need gobs of memory
$\Rightarrow$ Makes effective use of level 3 BLAS
Requires little parameter fiddling / hand-holding
Works with general elliptic problems (esp. elasticity)
See Sherry Li plenary (and many minisymposium talks here).
6 From ND to superfast ND...
ND gets performance using just graph structure:
2D: $O(N^{3/2})$ time, $O(N \log N)$ space.
3D: $O(N^2)$ time, $O(N^{4/3})$ space.
Superfast ND reduces space/time complexity via low-rank structure.
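A short sketch of where the 3D figures come from, under the usual model assumption (not stated on the slide) that nested dissection (ND) on a regular 3D mesh with $N$ unknowns produces a top-level separator of about $N^{2/3}$ unknowns, and that factoring this dense separator block dominates the cost:

```latex
\[
\text{time} \sim \left(N^{2/3}\right)^{3} = N^{2},
\qquad
\text{space} \sim \left(N^{2/3}\right)^{2} = N^{4/3}.
\]
```

The same reasoning in 2D, with a separator of about $N^{1/2}$ unknowns, gives $\left(N^{1/2}\right)^3 = N^{3/2}$ time. Compressing the dense separator blocks is what lets a rank-structured method improve on these counts.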
7 Strategy
Start with CHOLMOD (a good supernodal left-looking Cholesky)
    Supernodal data structures are compact
    Algorithm + data layout = most work in level 3 BLAS
    Widely used already (so re-use the API!)
Incorporate compact representations for low-rank blocks
    Outer product for off-diagonal blocks
    HSS-style representations for diagonal blocks
Optimize, test, swear, fix, repeat
8 Supernodal storage structure
Diagonal block: $L^D_j = L(C_j, C_j)$
Off-diagonal block: $L^O_j = L(R_j, C_j)$, stored in collapsed form
[Figure: supernode $L_j$ split into $L^D_j$ and the collapsed $L^O_j$.]
9 Supernode factorization
Initialize storage:
    $U^D_j \leftarrow A(C_j, C_j)$
    $U^O_j \leftarrow A(R_j, C_j)$
Pull Schur contributions:
    for each $k \in D_j$ do
        Build dense updates from $L^O_k$
        Scatter updates to $U^D_j$ and $U^O_j$
Finish forming $L^D_j$:
    $L^D_j \leftarrow \mathrm{cholesky}(U^D_j)$
    $L^O_j \leftarrow U^O_j (L^D_j)^{-T}$
What changes in the rank-structured Cholesky?
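The three stages of the supernode loop can be sketched on a dense matrix, with each block column standing in for one supernode. This is an illustrative NumPy sketch, not CHOLMOD's implementation: the function name and the fixed `block_size` are assumptions, and a real supernodal code would visit only the descendants of each supernode and scatter into sparse storage.

```python
import numpy as np

def blocked_left_looking_cholesky(A, block_size):
    """Dense stand-in for the supernodal loop (hypothetical helper).

    Each block column plays the role of one supernode j with columns C_j
    and trailing rows R_j.
    """
    n = A.shape[0]
    L = np.zeros((n, n))
    starts = list(range(0, n, block_size))
    for s in starts:
        e = min(s + block_size, n)
        # Initialize storage: U^D_j <- A(C_j, C_j), U^O_j <- A(R_j, C_j).
        UD = A[s:e, s:e].astype(float)
        UO = A[e:, s:e].astype(float)
        # Pull Schur contributions from every earlier block column k.
        for ks in starts:
            if ks >= s:
                break
            ke = min(ks + block_size, n)
            Lk_C = L[s:e, ks:ke]  # rows of column k that overlap C_j
            Lk_R = L[e:, ks:ke]   # rows of column k that overlap R_j
            UD -= Lk_C @ Lk_C.T
            UO -= Lk_R @ Lk_C.T
        # Finish: L^D_j <- cholesky(U^D_j), L^O_j <- U^O_j (L^D_j)^{-T}.
        LD = np.linalg.cholesky(UD)
        L[s:e, s:e] = LD
        if e < n:
            # Solve L^D X^T = (U^O)^T, which gives X = U^O (L^D)^{-T}.
            L[e:, s:e] = np.linalg.solve(LD, UO.T).T
    return L
```

The updates `Lk_C @ Lk_C.T` and `Lk_R @ Lk_C.T` are matrix-matrix products, which is why this data layout puts most of the work in level 3 BLAS.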
10 Off-diagonal block compression
[Figure: supernode with diagonal block $L^D_j$; the collapsed $L^O_j$ is replaced by the compressed $L^O_j \approx V_j U_j^T$.]
Collapsed off-diagonal block is a (nearly low-rank) dense matrix
11 Off-diagonal block compression
$G \leftarrow \mathrm{rand}(|R_j|, r + p)$
$C \leftarrow (L^O_j)^T G$
for $i = 1, \dots, s$ do
    $C \leftarrow L^O_j C$
    $C \leftarrow (L^O_j)^T C$
$U_j = \mathrm{orth}(C)$
$V_j = L^O_j U_j$
Compress without explicit $L^O_j$:
Probe $(L^O_j)^T$ with random $G$
Extract orth. row basis $U_j$
$L^O_j \approx V_j U_j^T \Rightarrow V_j = L^O_j U_j$
Where do we get the estimated rank bound $r$?
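A minimal NumPy sketch of the randomized compression steps. It assumes an explicit dense `LO` purely for illustration, whereas the slide's point is that the solver avoids forming $L^O_j$ and only applies it. The function name, the Gaussian test matrix, and the default values of `p` and `s` are assumptions. `G` has one row per row of `LO` so that `LO.T @ G` conforms.

```python
import numpy as np

def compress_off_diagonal(LO, r, p=5, s=1, seed=0):
    """Return V, U with LO ~= V @ U.T and U having orthonormal columns."""
    rng = np.random.default_rng(seed)
    # Probe the transpose with a random matrix; p is the oversampling.
    G = rng.standard_normal((LO.shape[0], r + p))
    C = LO.T @ G
    # Power iterations sharpen the separation between large and small
    # singular values before the basis is extracted.
    for _ in range(s):
        C = LO @ C
        C = LO.T @ C
    # Orthonormal basis for the (approximate) row space of LO.
    U, _ = np.linalg.qr(C)
    # Coefficients of LO in that basis.
    V = LO @ U
    return V, U
```

For example, a 60-by-20 block of exact rank 4 is reproduced to rounding error with `r=4`, while storing `V` and `U` takes (60 + 20) * 9 numbers instead of 60 * 20. This sketch keeps all `r + p` columns rather than truncating back to `r`.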
12 Interaction rank
Could dynamically estimate the rank of $L^O_j$.
Practice: empirical rank bound α k log(k).
13 Optimization: Selective off-diagonal compression
[Figure: supernodes $j_1$, $j_2$, $j_3$.]
Compress off-diagonal blocks of sufficiently large supernodes ($j_1$, $j_2$).
14 Optimization: Interior blocks
[Figure: supernodes $j_1$, $j_2$, $j_3$ and interior blocks $B_1$, $B_2$, $B_3$, $B_4$.]
Don't store any of $L^O_j$ for interior blocks
(Represent as $L^O_j = A^O_j (L^D_j)^{-T}$ when needed)
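A small numerical check of the on-demand representation, assuming that an interior block receives no Schur-complement updates, so its off-diagonal part can be rebuilt from the original matrix entries and the diagonal factor alone. The first block column of a dense SPD matrix serves as a stand-in here; the sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)      # SPD test matrix
L = np.linalg.cholesky(A)        # reference factor, A = L L^T

LD = L[:k, :k]                   # diagonal block of the first block column
AO = A[k:, :k]                   # original off-diagonal entries (no updates)

# Rebuild the off-diagonal block as A^O (L^D)^{-T} instead of storing it:
# solving L^D X^T = (A^O)^T gives X = A^O (L^D)^{-T}.
LO_on_demand = np.linalg.solve(LD, AO.T).T

print(np.allclose(LO_on_demand, L[k:, :k]))
```

The trade is memory for extra triangular solves each time the block is needed.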
15 Diagonal block compression
\[
L^D_j = \begin{bmatrix}
\begin{bmatrix} L^D_{j,1} & 0 \\ L^D_{j,2} & L^D_{j,3} \end{bmatrix} & 0 \\
L^D_{j,4} & \begin{bmatrix} L^D_{j,5} & 0 \\ L^D_{j,6} & L^D_{j,7} \end{bmatrix}
\end{bmatrix}
\]
Basic observation: off-diagonal blocks are low-rank.
(H-matrix, semiseparable structure, quasiseparable structure, ...)
Assumes reasonable ordering of unknowns!
16 Diagonal block compression
\[
L^D_j \approx \begin{bmatrix}
\begin{bmatrix} L^D_{j,1} & 0 \\ V^D_{j,2} (U^D_{j,2})^T & L^D_{j,3} \end{bmatrix} & 0 \\
V^D_{j,4} (U^D_{j,4})^T & \begin{bmatrix} L^D_{j,5} & 0 \\ V^D_{j,6} (U^D_{j,6})^T & L^D_{j,7} \end{bmatrix}
\end{bmatrix}
\]
How do we get directly to this without forming $U^D_j$ explicitly?
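To illustrate why the compressed form is convenient, here is a sketch of a forward solve with a single level of this structure, where the off-diagonal block is kept as the product of two thin factors and never multiplied out. The function name and the one-level layout are assumptions; the slide's structure nests the same idea recursively. `np.linalg.solve` is used for brevity, whereas a production code would call a triangular solver.

```python
import numpy as np

def solve_two_level(L1, V, U, L3, b):
    """Solve [[L1, 0], [V @ U.T, L3]] x = b without forming V @ U.T."""
    n1 = L1.shape[0]
    # Top block: L1 x1 = b1.
    x1 = np.linalg.solve(L1, b[:n1])
    # Bottom block: apply the low-rank coupling as two thin products.
    rhs2 = b[n1:] - V @ (U.T @ x1)
    x2 = np.linalg.solve(L3, rhs2)
    return np.concatenate([x1, x2])
```

With blocks of size m and rank r, the coupling term costs on the order of m r operations rather than m^2.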
17 Forming compressed updates
[Figure: the blocks $L^D_{j,1}$ through $L^D_{j,7}$ of the diagonal block, with the label $D_j$.]
18 Rank-structured supernode factorization
Basic ingredients:
Randomized algorithms form $U^D_j$
Rank-structured factorization of $U^D_j$
Randomized algorithm forms $L^O_j$ (involves solves with $L^D_j$)
Plus various optimizations.
19 Example: Large deformation of an elastic block
20 Example: Large deformation of an elastic block
Benchmark based on example from deal.II:
    Nearly-incompressible hyperelastic block under compression
    Mixed FE formulation (pressure and dilation condensed out)
    Tried both p = 1 and p = 2 finite elements
    Two load steps, Newton on each (14-15 steps)
Experimental setup:
    8-core Xeon X5570 with 48 GB RAM
    LAPACK/BLAS from MKL 11.0
    PCG + preconditioners from Trilinos
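For readers who want to see how a factorization enters PCG, here is a minimal preconditioned conjugate gradient sketch in NumPy. It is not the Trilinos implementation used in the experiments: the function names, tolerances, and the dense `factor_preconditioner` helper are assumptions. The preconditioner is passed as a callback that applies an approximation of the inverse of `A` to a residual. An approximate Cholesky factor would be plugged in the same way as the exact factor in the helper.

```python
import numpy as np

def pcg(A, b, apply_preconditioner, tol=1e-10, max_iter=200):
    """Preconditioned CG; returns the solution and relative residual history."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    z = apply_preconditioner(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    history = [np.linalg.norm(r) / b_norm]
    for _ in range(max_iter):
        if history[-1] <= tol:
            break
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = apply_preconditioner(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
        history.append(np.linalg.norm(r) / b_norm)
    return x, history

def factor_preconditioner(L):
    """Apply (L L^T)^{-1}; L stands in for an approximate Cholesky factor."""
    return lambda r: np.linalg.solve(L.T, np.linalg.solve(L, r))
```

The better the factor approximates the true Cholesky factor, the fewer iterations are needed. With the exact factor, the first step already reaches the solution up to rounding.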
21 RSC vs standard preconditioners (p = 1, N = 50)
[Figure: relative residual versus iterations and versus seconds for Jacobi, ICC, ML, and RSC.]
22 RSC vs standard preconditioners (p = 2, N = 35)
[Figure: relative residual versus iterations and versus seconds for Jacobi, ICC, ML, and RSC.]
23 Time and memory comparisons (p = 1)
[Figure: solve time (s) versus n for ICC, ML, Jacobi, RSC, and Cholesky; memory (GB) versus n for Cholesky and RSC.]
24 Effect of in-separator ordering
Semi-sep diag relies on variable order; don't want any old order!
Apply recursive bisection based on spatial coords
    Use coordinates if known
    Else assign spectrally
[Figure: relative residual versus iteration for the Geo., Eig., and Random orderings.]
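A sketch of an ordering by recursive bisection on spatial coordinates. The slide names only the idea; splitting at the median along the axis of largest extent, the `leaf_size` cutoff, and the function name are assumptions made for this example. Points that are close in space end up close in the ordering, which is what keeps the off-diagonal blocks of the separator's diagonal block low-rank.

```python
import numpy as np

def bisection_order(coords, leaf_size=4):
    """Return a permutation of point indices by recursive coordinate bisection."""
    coords = np.asarray(coords, dtype=float)

    def recurse(idx):
        if len(idx) <= leaf_size:
            return list(idx)
        pts = coords[idx]
        # Split along the axis with the largest extent, at the median.
        axis = np.argmax(pts.max(axis=0) - pts.min(axis=0))
        order = idx[np.argsort(pts[:, axis], kind="stable")]
        half = len(order) // 2
        return recurse(order[:half]) + recurse(order[half:])

    return np.array(recurse(np.arange(len(coords))), dtype=int)
```

When coordinates are not available, the slide's alternative is to assign them spectrally; that step is not sketched here.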
25 Example: Trabecular bone model (1M dof)
[Figure: relative residual versus iterations and versus seconds for ICC, ML, RSC1, and RSC2.]
26 Example: Steel flange (1.5M dof)
[Figure: relative residual versus iterations and versus seconds for ICC, ML, RSC1, and RSC2.]
27 Conclusions
For more: J. Chadwick and D. Bindel. An Efficient Solver for Sparse Linear Systems Based on Rank-Structured Cholesky Factorization.
More information1 GSW Sets of Systems
1 Often, we have to solve a whole series of sets of simultaneous equations of the form y Ax, all of which have the same matrix A, but each of which has a different known vector y, and a different unknown
More informationNumerical Methods Lecture 2 Simultaneous Equations
CGN 42 - Computer Methods Numerical Methods Lecture 2 Simultaneous Equations Topics: matrix operations solving systems of equations Matrix operations: Adding / subtracting Transpose Multiplication Adding
More informationFast matrix algebra for dense matrices with rank-deficient off-diagonal blocks
CHAPTER 2 Fast matrix algebra for dense matrices with rank-deficient off-diagonal blocks Chapter summary: The chapter describes techniques for rapidly performing algebraic operations on dense matrices
More informationLinear Systems of Equations. ChEn 2450
Linear Systems of Equations ChEn 450 LinearSystems-directkey - August 5, 04 Example Circuit analysis (also used in heat transfer) + v _ R R4 I I I3 R R5 R3 Kirchoff s Laws give the following equations
More informationSolving linear equations with Gaussian Elimination (I)
Term Projects Solving linear equations with Gaussian Elimination The QR Algorithm for Symmetric Eigenvalue Problem The QR Algorithm for The SVD Quasi-Newton Methods Solving linear equations with Gaussian
More informationLecture 3: Special Matrices
Lecture 3: Special Matrices Feedback of assignment1 Random matrices The magic matrix commend magic() doesn t give us random matrix. Random matrix means we will get different matrices each time when we
More informationA dissection solver with kernel detection for unsymmetric matrices in FreeFem++
. p.1/21 11 Dec. 2014, LJLL, Paris FreeFem++ workshop A dissection solver with kernel detection for unsymmetric matrices in FreeFem++ Atsushi Suzuki Atsushi.Suzuki@ann.jussieu.fr Joint work with François-Xavier
More informationCS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization
CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization Tim Roughgarden February 28, 2017 1 Preamble This lecture fulfills a promise made back in Lecture #1,
More information