Scalable, Robust, Fault-Tolerant Parallel QR Factorization. Camille Coti

Size: px
Start display at page:

Download "Scalable, Robust, Fault-Tolerant Parallel QR Factorization. Camille Coti"

Transcription

1 Communication-avoiding Fault-tolerant Scalable, Robust, Fault-Tolerant Parallel Factorization Camille Coti LIPN, CNRS UMR 7030, SPC, Université Paris 13, France DCABES August 24th, / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

2 Roadmap Communication-avoiding Fault-tolerant 1 Failures Factorization 2 Communication-avoiding 3 Fault-tolerant 4 2 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

3 Communication-avoiding Fault-tolerant Large-scale systems are volatile Failures Factorization Life expectancy of an electronic component: the famous bathtub curve 3 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

4 Communication-avoiding Fault-tolerant Reliability of a distributed system Failures Factorization Mean Time Between Failures n 1 1 MT BF total = ( ) 1 (1) MT BF i The more components a system is made of, the more likely it is to have a failure. Mean Time Between Failures of the system (hours) i= H H H H e+06 Number of components in the system 4 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

5 Factorization Communication-avoiding Fault-tolerant Failures Factorization Matrix A: A = R is upper-triangular Q is orthogonal Unicity: Yes, if the diagonal elements of R are positive No otherwise Several algorithms: Dense linear algebra: algorithms based on unitary transformations. (Modified) Gram-Schmidt, Givens rotation, Householder projection... 5 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

6 Factorization Communication-avoiding Fault-tolerant Failures Factorization Panel-update approach Factorize a panel Update the trailing matrix Repeat recursively on the trailing matrix R ( ) A11 A 12 A = A 21 A 22 ( ) 1 R 12 = Q 1 0 A 1 22 Q

7 Factorization Communication-avoiding Fault-tolerant Failures Factorization Panel-update approach Factorize a panel Update the trailing matrix Repeat recursively on the trailing matrix R ( ) A11 A 12 A = A 21 A 22 ( ) 1 R 12 = Q 1 0 A 1 22 Q

8 Factorization Communication-avoiding Fault-tolerant Failures Factorization Panel-update approach Factorize a panel Update the trailing matrix Repeat recursively on the trailing matrix R ( ) A11 A 12 A = A 21 A 22 ( ) 1 R 12 = Q 1 0 A 1 22 Q panel trailing matrix 6 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

9 Communication-avoiding Fault-tolerant 1 Failures Factorization 2 Communication-avoiding 3 Fault-tolerant 4 7 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

10 Communication-avoiding Fault-tolerant factorization of a panel: particular shape M >> N: the matrix is tall-and-skinny Specific algorithm: TS ( A11 A 21 ) ( ) 1 = Q 1 0 P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 8 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

11 Communication-avoiding Fault-tolerant factorization of a panel: particular shape M >> N: the matrix is tall-and-skinny Specific algorithm: TS ( A11 A 21 ) ( ) 1 = Q 1 0 P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 8 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

12 Communication-avoiding Fault-tolerant factorization of a panel: particular shape M >> N: the matrix is tall-and-skinny Specific algorithm: TS ( A11 A 21 ) ( ) 1 = Q 1 0 P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 8 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

13 Communication-avoiding Fault-tolerant factorization of a panel: particular shape M >> N: the matrix is tall-and-skinny Specific algorithm: TS ( A11 A 21 ) ( ) 1 = Q 1 0 P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 8 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

14 Communication-avoiding Fault-tolerant factorization of a panel: particular shape M >> N: the matrix is tall-and-skinny Specific algorithm: TS ( A11 A 21 ) ( ) 1 = Q 1 0 P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 8 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

15 Communication-avoiding Fault-tolerant factorization of a panel: particular shape M >> N: the matrix is tall-and-skinny Specific algorithm: TS ( A11 A 21 ) ( ) 1 = Q 1 0 P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 8 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

16 Communication-avoiding Fault-tolerant After the panel factorization: Compute the compact representation of Q 1: Q 1 = I Y 1T 1Y1 T Use the Y 1 and T 1 matrices to update the trailing matrix: ( I Y1T 1Y T 1 ) ( A 12 A 22 ) = = ( A12 A 22 ( 2 A 1 22 ) ) Y 1(T T 1 (Y T 1 ( A12 A 22 ) )) R 12 : upper part of the R A 1 22 : new trailing matrix. ( ) A11 A 12 A = A 21 A 22 ( ) 1 R 12 = Q 1 0 A / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

17 Communication-avoiding Fault-tolerant Step 0 Step 1 Step 2 P0 P0 P0 P1 P1 P1 P2 P2 P2 P3 P3 P3 Cij 10 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

18 Communication-avoiding Fault-tolerant Step 0 Step 1 Step 2 P0 P0 P0 P1 P1 P1 P2 P2 P2 P3 P3 P3 Cij 10 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

19 Communication-avoiding Fault-tolerant Step 0 Step 1 Step 2 P0 P0 P0 P1 P1 P1 P2 P2 P2 P3 P3 P3 Cij 10 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

20 Communication-avoiding Fault-tolerant 1 Failures Factorization 2 Communication-avoiding 3 Fault-tolerant 4 11 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

21 Communication-avoiding Fault-tolerant Let s look at TS in details P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 12 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

22 Communication-avoiding Fault-tolerant Let s look at TS in details P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 12 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

23 Communication-avoiding Fault-tolerant Let s look at TS in details P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 12 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

24 Communication-avoiding Fault-tolerant Let s look at TS in details P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 12 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

25 Communication-avoiding Fault-tolerant Let s look at TS in details P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 12 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

26 Communication-avoiding Fault-tolerant Let s look at TS in details P0 A0 R V0 V0 V P1 A1 V1 P2 A2 V2 V2 P3 A3 V3 12 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

27 Communication-avoiding Fault-tolerant P 0 works beginning end P 2 works during the first two steps, then stops P 1 and P 3 work during the first step, then stops Let s put these lazy dudes to work! 13 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

28 Fault-tolerant TS Communication-avoiding Fault-tolerant P 0 P 1 P 2 P 3 14 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

29 Fault-tolerant TS Communication-avoiding Fault-tolerant P 0 P 1 P 2 P 3 14 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

30 Fault-tolerant TS Communication-avoiding Fault-tolerant P 0 P 1 P 2 P 3 14 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

31 Communication-avoiding Fault-tolerant Update of the trailing matrix Triggered by the tree of the panel factorization Uses the intermediate Y and T matrices Pairwise operation tree ( ) R Ĉ 0 A = Q Ĉ 1 Decompose the blocks of the trailing matrix: ( ) ( ) C C i = i Ci[: N 1] C i = C i[n :] Compute the update : ( ) ( ) C 0 C R 1 C 1 = 0 C 1 P 0 Ĉ 0 = C 0 W C 0 W P 1 T W = T T (C 0 Y T 1 C 1) Ĉ 1 = C 1 Y 1W 15 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

32 Fault-tolerant update Communication-avoiding Fault-tolerant Change the order of the operations T W = T T (C 0 Y T 1 C 1) Ĉ 0 = C 0 W P 0 C 0 C 1, Y 1 P 1 T W = T T (C 0 Y T 1 C 1) Ĉ 1 = C 1 Y 1W If a process fails Both process have the W matrix Computation recovered from the W matrix and the initial submatrix 16 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

33 Fault-tolerant update Communication-avoiding Fault-tolerant Change the order of the operations T W = T T (C 0 Y T 1 C 1) Ĉ 0 = C 0 W P 0 C 0 C 1, Y 1 P 1 T W = T T (C 0 Y T 1 C 1) Ĉ 1 = C 1 Y 1W If a process fails Both process have the W matrix Computation recovered from the W matrix and the initial submatrix 16 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

34 Communication scheme Communication-avoiding Fault-tolerant P 0 P 1 P 2 P 3 17 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

35 Soft errors Communication-avoiding Fault-tolerant Soft errors : Bit flips causing numerical errors Caused by radiations, cosmic rays, various electric disturbances... Approaches to tackle soft errors: Triple modular redundancy ECC memory Multiple executions... At several levels: hardware, system, application Idea here: Take advantage of partial redundancy to detect errors Recompute only the erroneous step 18 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

36 Soft errors Communication-avoiding Fault-tolerant exchange check check exchange check P 0 P 1 P 2 P 3 19 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

37 Soft errors Communication-avoiding Fault-tolerant exchange check check exchange check P 0 P 1 P 2 P 3 19 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

38 Soft errors Communication-avoiding Fault-tolerant exchange check check exchange check P 0 P 1 P 2 P 3 19 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

39 Soft errors Communication-avoiding Fault-tolerant exchange check check exchange check P 0 P 1 P 2 P 3 19 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

40 Soft errors Communication-avoiding Fault-tolerant exchange check check exchange check P 0 P 1 P 2 P 3 19 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

41 Communication-avoiding Fault-tolerant 1 Failures Factorization 2 Communication-avoiding 3 Fault-tolerant 4 20 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

42 Communication-avoiding Fault-tolerant Algorithm-based fault-tolerance Failure recovery is handled by the application Behavior upon failures: defined in the algorithm Goal: Minimal overhead during failure-free executions As few recomputations as possible after failures Fault-tolerant algorithms for factorization Use intrinsic redundancy or idle processes Operation reordering based on algebraic properties Small modifications in the critical path No additional computations Data exchange instead of send/recv Partial results already ready for recovery 21 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

43 Appendix: performance Communication-avoiding Fault-tolerant Time (sec) CA FT TS FT CA TS 0 1k 4k 8k 16k 32k Matrix size 21 / 21 C. Coti Scalable, Robust, Fault-Tolerant Parallel Factorization

Scalable, Robust, Fault-Tolerant Parallel QR Factorization

Scalable, Robust, Fault-Tolerant Parallel QR Factorization Scalable, Robust, Fault-Tolerant Parallel QR Factorization Camille Coti LIPN, CNRS UMR 7030, Université Paris 13, Sorbonne Paris Cité 99, avenue Jean-Baptiste Clément, F-93430 Villetaneuse, FRANCE camille.coti@lipn.univ-paris13.fr

More information

QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment

QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment Emmanuel AGULLO (INRIA / LaBRI) Camille COTI (Iowa State University) Jack DONGARRA (University of Tennessee) Thomas HÉRAULT

More information

Avoiding Communication in Distributed-Memory Tridiagonalization

Avoiding Communication in Distributed-Memory Tridiagonalization Avoiding Communication in Distributed-Memory Tridiagonalization SIAM CSE 15 Nicholas Knight University of California, Berkeley March 14, 2015 Joint work with: Grey Ballard (SNL) James Demmel (UCB) Laura

More information

Communication avoiding parallel algorithms for dense matrix factorizations

Communication avoiding parallel algorithms for dense matrix factorizations Communication avoiding parallel dense matrix factorizations 1/ 44 Communication avoiding parallel algorithms for dense matrix factorizations Edgar Solomonik Department of EECS, UC Berkeley October 2013

More information

The 'linear algebra way' of talking about "angle" and "similarity" between two vectors is called "inner product". We'll define this next.

The 'linear algebra way' of talking about angle and similarity between two vectors is called inner product. We'll define this next. Orthogonality and QR The 'linear algebra way' of talking about "angle" and "similarity" between two vectors is called "inner product". We'll define this next. So, what is an inner product? An inner product

More information

EECS150 - Digital Design Lecture 26 - Faults and Error Correction. Types of Faults in Digital Designs

EECS150 - Digital Design Lecture 26 - Faults and Error Correction. Types of Faults in Digital Designs EECS150 - Digital Design Lecture 26 - Faults and Error Correction April 25, 2013 John Wawrzynek 1 Types of Faults in Digital Designs Design Bugs (function, timing, power draw) detected and corrected at

More information

Fault Tolerate Linear Algebra: Survive Fail-Stop Failures without Checkpointing

Fault Tolerate Linear Algebra: Survive Fail-Stop Failures without Checkpointing 20 Years of Innovative Computing Knoxville, Tennessee March 26 th, 200 Fault Tolerate Linear Algebra: Survive Fail-Stop Failures without Checkpointing Zizhong (Jeffrey) Chen zchen@mines.edu Colorado School

More information

High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors

High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors Procedia Computer Science Procedia Computer Science 00 (2012) 1 10 International Conference on Computational Science, ICCS 2012 High Performance Dense Linear System Solver with Resilience to Multiple Soft

More information

MATH 350: Introduction to Computational Mathematics

MATH 350: Introduction to Computational Mathematics MATH 350: Introduction to Computational Mathematics Chapter V: Least Squares Problems Greg Fasshauer Department of Applied Mathematics Illinois Institute of Technology Spring 2011 fasshauer@iit.edu MATH

More information

Fault Tolerance. Dealing with Faults

Fault Tolerance. Dealing with Faults Fault Tolerance Real-time computing systems must be fault-tolerant: they must be able to continue operating despite the failure of a limited subset of their hardware or software. They must also allow graceful

More information

CS 598: Communication Cost Analysis of Algorithms Lecture 9: The Ideal Cache Model and the Discrete Fourier Transform

CS 598: Communication Cost Analysis of Algorithms Lecture 9: The Ideal Cache Model and the Discrete Fourier Transform CS 598: Communication Cost Analysis of Algorithms Lecture 9: The Ideal Cache Model and the Discrete Fourier Transform Edgar Solomonik University of Illinois at Urbana-Champaign September 21, 2016 Fast

More information

Communication-avoiding parallel and sequential QR factorizations

Communication-avoiding parallel and sequential QR factorizations Communication-avoiding parallel and sequential QR factorizations James Demmel, Laura Grigori, Mark Hoemmen, and Julien Langou May 30, 2008 Abstract We present parallel and sequential dense QR factorization

More information

Sparse BLAS-3 Reduction

Sparse BLAS-3 Reduction Sparse BLAS-3 Reduction to Banded Upper Triangular (Spar3Bnd) Gary Howell, HPC/OIT NC State University gary howell@ncsu.edu Sparse BLAS-3 Reduction p.1/27 Acknowledgements James Demmel, Gene Golub, Franc

More information

Reliability of Technical Systems

Reliability of Technical Systems Main Topics 1. Introduction, Key Terms, Framing the Problem 2. Reliability Parameters: Failure Rate, Failure Probability, etc. 3. Some Important Reliability Distributions 4. Component Reliability 5. Software

More information

Tradeoff between Reliability and Power Management

Tradeoff between Reliability and Power Management Tradeoff between Reliability and Power Management 9/1/2005 FORGE Lee, Kyoungwoo Contents 1. Overview of relationship between reliability and power management 2. Dakai Zhu, Rami Melhem and Daniel Moss e,

More information

FAULT TOLERANT SYSTEMS

FAULT TOLERANT SYSTEMS ﻋﻨﻮان درس ﻧﺎم اﺳﺘﺎد 1394-95 ﻧﺎم درس ﻧﺎم رﺷﺘﻪ ﻧﺎم ﮔﺮاﯾﺶ ﻧﺎم ﻣﻮﻟﻒ ﻧﺎم ﮐﺎرﺷﻨﺎس درس FAULT TOLERANT SYSTEMS Part 8 RAID Systems Chapter 3 Information Redundancy RAID - Redundant Arrays of Inexpensive (Independent

More information

lecture 2 and 3: algorithms for linear algebra

lecture 2 and 3: algorithms for linear algebra lecture 2 and 3: algorithms for linear algebra STAT 545: Introduction to computational statistics Vinayak Rao Department of Statistics, Purdue University August 27, 2018 Solving a system of linear equations

More information

Hybrid static/dynamic scheduling for already optimized dense matrix factorization. Joint Laboratory for Petascale Computing, INRIA-UIUC

Hybrid static/dynamic scheduling for already optimized dense matrix factorization. Joint Laboratory for Petascale Computing, INRIA-UIUC Hybrid static/dynamic scheduling for already optimized dense matrix factorization Simplice Donfack, Laura Grigori, INRIA, France Bill Gropp, Vivek Kale UIUC, USA Joint Laboratory for Petascale Computing,

More information

Communication-avoiding LU and QR factorizations for multicore architectures

Communication-avoiding LU and QR factorizations for multicore architectures Communication-avoiding LU and QR factorizations for multicore architectures DONFACK Simplice INRIA Saclay Joint work with Laura Grigori INRIA Saclay Alok Kumar Gupta BCCS,Norway-5075 16th April 2010 Communication-avoiding

More information

Introduction to communication avoiding algorithms for direct methods of factorization in Linear Algebra

Introduction to communication avoiding algorithms for direct methods of factorization in Linear Algebra Introduction to communication avoiding algorithms for direct methods of factorization in Linear Algebra Laura Grigori Abstract Modern, massively parallel computers play a fundamental role in a large and

More information

Radiation Effects on Electronics. Dr. Brock J. LaMeres Associate Professor Electrical & Computer Engineering Montana State University

Radiation Effects on Electronics. Dr. Brock J. LaMeres Associate Professor Electrical & Computer Engineering Montana State University Dr. Brock J. LaMeres Associate Professor Electrical & Computer Engineering Montana State University Research Statement Support the Computing Needs of Space Exploration & Science Computation Power Efficiency

More information

Lecture 6. Numerical methods. Approximation of functions

Lecture 6. Numerical methods. Approximation of functions Lecture 6 Numerical methods Approximation of functions Lecture 6 OUTLINE 1. Approximation and interpolation 2. Least-square method basis functions design matrix residual weighted least squares normal equation

More information

LINEAR ALGEBRA 1, 2012-I PARTIAL EXAM 3 SOLUTIONS TO PRACTICE PROBLEMS

LINEAR ALGEBRA 1, 2012-I PARTIAL EXAM 3 SOLUTIONS TO PRACTICE PROBLEMS LINEAR ALGEBRA, -I PARTIAL EXAM SOLUTIONS TO PRACTICE PROBLEMS Problem (a) For each of the two matrices below, (i) determine whether it is diagonalizable, (ii) determine whether it is orthogonally diagonalizable,

More information

LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12,

LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12, LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12, 2000 74 6 Summary Here we summarize the most important information about theoretical and numerical linear algebra. MORALS OF THE STORY: I. Theoretically

More information

A Backward/Forward Recovery Approach for the Preconditioned Conjugate Gradient Method

A Backward/Forward Recovery Approach for the Preconditioned Conjugate Gradient Method A Backward/Forward Recovery Approach for the Preconditioned Conjugate Gradient Method Fasi, Massimiliano and Langou, Julien and Robert, Yves and Uçar, Bora 2015 MIMS EPrint: 2015.113 Manchester Institute

More information

Communication-avoiding parallel and sequential QR factorizations

Communication-avoiding parallel and sequential QR factorizations Communication-avoiding parallel and sequential QR factorizations James Demmel Laura Grigori Mark Frederick Hoemmen Julien Langou Electrical Engineering and Computer Sciences University of California at

More information

Lecture 11. Linear systems: Cholesky method. Eigensystems: Terminology. Jacobi transformations QR transformation

Lecture 11. Linear systems: Cholesky method. Eigensystems: Terminology. Jacobi transformations QR transformation Lecture Cholesky method QR decomposition Terminology Linear systems: Eigensystems: Jacobi transformations QR transformation Cholesky method: For a symmetric positive definite matrix, one can do an LU decomposition

More information

What is a quantum computer? Quantum Architecture. Quantum Mechanics. Quantum Superposition. Quantum Entanglement. What is a Quantum Computer (contd.

What is a quantum computer? Quantum Architecture. Quantum Mechanics. Quantum Superposition. Quantum Entanglement. What is a Quantum Computer (contd. What is a quantum computer? Quantum Architecture by Murat Birben A quantum computer is a device designed to take advantage of distincly quantum phenomena in carrying out a computational task. A quantum

More information

Parallel Numerical Algorithms

Parallel Numerical Algorithms Parallel Numerical Algorithms Chapter 5 Eigenvalue Problems Section 5.1 Michael T. Heath and Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 512 Michael

More information

lecture 3 and 4: algorithms for linear algebra

lecture 3 and 4: algorithms for linear algebra lecture 3 and 4: algorithms for linear algebra STAT 545: Introduction to computational statistics Vinayak Rao Department of Statistics, Purdue University August 30, 2016 Solving a system of linear equations

More information

Scalability of Partial Differential Equations Preconditioner Resilient to Soft and Hard Faults

Scalability of Partial Differential Equations Preconditioner Resilient to Soft and Hard Faults Scalability of Partial Differential Equations Preconditioner Resilient to Soft and Hard Faults K.Morris, F.Rizzi, K.Sargsyan, K.Dahlgren, P.Mycek, C.Safta, O.LeMaitre, O.Knio, B.Debusschere Sandia National

More information

EECS150 - Digital Design Lecture 26 Faults and Error Correction. Recap

EECS150 - Digital Design Lecture 26 Faults and Error Correction. Recap EECS150 - Digital Design Lecture 26 Faults and Error Correction Nov. 26, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy of Prof.

More information

Improvements for Implicit Linear Equation Solvers

Improvements for Implicit Linear Equation Solvers Improvements for Implicit Linear Equation Solvers Roger Grimes, Bob Lucas, Clement Weisbecker Livermore Software Technology Corporation Abstract Solving large sparse linear systems of equations is often

More information

Tile QR Factorization with Parallel Panel Processing for Multicore Architectures

Tile QR Factorization with Parallel Panel Processing for Multicore Architectures Tile QR Factorization with Parallel Panel Processing for Multicore Architectures Bilel Hadri, Hatem Ltaief, Emmanuel Agullo, Jack Dongarra Department of Electrical Engineering and Computer Science, University

More information

DRAM Reliability: Parity, ECC, Chipkill, Scrubbing. Alpha Particle or Cosmic Ray. electron-hole pairs. silicon. DRAM Memory System: Lecture 13

DRAM Reliability: Parity, ECC, Chipkill, Scrubbing. Alpha Particle or Cosmic Ray. electron-hole pairs. silicon. DRAM Memory System: Lecture 13 slide 1 DRAM Reliability: Parity, ECC, Chipkill, Scrubbing Alpha Particle or Cosmic Ray electron-hole pairs silicon Alpha Particles: Radioactive impurity in package material slide 2 - Soft errors were

More information

Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors

Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1 1 Deparment of Computer

More information

Robust Network Codes for Unicast Connections: A Case Study

Robust Network Codes for Unicast Connections: A Case Study Robust Network Codes for Unicast Connections: A Case Study Salim Y. El Rouayheb, Alex Sprintson, and Costas Georghiades Department of Electrical and Computer Engineering Texas A&M University College Station,

More information

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Fault Tolerant Computing ECE 655

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Fault Tolerant Computing ECE 655 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Fault Tolerant Computing ECE 655 Part 1 Introduction C. M. Krishna Fall 2006 ECE655/Krishna Part.1.1 Prerequisites Basic courses in

More information

Terminology and Concepts

Terminology and Concepts Terminology and Concepts Prof. Naga Kandasamy 1 Goals of Fault Tolerance Dependability is an umbrella term encompassing the concepts of reliability, availability, performability, safety, and testability.

More information

In this section again we shall assume that the matrix A is m m, real and symmetric.

In this section again we shall assume that the matrix A is m m, real and symmetric. 84 3. The QR algorithm without shifts See Chapter 28 of the textbook In this section again we shall assume that the matrix A is m m, real and symmetric. 3.1. Simultaneous Iterations algorithm Suppose we

More information

2.5D algorithms for distributed-memory computing

2.5D algorithms for distributed-memory computing ntroduction for distributed-memory computing C Berkeley July, 2012 1/ 62 ntroduction Outline ntroduction Strong scaling 2.5D factorization 2/ 62 ntroduction Strong scaling Solving science problems faster

More information

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009 Parallel Preconditioning of Linear Systems based on ILUPACK for Multithreaded Architectures J.I. Aliaga M. Bollhöfer 2 A.F. Martín E.S. Quintana-Ortí Deparment of Computer Science and Engineering, Univ.

More information

Tall and Skinny QR Matrix Factorization Using Tile Algorithms on Multicore Architectures LAPACK Working Note - 222

Tall and Skinny QR Matrix Factorization Using Tile Algorithms on Multicore Architectures LAPACK Working Note - 222 Tall and Skinny QR Matrix Factorization Using Tile Algorithms on Multicore Architectures LAPACK Working Note - 222 Bilel Hadri 1, Hatem Ltaief 1, Emmanuel Agullo 1, and Jack Dongarra 1,2,3 1 Department

More information

Section 4.5 Eigenvalues of Symmetric Tridiagonal Matrices

Section 4.5 Eigenvalues of Symmetric Tridiagonal Matrices Section 4.5 Eigenvalues of Symmetric Tridiagonal Matrices Key Terms Symmetric matrix Tridiagonal matrix Orthogonal matrix QR-factorization Rotation matrices (plane rotations) Eigenvalues We will now complete

More information

Work in Progress: Reachability Analysis for Time-triggered Hybrid Systems, The Platoon Benchmark

Work in Progress: Reachability Analysis for Time-triggered Hybrid Systems, The Platoon Benchmark Work in Progress: Reachability Analysis for Time-triggered Hybrid Systems, The Platoon Benchmark François Bidet LIX, École polytechnique, CNRS Université Paris-Saclay 91128 Palaiseau, France francois.bidet@polytechnique.edu

More information

A FUZZY LOGIC BASED MULTI-SENSOR NAVIGATION SYSTEM FOR AN UNMANNED SURFACE VEHICLE. T Xu, J Chudley and R Sutton

A FUZZY LOGIC BASED MULTI-SENSOR NAVIGATION SYSTEM FOR AN UNMANNED SURFACE VEHICLE. T Xu, J Chudley and R Sutton A FUZZY LOGIC BASED MULTI-SENSOR NAVIGATION SYSTEM FOR AN UNMANNED SURFACE VEHICLE T Xu, J Chudley and R Sutton Marine and Industrial Dynamic Analysis Research Group School of Engineering The University

More information

This can be accomplished by left matrix multiplication as follows: I

This can be accomplished by left matrix multiplication as follows: I 1 Numerical Linear Algebra 11 The LU Factorization Recall from linear algebra that Gaussian elimination is a method for solving linear systems of the form Ax = b, where A R m n and bran(a) In this method

More information

Important Matrix Factorizations

Important Matrix Factorizations LU Factorization Choleski Factorization The QR Factorization LU Factorization: Gaussian Elimination Matrices Gaussian elimination transforms vectors of the form a α, b where a R k, 0 α R, and b R n k 1,

More information

On aggressive early deflation in parallel variants of the QR algorithm

On aggressive early deflation in parallel variants of the QR algorithm On aggressive early deflation in parallel variants of the QR algorithm Bo Kågström 1, Daniel Kressner 2, and Meiyue Shao 1 1 Department of Computing Science and HPC2N Umeå University, S-901 87 Umeå, Sweden

More information

Coding for loss tolerant systems

Coding for loss tolerant systems Coding for loss tolerant systems Workshop APRETAF, 22 janvier 2009 Mathieu Cunche, Vincent Roca INRIA, équipe Planète INRIA Rhône-Alpes Mathieu Cunche, Vincent Roca The erasure channel Erasure codes Reed-Solomon

More information

LU factorization with Panel Rank Revealing Pivoting and its Communication Avoiding version

LU factorization with Panel Rank Revealing Pivoting and its Communication Avoiding version 1 LU factorization with Panel Rank Revealing Pivoting and its Communication Avoiding version Amal Khabou Advisor: Laura Grigori Université Paris Sud 11, INRIA Saclay France SIAMPP12 February 17, 2012 2

More information

Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs

Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs Théo Mary, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack Dongarra presenter 1 Low-Rank

More information

EECS150 - Digital Design Lecture 26 Error Correction Codes, Linear Feedback Shift Registers (LFSRs)

EECS150 - Digital Design Lecture 26 Error Correction Codes, Linear Feedback Shift Registers (LFSRs) EECS150 - igital esign Lecture 26 Error Correction Codes, Linear Feedback Shift Registers (LFSRs) Nov 21, 2002 John Wawrzynek Fall 2002 EECS150 Lec26-ECC Page 1 Outline Error detection using parity Hamming

More information

An Efficient Algorithm for Computing the Reliability of Consecutive-k-out-of-n:F Systems

An Efficient Algorithm for Computing the Reliability of Consecutive-k-out-of-n:F Systems 1 An Efficient Algorithm for Computing the Reliability of Consecutive-k-out-of-n:F Systems Thomas Cluzeau, Jörg Keller, Member, IEEE Computer Society, and Winfrid Schneeweiss, Senior Life Member, IEEE

More information

Lecture 3: QR-Factorization

Lecture 3: QR-Factorization Lecture 3: QR-Factorization This lecture introduces the Gram Schmidt orthonormalization process and the associated QR-factorization of matrices It also outlines some applications of this factorization

More information

Theoretical Computer Science

Theoretical Computer Science Theoretical Computer Science 412 (2011) 1484 1491 Contents lists available at ScienceDirect Theoretical Computer Science journal homepage: wwwelseviercom/locate/tcs Parallel QR processing of Generalized

More information

Fraction-free Row Reduction of Matrices of Skew Polynomials

Fraction-free Row Reduction of Matrices of Skew Polynomials Fraction-free Row Reduction of Matrices of Skew Polynomials Bernhard Beckermann Laboratoire d Analyse Numérique et d Optimisation Université des Sciences et Technologies de Lille France bbecker@ano.univ-lille1.fr

More information

Enhancing Performance of Tall-Skinny QR Factorization using FPGAs

Enhancing Performance of Tall-Skinny QR Factorization using FPGAs Enhancing Performance of Tall-Skinny QR Factorization using FPGAs Abid Rafique Imperial College London August 31, 212 Enhancing Performance of Tall-Skinny QR Factorization using FPGAs 1/18 Our Claim Common

More information

ECC for NAND Flash. Osso Vahabzadeh. TexasLDPC Inc. Flash Memory Summit 2017 Santa Clara, CA 1

ECC for NAND Flash. Osso Vahabzadeh. TexasLDPC Inc. Flash Memory Summit 2017 Santa Clara, CA 1 ECC for NAND Flash Osso Vahabzadeh TexasLDPC Inc. 1 Overview Why Is Error Correction Needed in Flash Memories? Error Correction Codes Fundamentals Low-Density Parity-Check (LDPC) Codes LDPC Encoding and

More information

Matrix Factorization and Analysis

Matrix Factorization and Analysis Chapter 7 Matrix Factorization and Analysis Matrix factorizations are an important part of the practice and analysis of signal processing. They are at the heart of many signal-processing algorithms. Their

More information

30.3. LU Decomposition. Introduction. Prerequisites. Learning Outcomes

30.3. LU Decomposition. Introduction. Prerequisites. Learning Outcomes LU Decomposition 30.3 Introduction In this Section we consider another direct method for obtaining the solution of systems of equations in the form AX B. Prerequisites Before starting this Section you

More information

Practice Exam. 2x 1 + 4x 2 + 2x 3 = 4 x 1 + 2x 2 + 3x 3 = 1 2x 1 + 3x 2 + 4x 3 = 5

Practice Exam. 2x 1 + 4x 2 + 2x 3 = 4 x 1 + 2x 2 + 3x 3 = 1 2x 1 + 3x 2 + 4x 3 = 5 Practice Exam. Solve the linear system using an augmented matrix. State whether the solution is unique, there are no solutions or whether there are infinitely many solutions. If the solution is unique,

More information

Reduction of Detected Acceptable Faults for Yield Improvement via Error-Tolerance

Reduction of Detected Acceptable Faults for Yield Improvement via Error-Tolerance Reduction of Detected Acceptable Faults for Yield Improvement via Error-Tolerance Tong-Yu Hsieh and Kuen-Jong Lee Department of Electrical Engineering National Cheng Kung University Tainan, Taiwan 70101

More information

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2 1 / 23 Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption Langshi CHEN 1,2,3 Supervised by Serge PETITON 2 Maison de la Simulation Lille 1 University CNRS March 18, 2013

More information

Part IB Numerical Analysis

Part IB Numerical Analysis Part IB Numerical Analysis Definitions Based on lectures by G. Moore Notes taken by Dexter Chua Lent 206 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after

More information

Matrix Computations: Direct Methods II. May 5, 2014 Lecture 11

Matrix Computations: Direct Methods II. May 5, 2014 Lecture 11 Matrix Computations: Direct Methods II May 5, 2014 ecture Summary You have seen an example of how a typical matrix operation (an important one) can be reduced to using lower level BS routines that would

More information

Direct and Incomplete Cholesky Factorizations with Static Supernodes

Direct and Incomplete Cholesky Factorizations with Static Supernodes Direct and Incomplete Cholesky Factorizations with Static Supernodes AMSC 661 Term Project Report Yuancheng Luo 2010-05-14 Introduction Incomplete factorizations of sparse symmetric positive definite (SSPD)

More information

A 3 3 DILATION COUNTEREXAMPLE

A 3 3 DILATION COUNTEREXAMPLE A 3 3 DILATION COUNTEREXAMPLE MAN DUEN CHOI AND KENNETH R DAVIDSON Dedicated to the memory of William B Arveson Abstract We define four 3 3 commuting contractions which do not dilate to commuting isometries

More information

Krylov Subspace Methods that Are Based on the Minimization of the Residual

Krylov Subspace Methods that Are Based on the Minimization of the Residual Chapter 5 Krylov Subspace Methods that Are Based on the Minimization of the Residual Remark 51 Goal he goal of these methods consists in determining x k x 0 +K k r 0,A such that the corresponding Euclidean

More information

Reliability Assessment Electric Utility Mapping. Maged Yackoub Eva Szatmari Veridian Connections Toronto, October 2015

Reliability Assessment Electric Utility Mapping. Maged Yackoub Eva Szatmari Veridian Connections Toronto, October 2015 Reliability Assessment Electric Utility Mapping Maged Yackoub Eva Szatmari Veridian Connections Toronto, October 2015 Agenda Introduction About Veridian Connections Veridian s GIS platform Reliability

More information

Contents. Preface... xi. Introduction...

Contents. Preface... xi. Introduction... Contents Preface... xi Introduction... xv Chapter 1. Computer Architectures... 1 1.1. Different types of parallelism... 1 1.1.1. Overlap, concurrency and parallelism... 1 1.1.2. Temporal and spatial parallelism

More information

Performance and Scalability. Lars Karlsson

Performance and Scalability. Lars Karlsson Performance and Scalability Lars Karlsson Outline Complexity analysis Runtime, speedup, efficiency Amdahl s Law and scalability Cost and overhead Cost optimality Iso-efficiency function Case study: matrix

More information

Redundant Array of Independent Disks

Redundant Array of Independent Disks Redundant Array of Independent Disks Yashwant K. Malaiya 1 Redundant Array of Independent Disks (RAID) Enables greater levels of performance and/or reliability How? By concurrent use of two or more hard

More information

A short note on the Householder QR factorization

A short note on the Householder QR factorization A short note on the Householder QR factorization Alfredo Buttari October 25, 2016 1 Uses of the QR factorization This document focuses on the QR factorization of a dense matrix. This method decomposes

More information

QR Decomposition in a Multicore Environment

QR Decomposition in a Multicore Environment QR Decomposition in a Multicore Environment Omar Ahsan University of Maryland-College Park Advised by Professor Howard Elman College Park, MD oha@cs.umd.edu ABSTRACT In this study we examine performance

More information

[Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty.]

[Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty.] Math 43 Review Notes [Disclaimer: This is not a complete list of everything you need to know, just some of the topics that gave people difficulty Dot Product If v (v, v, v 3 and w (w, w, w 3, then the

More information

CALU: A Communication Optimal LU Factorization Algorithm

CALU: A Communication Optimal LU Factorization Algorithm CALU: A Communication Optimal LU Factorization Algorithm James Demmel Laura Grigori Hua Xiang Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-010-9

More information

A DISTRIBUTED-MEMORY RANDOMIZED STRUCTURED MULTIFRONTAL METHOD FOR SPARSE DIRECT SOLUTIONS

A DISTRIBUTED-MEMORY RANDOMIZED STRUCTURED MULTIFRONTAL METHOD FOR SPARSE DIRECT SOLUTIONS A DISTRIBUTED-MEMORY RANDOMIZED STRUCTURED MULTIFRONTAL METHOD FOR SPARSE DIRECT SOLUTIONS ZIXING XIN, JIANLIN XIA, MAARTEN V. DE HOOP, STEPHEN CAULEY, AND VENKATARAMANAN BALAKRISHNAN Abstract. We design

More information

Do we have a quorum?

Do we have a quorum? Do we have a quorum? Quorum Systems Given a set U of servers, U = n: A quorum system is a set Q 2 U such that Q 1, Q 2 Q : Q 1 Q 2 Each Q in Q is a quorum How quorum systems work: A read/write shared register

More information

CRYOGENIC DRAM BASED MEMORY SYSTEM FOR SCALABLE QUANTUM COMPUTERS: A FEASIBILITY STUDY

CRYOGENIC DRAM BASED MEMORY SYSTEM FOR SCALABLE QUANTUM COMPUTERS: A FEASIBILITY STUDY CRYOGENIC DRAM BASED MEMORY SYSTEM FOR SCALABLE QUANTUM COMPUTERS: A FEASIBILITY STUDY MEMSYS-2017 SWAMIT TANNU DOUG CARMEAN MOINUDDIN QURESHI Why Quantum Computers? 2 Molecule and Material Simulations

More information

A Vector Space Justification of Householder Orthogonalization

A Vector Space Justification of Householder Orthogonalization A Vector Space Justification of Householder Orthogonalization Ronald Christensen Professor of Statistics Department of Mathematics and Statistics University of New Mexico August 28, 2015 Abstract We demonstrate

More information

Integer Least Squares: Sphere Decoding and the LLL Algorithm

Integer Least Squares: Sphere Decoding and the LLL Algorithm Integer Least Squares: Sphere Decoding and the LLL Algorithm Sanzheng Qiao Department of Computing and Software McMaster University 28 Main St. West Hamilton Ontario L8S 4L7 Canada. ABSTRACT This paper

More information

Error Detection, Correction and Erasure Codes for Implementation in a Cluster File-system

Error Detection, Correction and Erasure Codes for Implementation in a Cluster File-system Error Detection, Correction and Erasure Codes for Implementation in a Cluster File-system Steve Baker December 6, 2011 Abstract. The evaluation of various error detection and correction algorithms and

More information

Assignment #9: Orthogonal Projections, Gram-Schmidt, and Least Squares. Name:

Assignment #9: Orthogonal Projections, Gram-Schmidt, and Least Squares. Name: Assignment 9: Orthogonal Projections, Gram-Schmidt, and Least Squares Due date: Friday, April 0, 08 (:pm) Name: Section Number Assignment 9: Orthogonal Projections, Gram-Schmidt, and Least Squares Due

More information

Solving linear equations with Gaussian Elimination (I)

Solving linear equations with Gaussian Elimination (I) Term Projects Solving linear equations with Gaussian Elimination The QR Algorithm for Symmetric Eigenvalue Problem The QR Algorithm for The SVD Quasi-Newton Methods Solving linear equations with Gaussian

More information

Switch Fabrics. Switching Technology S P. Raatikainen Switching Technology / 2004.

Switch Fabrics. Switching Technology S P. Raatikainen Switching Technology / 2004. Switch Fabrics Switching Technology S38.65 http://www.netlab.hut.fi/opetus/s3865 L4 - Switch fabrics Basic concepts Time and space switching Two stage switches Three stage switches Cost criteria Multi-stage

More information

AMS 209, Fall 2015 Final Project Type A Numerical Linear Algebra: Gaussian Elimination with Pivoting for Solving Linear Systems

AMS 209, Fall 2015 Final Project Type A Numerical Linear Algebra: Gaussian Elimination with Pivoting for Solving Linear Systems AMS 209, Fall 205 Final Project Type A Numerical Linear Algebra: Gaussian Elimination with Pivoting for Solving Linear Systems. Overview We are interested in solving a well-defined linear system given

More information

Orthonormal Bases; Gram-Schmidt Process; QR-Decomposition

Orthonormal Bases; Gram-Schmidt Process; QR-Decomposition Orthonormal Bases; Gram-Schmidt Process; QR-Decomposition MATH 322, Linear Algebra I J. Robert Buchanan Department of Mathematics Spring 205 Motivation When working with an inner product space, the most

More information

Shortest paths with negative lengths

Shortest paths with negative lengths Chapter 8 Shortest paths with negative lengths In this chapter we give a linear-space, nearly linear-time algorithm that, given a directed planar graph G with real positive and negative lengths, but no

More information

Linear Algebra, part 3 QR and SVD

Linear Algebra, part 3 QR and SVD Linear Algebra, part 3 QR and SVD Anna-Karin Tornberg Mathematical Models, Analysis and Simulation Fall semester, 2012 Going back to least squares (Section 1.4 from Strang, now also see section 5.2). We

More information

5. Communication resources

5. Communication resources 5. Communication resources Classical channel Quantum channel Entanglement How does the state evolve under LOCC? Properties of maximally entangled states Bell basis Quantum dense coding Quantum teleportation

More information

Matrices, Vector Spaces, and Information Retrieval

Matrices, Vector Spaces, and Information Retrieval Matrices, Vector Spaces, and Information Authors: M. W. Berry and Z. Drmac and E. R. Jessup SIAM 1999: Society for Industrial and Applied Mathematics Speaker: Mattia Parigiani 1 Introduction Large volumes

More information

5.6. PSEUDOINVERSES 101. A H w.

5.6. PSEUDOINVERSES 101. A H w. 5.6. PSEUDOINVERSES 0 Corollary 5.6.4. If A is a matrix such that A H A is invertible, then the least-squares solution to Av = w is v = A H A ) A H w. The matrix A H A ) A H is the left inverse of A and

More information

Towards optimal synchronous counting

Towards optimal synchronous counting Towards optimal synchronous counting Christoph Lenzen Joel Rybicki Jukka Suomela MPI for Informatics MPI for Informatics Aalto University Aalto University PODC 5 July 3 Focus on fault-tolerance Fault-tolerant

More information

Safety and Reliability of Embedded Systems

Safety and Reliability of Embedded Systems (Sicherheit und Zuverlässigkeit eingebetteter Systeme) Fault Tree Analysis Mathematical Background and Algorithms Prof. Dr. Liggesmeyer, 0 Content Definitions of Terms Introduction to Combinatorics General

More information

A New Space for Comparing Graphs

A New Space for Comparing Graphs A New Space for Comparing Graphs Anshumali Shrivastava and Ping Li Cornell University and Rutgers University August 18th 2014 Anshumali Shrivastava and Ping Li ASONAM 2014 August 18th 2014 1 / 38 Main

More information

A DISTRIBUTED-MEMORY RANDOMIZED STRUCTURED MULTIFRONTAL METHOD FOR SPARSE DIRECT SOLUTIONS

A DISTRIBUTED-MEMORY RANDOMIZED STRUCTURED MULTIFRONTAL METHOD FOR SPARSE DIRECT SOLUTIONS SIAM J. SCI. COMPUT. Vol. 39, No. 4, pp. C292 C318 c 2017 Society for Industrial and Applied Mathematics A DISTRIBUTED-MEMORY RANDOMIZED STRUCTURED MULTIFRONTAL METHOD FOR SPARSE DIRECT SOLUTIONS ZIXING

More information

PART II : Least-Squares Approximation

PART II : Least-Squares Approximation PART II : Least-Squares Approximation Basic theory Let U be an inner product space. Let V be a subspace of U. For any g U, we look for a least-squares approximation of g in the subspace V min f V f g 2,

More information

Preliminaries and Complexity Theory

Preliminaries and Complexity Theory Preliminaries and Complexity Theory Oleksandr Romanko CAS 746 - Advanced Topics in Combinatorial Optimization McMaster University, January 16, 2006 Introduction Book structure: 2 Part I Linear Algebra

More information

Arnoldi Methods in SLEPc

Arnoldi Methods in SLEPc Scalable Library for Eigenvalue Problem Computations SLEPc Technical Report STR-4 Available at http://slepc.upv.es Arnoldi Methods in SLEPc V. Hernández J. E. Román A. Tomás V. Vidal Last update: October,

More information