Model Order Reduction via Matlab Parallel Computing Toolbox. Istanbul Technical University

1 Model Order Reduction via Matlab Parallel Computing Toolbox E. Fatih Yetkin & Hasan Dağ Istanbul Technical University Computational Science & Engineering Department September 21, 2009 E. Fatih Yetkin (Istanbul Technical Univ.) Terschelling, 2009 September 21, / 40

2
1 Parallel Computation
   Why We Need Parallelism in MOR?
   What is Parallelism?
   Parallel Architectures
2 Tools of Parallelization
   Programming Models
   Parallel Matlab
3 Parallel Version of Rational Krylov Methods
   Rational Krylov Methods
   $H_2$ optimality and Rational Krylov methods
   An Example System
   Parallelization of the Algorithm
   Results
4 Conclusions

3 Why We Need Parallelism in MOR?
Computational Complexity: Model reduction methods aim to build a model that is easy to handle. However, for some methods, such as balanced truncation or rational Krylov, the reduction process takes a long time for dense problems.
Computational Complexity of Rational Krylov Methods: The decomposition of $(A - \sigma_i E)$ costs $O(N^3)$ at each of the $k$ interpolation points. Therefore, parallelism is a necessity, especially for dense problems.

4 What is Parallelism?
Sequential Programming: A single CPU (core) is available. The problem is composed of a series of commands. Each command is executed one after another.

5 What is Parallelism?
Parallel Programming: In the simplest sense, parallel computing is the simultaneous use of multiple computing resources (multiple CPUs or cores) to solve a computational problem. The problem is broken into discrete parts that can be solved concurrently. Each part is executed on a different CPU simultaneously.

6 Parallel Architectures
Shared Memory: Shared memory machines generally have in common the ability for all processors to access all memory as a global address space. Multiple processors can operate independently but share the same memory resources. Shared memory machines can be divided into two main classes based on memory access times: UMA and NUMA.

7 Parallel Architectures
UMA vs. NUMA: In the Uniform Memory Access (UMA) architecture, identical processors have equal access times to memory; such a machine is also called a Symmetric Multiprocessor (SMP). Non-Uniform Memory Access (NUMA) machines are often made by physically linking two or more SMPs, and not all processors have equal access time to all memories.

8 Parallel Architectures
Distributed Memory: Processors have their own local memory. Memory addresses in one processor do not map to another processor, so there is no concept of a global address space across all processors. When a processor needs access to data in another processor, it is usually the task of the programmer to explicitly define how and when the data is communicated. Synchronization between tasks is likewise the programmer's responsibility.

9 Parallel Architectures
Hybrid Memory: The largest and fastest computers in the world today employ both shared and distributed memory architectures. The shared memory component is usually a cache-coherent SMP machine. Processors on a given SMP can address that machine's memory as global. Network communications are required to move data from one SMP to another.

10 Parallel Programming Models: Threads
POSIX Threads & OpenMP: In the threads model of parallel programming, a single process can have multiple, concurrent execution paths. Threads can come and go, but the main program (a.out) remains present to provide the necessary shared resources until the application has completed. Unrelated standardization efforts have resulted in two very different implementations of threads: POSIX Threads and OpenMP.

11 Parallel Programming Models: Message Passing Interface
MPI: A set of tasks use their own local memory during computation. Multiple tasks can reside on the same physical machine as well as across an arbitrary number of machines. Tasks exchange data by sending and receiving messages.
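
As a rough illustration of the message-passing model (not MPI itself), the Python sketch below runs two tasks that each work on their own data and exchange only explicit messages. The names `task`, `inbox`, `outbox`, and `results` are hypothetical, and standard-library threads with queues merely stand in for MPI send and receive calls.

```python
import queue
import threading


def task(local_data, outbox, inbox, results, rank):
    """Sum private data, send the partial sum to the peer, receive the peer's."""
    partial = sum(local_data)      # computation on this task's own data only
    outbox.put(partial)            # explicit "send"
    peer_partial = inbox.get()     # explicit "receive" (blocks until data arrives)
    results[rank] = partial + peer_partial


# One queue per direction plays the role of the communication channel.
q01, q10 = queue.Queue(), queue.Queue()
results = [None, None]             # used only to collect the output for printing

t0 = threading.Thread(target=task, args=([1, 2, 3], q01, q10, results, 0))
t1 = threading.Thread(target=task, args=([4, 5, 6], q10, q01, results, 1))
t0.start()
t1.start()
t0.join()
t1.join()

print(results)  # both tasks hold the global sum after the exchange
```

Neither task ever reads the other's list directly; the only coupling is the pair of messages, which is the property the slide describes.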

12 Matlab Distributed Computing Toolbox
Distributed or Parallel: In Matlab terminology, parallel jobs run on internal workers such as cores, and distributed jobs run on cluster nodes.

13 Basics of Parallel Computing Toolbox
parfor: In Matlab you can use parfor to make a parallel loop. Message passing and other low-level communication issues are handled by Matlab itself.
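
The idea of a parallel loop can be sketched outside Matlab as well. The Python analogue below is an illustration only: `body`, `sequential`, and `parallel` are hypothetical names, and a standard-library thread pool stands in for Matlab workers. In CPython this shows the structure of a parallel loop rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def body(i):
    """Loop body: depends only on its own index i."""
    return i * i


n = 8

# Sequential loop.
sequential = [body(i) for i in range(n)]

# Parallel loop: the executor hands iterations to workers, and map()
# returns the results in index order, so no communication code is written.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(body, range(n)))

print(parallel == sequential)
```

As with parfor, the scheduling and result collection are left to the runtime; the programmer only supplies an iteration body that does not depend on other iterations.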

14 Basics of Parallel Computing Toolbox
When can we use parfor?

15 Basics of Parallel Computing Toolbox
When can we not use parfor?
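
A common obstacle to a parallel loop is a dependence between iterations. The Python sketch below is a generic illustration under that assumption, not the example shown on the slide; the function names are hypothetical. It contrasts a loop whose iterations are independent with a recurrence whose result changes when the iteration order changes.

```python
def independent(n):
    """Each y[i] depends only on i: iterations can run in any order."""
    y = [0] * n
    for i in range(n):
        y[i] = 2 * i
    return y


def recurrence(n):
    """Each x[i] needs x[i - 1]: iterations must run in sequence."""
    x = [1] * n
    for i in range(1, n):
        x[i] = x[i - 1] + i
    return x


def recurrence_in_order(n, order):
    """Run the same recurrence body in a caller-supplied iteration order."""
    x = [1] * n
    for i in order:
        x[i] = x[i - 1] + i
    return x


print(recurrence(5))                         # in-order result
print(recurrence_in_order(5, [4, 3, 2, 1]))  # reversed order, different result
```

Because a parallel loop gives no guarantee about the order in which iterations run, only the first kind of loop is a safe candidate.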

16 Basics of Parallel Computing Toolbox
single program multiple data (spmd): In Matlab you can use spmd blocks to run the same program on different data sets.

17 Basics of Parallel Computing Toolbox
single program multiple data (spmd): The master processor has the right to access all workers' data.
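
The spmd pattern can be sketched in Python as follows. This is an analogue under stated assumptions, not Matlab code: `spmd_body`, `worker_id`, and `num_workers` are hypothetical names, a thread pool stands in for the workers, and the cyclic split of the data is just one possible distribution.

```python
from concurrent.futures import ThreadPoolExecutor


def spmd_body(worker_id, num_workers, data):
    """Same program on every worker; the worker id selects its share of the data."""
    my_part = data[worker_id::num_workers]   # cyclic distribution of the entries
    return sum(my_part)


data = list(range(1, 11))   # the numbers 1..10
num_workers = 3

with ThreadPoolExecutor(max_workers=num_workers) as pool:
    futures = [pool.submit(spmd_body, w, num_workers, data)
               for w in range(num_workers)]
    # The "master" can read every worker's result.
    per_worker = [f.result() for f in futures]

print(per_worker, sum(per_worker))
```

Every worker executes the same function; only the index differs, and the master gathers the per-worker values at the end.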

18 Basics of Parallel Computing Toolbox
distributed arrays: It is possible to distribute any array to the workers.

19 Basics of Parallel Computing Toolbox
distributed arrays

20 Matrix transposing
MPI-Fortran vs. Matlab DCT

21 Rational Krylov Methods
If $D$ is selected as zero, the system triple can be written as $\Sigma = (A, B, C)$ for
$$\dot{x} = Ax + Bu, \qquad y = C^T x + Du.$$
Two matrices $V \in \mathbb{R}^{n \times k}$ and $W \in \mathbb{R}^{n \times k}$ can be defined, where $W^T V = I_k$ and $k \ll n$. With these two matrices the reduced-order system can be found as
$$\hat{A} = W^T A V, \qquad \hat{B} = W^T B, \qquad \hat{C} = C^T V. \tag{1}$$

22 Rational Krylov Method
There are many ways to build the projection matrices. One way is to use rational Krylov subspace bases. Assume that $k$ distinct points in the complex plane are selected for interpolation. Then the interpolation matrices $V$ and $\hat{W}$ can be built as
$$V = \left[ (s_1 I - A)^{-1} B \;\cdots\; (s_k I - A)^{-1} B \right], \qquad \hat{W} = \left[ (s_1 I - A^T)^{-1} C \;\cdots\; (s_k I - A^T)^{-1} C \right]. \tag{2}$$

23 Rational Krylov Projectors
Assuming that $\det(\hat{W}^T V) \neq 0$, the projected reduced system can be built as
$$\hat{A} = W^T A V, \qquad \hat{B} = W^T B, \qquad \hat{C} = C^T V, \tag{3}$$
where $W = \hat{W} (V^T \hat{W})^{-1}$ ensures $W^T V = I_k$. The basic problem is to find a strategy for selecting the interpolation points. In the worst case, the interpolation points can be selected randomly from the operating frequency range of the system.
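
The projection can be checked on a tiny example. The Python sketch below is an illustration under several assumptions: $A$ is diagonal (so each shifted solve is an elementwise division), the data and shifts are made up, $C$ is stored as a column so that $y = C^T x$ and $\hat{C} = C^T V$, and exact rational arithmetic is used so the checks hold without rounding. The helper names are hypothetical.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def inv2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (p, q), (r, t) = M
    det = p * t - q * r
    return [[t / det, -q / det], [-r / det, p / det]]


# Illustrative data (not from the slides).
a = [F(-1), F(-2), F(-3)]                       # diagonal of A
A = [[a[i] if i == j else F(0) for j in range(3)] for i in range(3)]
B = [[F(1)], [F(1)], [F(1)]]                    # n x 1 input matrix
C = [[F(1)], [F(2)], [F(3)]]                    # n x 1, output y = C^T x
shifts = [F(1), F(2)]                           # interpolation points s_1, s_2

# Columns (s_i I - A)^{-1} B and (s_i I - A^T)^{-1} C.
V = transpose([[B[j][0] / (s - a[j]) for j in range(3)] for s in shifts])
W_hat = transpose([[C[j][0] / (s - a[j]) for j in range(3)] for s in shifts])

# W = W_hat (V^T W_hat)^{-1}, so that W^T V = I_k.
W = matmul(W_hat, inv2(matmul(transpose(V), W_hat)))
Wt = transpose(W)

A_r = matmul(matmul(Wt, A), V)
B_r = matmul(Wt, B)
C_r = matmul(transpose(C), V)


def G(s):
    """Full transfer function C^T (sI - A)^{-1} B for diagonal A."""
    return sum(C[j][0] * B[j][0] / (s - a[j]) for j in range(3))


def G_r(s):
    """Reduced transfer function C_r (sI - A_r)^{-1} B_r."""
    shifted = [[(s if i == j else 0) - A_r[i][j] for j in range(2)]
               for i in range(2)]
    return matmul(matmul(C_r, inv2(shifted)), B_r)[0][0]


print(matmul(Wt, V))                      # identity matrix, by construction
print([G(s) == G_r(s) for s in shifts])   # interpolation at the chosen points
```

The second check reflects the reason for building $V$ from shifted solves: the reduced transfer function matches the original one at each interpolation point.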

24 Rational Krylov Projectors

25 $H_2$ norm of a system
Selecting the interpolation points randomly is not optimal. Several methods can be used to improve on it. In this work we use the iterative rational Krylov approach to obtain an $H_2$-norm-optimal reduced model. The $H_2$ norm of a system is defined as
$$\|G\|_2 := \left[ \int_{-\infty}^{+\infty} |G(j\omega)|^2 \, d\omega \right]^{1/2}. \tag{4}$$
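
As a small worked example of definition (4), consider the first-order system $G(s) = 1/(s+1)$, chosen here purely for illustration:

```latex
\[
  |G(j\omega)|^2 = \frac{1}{1+\omega^2},
  \qquad
  \int_{-\infty}^{+\infty} \frac{d\omega}{1+\omega^2}
  = \Big[\arctan\omega\Big]_{-\infty}^{+\infty} = \pi,
  \qquad
  \|G\|_2 = \sqrt{\pi}.
\]
```

Many texts include a factor $1/(2\pi)$ in front of the integral; with that normalization the same system has norm $1/\sqrt{2}$. The minimizer in an $H_2$-optimal reduction problem is the same under either convention, since the two norms differ only by a constant factor.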

26 $H_2$ optimality
The reduced-order system $G_r(s)$ is $H_2$ optimal if it minimizes the $H_2$ norm of the error:
$$G_r(s) = \arg\min_{\deg(\hat{G}) = r} \|G(s) - \hat{G}(s)\|_{H_2}. \tag{5}$$
Two important theorems for obtaining an $H_2$-optimal reduced model were given by Meier (1967) and Grimme (1997). Antoulas et al. combined these two results into the Iterative Rational Krylov Algorithm (IRKA) to obtain an $H_2$-optimal reduced-order model.

27 Iterative Rational Krylov Algorithm (IRKA)
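
The core loop of an IRKA-style iteration can be sketched for the smallest possible case. The Python code below is an order-1 illustration under strong assumptions (diagonal $A$, a single real shift, made-up data) and follows the commonly published form of the iteration, in which each new shift is the mirror image of the current reduced pole. It is not a transcription of the algorithm listing on the slide, and the function name `irka_order1` is hypothetical.

```python
def irka_order1(a, b, c, sigma=1.0, tol=1e-12, max_iter=100):
    """Order-1 IRKA-style fixed-point iteration for diagonal A = diag(a).

    Each pass builds v = (sigma I - A)^{-1} b and w_hat = (sigma I - A^T)^{-1} c,
    forms the scalar reduced model, and moves the shift to the mirror image
    of the reduced pole: sigma <- -A_r.
    """
    for _ in range(max_iter):
        v = [bi / (sigma - ai) for ai, bi in zip(a, b)]
        w_hat = [ci / (sigma - ai) for ai, ci in zip(a, c)]
        scale = sum(wi * vi for wi, vi in zip(w_hat, v))   # w_hat^T v
        w = [wi / scale for wi in w_hat]                   # now w^T v = 1
        A_r = sum(wi * ai * vi for wi, ai, vi in zip(w, a, v))
        B_r = sum(wi * bi for wi, bi in zip(w, b))
        C_r = sum(ci * vi for ci, vi in zip(c, v))
        new_sigma = -A_r
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return sigma, A_r, B_r, C_r


# Illustrative two-state system with poles at -1 and -2.
a = [-1.0, -2.0]
b = [1.0, 1.0]
c = [1.0, 1.0]

sigma, A_r, B_r, C_r = irka_order1(a, b, c)
G_full = sum(ci * bi / (sigma - ai) for ai, bi, ci in zip(a, b, c))
G_red = C_r * B_r / (sigma - A_r)
print(sigma, abs(G_full - G_red))
```

For a reduced order $k > 1$ the same loop needs the eigenvalues of the $k \times k$ matrix $\hat{A}$ and generally complex shifts, and each pass performs $k$ shifted solves, which is where the cost discussed in the slides arises.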

28 Example RLC network
We use a ladder RLC network as a benchmark example for the numerical implementation of Alg. 1 and Alg. 2. A minimal realization of the circuit is given in Fig. 1. For this circuit the order of the system is n = 5. The system matrices of this circuit can easily be extended to larger orders.

29 Frequency plots of the reduced and original systems
N = 201 and the order of the reduced system is k = 20.

30 Computational Cost of Methods
The computational cost of the rational Krylov methods is $O(N^3)$ for dense problems. In IRKA, rational Krylov methods are used iteratively, so the computational complexity has to be multiplied by the number of iterations $r$.

31 Parallel Parts of Algorithms
Both algorithms require $k$ factorizations to compute $(s_i I - A)^{-1} B$, but these factorizations can be computed independently on different processors. The matrix-matrix and matrix-vector multiplications in the algorithms are also amenable to parallel processing.
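
The independence of the shifted solves can be sketched as follows. This Python example is an illustration, not the Matlab implementation behind the slides: the small tridiagonal matrix, the shifts, and the helper names are made up, a plain Gaussian elimination stands in for a library factorization, and a standard-library thread pool stands in for the processors.

```python
from concurrent.futures import ThreadPoolExecutor


def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def shifted_matrix(s, A):
    """Form s I - A for one interpolation point s."""
    n = len(A)
    return [[(s if i == j else 0.0) - A[i][j] for j in range(n)]
            for i in range(n)]


def shifted_solve(s, A, b):
    """Return the column (s I - A)^{-1} b for one interpolation point s."""
    return solve(shifted_matrix(s, A), b)


def residual(s, v, A, b):
    """Largest entry of |(s I - A) v - b|."""
    M = shifted_matrix(s, A)
    return max(abs(sum(M[i][j] * v[j] for j in range(len(v))) - b[i])
               for i in range(len(b)))


A = [[-2.0, 1.0, 0.0],
     [1.0, -2.0, 1.0],
     [0.0, 1.0, -2.0]]
b = [1.0, 0.0, 0.0]
shifts = [0.5, 1.0, 2.0, 4.0]

# Each shift is an independent task, so the k solves can be farmed out.
with ThreadPoolExecutor(max_workers=4) as pool:
    columns = list(pool.map(lambda s: shifted_solve(s, A, b), shifts))

print(max(residual(s, v, A, b) for s, v in zip(shifts, columns)))
```

No task needs data produced by another task; the only shared inputs are $A$ and $b$, which are read but never modified.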

32 Parallel Version of Alg. 1

33 Parallel Version of Alg. 1

34 CPU times for Rational Krylov
Table: CPU times of the parallel version of Alg. 1 for different system orders, where the reduced system order is k = 200.
Proc no.    time (n=2000)    time (n=5000)

35 CPU times for IRKA
Table: CPU times of the parallel version of Alg. 2 for different system orders, where the reduced system order is k = 200.
Proc no.    time (n=2000)    time (n=5000)

36 Speedup graph for RK
The speedup of a parallel algorithm is defined as
$$S_p = \frac{T_1}{T_p}, \tag{6}$$
where $T_1$ is the CPU time on one processor and $T_p$ is the CPU time on $p$ processors.
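
The speedup definition translates directly into code. In the Python sketch below the timings are hypothetical values chosen only to illustrate the formula, and `efficiency` is a standard companion metric that does not appear in the slides.

```python
def speedup(t1, tp):
    """S_p = T_1 / T_p."""
    return t1 / tp


def efficiency(t1, tp, p):
    """Speedup per processor, S_p / p (not part of the slides)."""
    return speedup(t1, tp) / p


# Hypothetical timings in seconds, not measurements from the talk.
t1, t4 = 120.0, 40.0
print(speedup(t1, t4))          # 3.0
print(efficiency(t1, t4, 4))    # 0.75
```

A speedup of 3 on 4 processors corresponds to an efficiency of 0.75; the gap to 1 is the share of time lost to overheads such as communication.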

37 Speedup graph for RK

38 Speedup graph for IRKA

39 continued
As the figures show, when the number of processors is increased, processing time decreases appreciably up to some point, after which it starts to increase. This is because communication time becomes dominant over computation time. In both algorithms, however, better speedups are obtained as the system matrices get larger.
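
The trade-off between computation and communication can be illustrated with a toy cost model. The Python sketch below assumes, purely for illustration, that the work divides perfectly among $p$ processors and that communication adds a fixed cost per additional processor; neither the model nor the numbers come from the measurements in the talk.

```python
def run_time(p, work=100.0, comm=1.0):
    """Toy model: perfectly divisible work plus a per-processor communication cost."""
    return work / p + comm * (p - 1)


def best_processor_count(max_p, work=100.0, comm=1.0):
    """Processor count in 1..max_p with the smallest modeled run time."""
    return min(range(1, max_p + 1), key=lambda p: run_time(p, work, comm))


small = best_processor_count(64, work=100.0)
large = best_processor_count(64, work=900.0)
print(small, large)

# Best achievable speedup for each problem size under this model.
print(run_time(1, 100.0) / run_time(small, 100.0))
print(run_time(1, 900.0) / run_time(large, 900.0))
```

Under this model the run time first falls and then rises as processors are added, and the larger problem both tolerates more processors and reaches a higher speedup, which matches the qualitative behavior described above.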

40 Conclusions
In this work, iterative rational Krylov based $H_2$-norm-optimal model reduction methods are parallelized. These methods require a large amount of computation, but the algorithms themselves are suitable for parallel processing. Therefore, computation time decreases as the number of processors is increased. Because the processors need to communicate, communication time dominates the overall processing time when the system order is small. For larger orders, the parallel algorithm achieves better speedup values.


More information

Fluid flow dynamical model approximation and control

Fluid flow dynamical model approximation and control Fluid flow dynamical model approximation and control... a case-study on an open cavity flow C. Poussot-Vassal & D. Sipp Journée conjointe GT Contrôle de Décollement & GT MOSAR Frequency response of an

More information

Adaptive rational Krylov subspaces for large-scale dynamical systems. V. Simoncini

Adaptive rational Krylov subspaces for large-scale dynamical systems. V. Simoncini Adaptive rational Krylov subspaces for large-scale dynamical systems V. Simoncini Dipartimento di Matematica, Università di Bologna valeria@dm.unibo.it joint work with Vladimir Druskin, Schlumberger Doll

More information

BLAS: Basic Linear Algebra Subroutines Analysis of the Matrix-Vector-Product Analysis of Matrix-Matrix Product

BLAS: Basic Linear Algebra Subroutines Analysis of the Matrix-Vector-Product Analysis of Matrix-Matrix Product Level-1 BLAS: SAXPY BLAS-Notation: S single precision (D for double, C for complex) A α scalar X vector P plus operation Y vector SAXPY: y = αx + y Vectorization of SAXPY (αx + y) by pipelining: page 8

More information

Scikit-learn. scikit. Machine learning for the small and the many Gaël Varoquaux. machine learning in Python

Scikit-learn. scikit. Machine learning for the small and the many Gaël Varoquaux. machine learning in Python Scikit-learn Machine learning for the small and the many Gaël Varoquaux scikit machine learning in Python In this meeting, I represent low performance computing Scikit-learn Machine learning for the small

More information

PRECONDITIONING IN THE PARALLEL BLOCK-JACOBI SVD ALGORITHM

PRECONDITIONING IN THE PARALLEL BLOCK-JACOBI SVD ALGORITHM Proceedings of ALGORITMY 25 pp. 22 211 PRECONDITIONING IN THE PARALLEL BLOCK-JACOBI SVD ALGORITHM GABRIEL OKŠA AND MARIÁN VAJTERŠIC Abstract. One way, how to speed up the computation of the singular value

More information

Parallelization of the QC-lib Quantum Computer Simulator Library

Parallelization of the QC-lib Quantum Computer Simulator Library Parallelization of the QC-lib Quantum Computer Simulator Library Ian Glendinning and Bernhard Ömer VCPC European Centre for Parallel Computing at Vienna Liechtensteinstraße 22, A-19 Vienna, Austria http://www.vcpc.univie.ac.at/qc/

More information

(Mathematical Operations with Arrays) Applied Linear Algebra in Geoscience Using MATLAB

(Mathematical Operations with Arrays) Applied Linear Algebra in Geoscience Using MATLAB Applied Linear Algebra in Geoscience Using MATLAB (Mathematical Operations with Arrays) Contents Getting Started Matrices Creating Arrays Linear equations Mathematical Operations with Arrays Using Script

More information

Accelerating linear algebra computations with hybrid GPU-multicore systems.

Accelerating linear algebra computations with hybrid GPU-multicore systems. Accelerating linear algebra computations with hybrid GPU-multicore systems. Marc Baboulin INRIA/Université Paris-Sud joint work with Jack Dongarra (University of Tennessee and Oak Ridge National Laboratory)

More information

Convergence Models and Surprising Results for the Asynchronous Jacobi Method

Convergence Models and Surprising Results for the Asynchronous Jacobi Method Convergence Models and Surprising Results for the Asynchronous Jacobi Method Jordi Wolfson-Pou School of Computational Science and Engineering Georgia Institute of Technology Atlanta, Georgia, United States

More information

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems TR-0-07 A Comparison of the Performance of ::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems Ang Li, Omkar Deshmukh, Radu Serban, Dan Negrut May, 0 Abstract ::GPU is a

More information

Review for the Midterm Exam

Review for the Midterm Exam Review for the Midterm Exam 1 Three Questions of the Computational Science Prelim scaled speedup network topologies work stealing 2 The in-class Spring 2012 Midterm Exam pleasingly parallel computations

More information

& - Analysis and Reduction of Large- Scale Dynamic Systems in MATLAB

& - Analysis and Reduction of Large- Scale Dynamic Systems in MATLAB & - Analysis and Reduction of Large- Scale Dynamic Systems in MATLAB Alessandro Castagnotto in collaboration with: Maria Cruz Varona, Boris Lohmann related publications: sss & sssmor: Analysis and reduction

More information

Optimization Techniques for Parallel Code 1. Parallel programming models

Optimization Techniques for Parallel Code 1. Parallel programming models Optimization Techniques for Parallel Code 1. Parallel programming models Sylvain Collange Inria Rennes Bretagne Atlantique http://www.irisa.fr/alf/collange/ sylvain.collange@inria.fr OPT - 2017 Goals of

More information

Efficient Longest Common Subsequence Computation using Bulk-Synchronous Parallelism

Efficient Longest Common Subsequence Computation using Bulk-Synchronous Parallelism Efficient Longest Common Subsequence Computation using Bulk-Synchronous Parallelism Peter Krusche Department of Computer Science University of Warwick June 2006 Outline 1 Introduction Motivation The BSP

More information

Energy-aware scheduling for GreenIT in large-scale distributed systems

Energy-aware scheduling for GreenIT in large-scale distributed systems Energy-aware scheduling for GreenIT in large-scale distributed systems 1 PASCAL BOUVRY UNIVERSITY OF LUXEMBOURG GreenIT CORE/FNR project Context and Motivation Outline 2 Problem Description Proposed Solution

More information

An iterative SVD-Krylov based method for model reduction of large-scale dynamical systems

An iterative SVD-Krylov based method for model reduction of large-scale dynamical systems An iterative SVD-Krylov based method for model reduction of large-scale dynamical systems Serkan Gugercin Department of Mathematics, Virginia Tech., Blacksburg, VA, USA, 24061-0123 gugercin@math.vt.edu

More information

Performance of the fusion code GYRO on three four generations of Crays. Mark Fahey University of Tennessee, Knoxville

Performance of the fusion code GYRO on three four generations of Crays. Mark Fahey University of Tennessee, Knoxville Performance of the fusion code GYRO on three four generations of Crays Mark Fahey mfahey@utk.edu University of Tennessee, Knoxville Contents Introduction GYRO Overview Benchmark Problem Test Platforms

More information

QuIDD-Optimised Quantum Algorithms

QuIDD-Optimised Quantum Algorithms QuIDD-Optimised Quantum Algorithms by S K University of York Computer science 3 rd year project Supervisor: Prof Susan Stepney 03/05/2004 1 Project Objectives Investigate the QuIDD optimisation techniques

More information

Massively parallel semi-lagrangian solution of the 6d Vlasov-Poisson problem

Massively parallel semi-lagrangian solution of the 6d Vlasov-Poisson problem Massively parallel semi-lagrangian solution of the 6d Vlasov-Poisson problem Katharina Kormann 1 Klaus Reuter 2 Markus Rampp 2 Eric Sonnendrücker 1 1 Max Planck Institut für Plasmaphysik 2 Max Planck Computing

More information

Applications of Mathematical Economics

Applications of Mathematical Economics Applications of Mathematical Economics Michael Curran Trinity College Dublin Overview Introduction. Data Preparation Filters. Dynamic Stochastic General Equilibrium Models: Sunspots and Blanchard-Kahn

More information

Parallelization of the QC-lib Quantum Computer Simulator Library

Parallelization of the QC-lib Quantum Computer Simulator Library Parallelization of the QC-lib Quantum Computer Simulator Library Ian Glendinning and Bernhard Ömer September 9, 23 PPAM 23 1 Ian Glendinning / September 9, 23 Outline Introduction Quantum Bits, Registers

More information

Parallel programming using MPI. Analysis and optimization. Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco

Parallel programming using MPI. Analysis and optimization. Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco Parallel programming using MPI Analysis and optimization Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco Outline l Parallel programming: Basic definitions l Choosing right algorithms: Optimal serial and

More information

An Algorithmic Framework of Large-Scale Circuit Simulation Using Exponential Integrators

An Algorithmic Framework of Large-Scale Circuit Simulation Using Exponential Integrators An Algorithmic Framework of Large-Scale Circuit Simulation Using Exponential Integrators Hao Zhuang 1, Wenjian Yu 2, Ilgweon Kang 1, Xinan Wang 1, and Chung-Kuan Cheng 1 1. University of California, San

More information

Krylov Techniques for Model Reduction of Second-Order Systems

Krylov Techniques for Model Reduction of Second-Order Systems Krylov Techniques for Model Reduction of Second-Order Systems A Vandendorpe and P Van Dooren February 4, 2004 Abstract The purpose of this paper is to present a Krylov technique for model reduction of

More information

Parallel Numerical Algorithms

Parallel Numerical Algorithms Parallel Numerical Algorithms Chapter 6 Matrix Models Section 6.2 Low Rank Approximation Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 512 Edgar

More information

Parallel Programming in C with MPI and OpenMP

Parallel Programming in C with MPI and OpenMP Parallel Programming in C with MPI and OpenMP Michael J. Quinn Chapter 13 Finite Difference Methods Outline n Ordinary and partial differential equations n Finite difference methods n Vibrating string

More information

Robust Multivariable Control

Robust Multivariable Control Lecture 2 Anders Helmersson anders.helmersson@liu.se ISY/Reglerteknik Linköpings universitet Today s topics Today s topics Norms Today s topics Norms Representation of dynamic systems Today s topics Norms

More information

Parallel Programming. Parallel algorithms Linear systems solvers

Parallel Programming. Parallel algorithms Linear systems solvers Parallel Programming Parallel algorithms Linear systems solvers Terminology System of linear equations Solve Ax = b for x Special matrices Upper triangular Lower triangular Diagonally dominant Symmetric

More information

Fast Frequency Response Analysis using Model Order Reduction

Fast Frequency Response Analysis using Model Order Reduction Fast Frequency Response Analysis using Model Order Reduction Peter Benner EU MORNET Exploratory Workshop Applications of Model Order Reduction Methods in Industrial Research and Development Luxembourg,

More information

Sparse BLAS-3 Reduction

Sparse BLAS-3 Reduction Sparse BLAS-3 Reduction to Banded Upper Triangular (Spar3Bnd) Gary Howell, HPC/OIT NC State University gary howell@ncsu.edu Sparse BLAS-3 Reduction p.1/27 Acknowledgements James Demmel, Gene Golub, Franc

More information

The EVSL package for symmetric eigenvalue problems Yousef Saad Department of Computer Science and Engineering University of Minnesota

The EVSL package for symmetric eigenvalue problems Yousef Saad Department of Computer Science and Engineering University of Minnesota The EVSL package for symmetric eigenvalue problems Yousef Saad Department of Computer Science and Engineering University of Minnesota 15th Copper Mountain Conference Mar. 28, 218 First: Joint work with

More information

Lecture 13: Sequential Circuits, FSM

Lecture 13: Sequential Circuits, FSM Lecture 13: Sequential Circuits, FSM Today s topics: Sequential circuits Finite state machines 1 Clocks A microprocessor is composed of many different circuits that are operating simultaneously if each

More information

An iterative SVD-Krylov based method for model reduction of large-scale dynamical systems

An iterative SVD-Krylov based method for model reduction of large-scale dynamical systems Available online at www.sciencedirect.com Linear Algebra and its Applications 428 (2008) 1964 1986 www.elsevier.com/locate/laa An iterative SVD-Krylov based method for model reduction of large-scale dynamical

More information

Lightweight Superscalar Task Execution in Distributed Memory

Lightweight Superscalar Task Execution in Distributed Memory Lightweight Superscalar Task Execution in Distributed Memory Asim YarKhan 1 and Jack Dongarra 1,2,3 1 Innovative Computing Lab, University of Tennessee, Knoxville, TN 2 Oak Ridge National Lab, Oak Ridge,

More information

Parallel PIPS-SBB Multi-level parallelism for 2-stage SMIPS. Lluís-Miquel Munguia, Geoffrey M. Oxberry, Deepak Rajan, Yuji Shinano

Parallel PIPS-SBB Multi-level parallelism for 2-stage SMIPS. Lluís-Miquel Munguia, Geoffrey M. Oxberry, Deepak Rajan, Yuji Shinano Parallel PIPS-SBB Multi-level parallelism for 2-stage SMIPS Lluís-Miquel Munguia, Geoffrey M. Oxberry, Deepak Rajan, Yuji Shinano ... Our contribution PIPS-PSBB*: Multi-level parallelism for Stochastic

More information

A refined Lanczos method for computing eigenvalues and eigenvectors of unsymmetric matrices

A refined Lanczos method for computing eigenvalues and eigenvectors of unsymmetric matrices A refined Lanczos method for computing eigenvalues and eigenvectors of unsymmetric matrices Jean Christophe Tremblay and Tucker Carrington Chemistry Department Queen s University 23 août 2007 We want to

More information

arxiv: v1 [hep-lat] 7 Oct 2010

arxiv: v1 [hep-lat] 7 Oct 2010 arxiv:.486v [hep-lat] 7 Oct 2 Nuno Cardoso CFTP, Instituto Superior Técnico E-mail: nunocardoso@cftp.ist.utl.pt Pedro Bicudo CFTP, Instituto Superior Técnico E-mail: bicudo@ist.utl.pt We discuss the CUDA

More information

An Integrative Model for Parallelism

An Integrative Model for Parallelism An Integrative Model for Parallelism Victor Eijkhout ICERM workshop 2012/01/09 Introduction Formal part Examples Extension to other memory models Conclusion tw-12-exascale 2012/01/09 2 Introduction tw-12-exascale

More information

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 Multi-processor vs. Multi-computer architecture µp vs. DSP RISC vs. DSP RISC Reduced-instruction-set Register-to-register operation Higher throughput by using

More information

H 2 -optimal model reduction of MIMO systems

H 2 -optimal model reduction of MIMO systems H 2 -optimal model reduction of MIMO systems P. Van Dooren K. A. Gallivan P.-A. Absil Abstract We consider the problem of approximating a p m rational transfer function Hs of high degree by another p m

More information

Heterogeneous programming for hybrid CPU-GPU systems: Lessons learned from computational chemistry

Heterogeneous programming for hybrid CPU-GPU systems: Lessons learned from computational chemistry Heterogeneous programming for hybrid CPU-GPU systems: Lessons learned from computational chemistry and Eugene DePrince Argonne National Laboratory (LCF and CNM) (Eugene moved to Georgia Tech last week)

More information

arxiv: v3 [math.na] 6 Jul 2018

arxiv: v3 [math.na] 6 Jul 2018 A Connection Between Time Domain Model Order Reduction and Moment Matching for LTI Systems arxiv:181.785v3 [math.na] 6 Jul 218 Manuela Hund a and Jens Saak a a Max Planck Institute for Dynamics of Complex

More information