FPGA Implementation of a Predictive Controller
1 FPGA Implementation of a Predictive Controller
SIAM Conference on Optimization 2011, Darmstadt, Germany
Minisymposium on embedded optimization
Juan L. Jerez, George A. Constantinides and Eric C. Kerrigan
May 18, 2011
2 Contents
- MPC Problem Formulation
- Field Programmable Gate Array (FPGA)
- Algorithms for Quadratic Programming
- Implementation Details
- Results
- Related Work
3 Optimal control problem
\[
\min_{\theta} \; x_N^T Q x_N + \sum_{k=0}^{N-1} \begin{bmatrix} x_k \\ u_k \end{bmatrix}^T \begin{bmatrix} Q & S \\ S^T & R \end{bmatrix} \begin{bmatrix} x_k \\ u_k \end{bmatrix} \tag{1}
\]
subject to
\begin{align}
x_0 &= x \tag{2a} \\
x_{k+1} &= A x_k + B u_k \quad \text{for } k = 0, 1, 2, \dots, N-1 \tag{2b} \\
J x_k + E u_k &\le d \quad \text{for } k = 0, 1, 2, \dots, N-1 \tag{2c}
\end{align}
with $x_k \in \mathbb{R}^n$ and $u_k \in \mathbb{R}^m$.
Goal: Accelerate the computation of the optimal value θ such that MPC can be implemented at faster sampling rates.
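4 Quadratic Programming Formulation
\[
\min_{\theta} \; \tfrac{1}{2} \theta^T H \theta \quad \text{subject to} \quad F\theta = f, \quad G\theta \le g
\]
where
\[
\theta := \begin{bmatrix} x_0^T & u_0^T & x_1^T & u_1^T & x_2^T & u_2^T & \cdots & x_{N-1}^T & u_{N-1}^T & x_N^T \end{bmatrix}^T \in \mathbb{R}^{N(n+m)+n},
\]
\[
H := \begin{bmatrix} I_N \otimes \begin{bmatrix} Q & S \\ S^T & R \end{bmatrix} & 0 \\ 0 & Q \end{bmatrix},
\]
\[
F := \begin{bmatrix} -I_n & & & & \\ A & B & -I_n & & \\ & & \ddots & & \\ & & A & B & -I_n \end{bmatrix}, \qquad
f := \begin{bmatrix} -x \\ 0 \\ \vdots \\ 0 \end{bmatrix},
\]
\[
G := \begin{bmatrix} I_N \otimes \begin{bmatrix} J & E \end{bmatrix} & 0 \end{bmatrix}, \qquad g := \mathbf{d} := \mathbf{1}_N \otimes d.
\]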
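As a minimal sketch of how the equality constraint $F\theta = f$ encodes (2a) and (2b), the following pure-Python code assembles $F$ and $f$ for a small system and checks that a simulated trajectory satisfies them. The helper names (`build_equality_constraints`, `simulate`, `matvec`) and the numerical values are illustrative assumptions, not taken from the slides.

```python
def build_equality_constraints(A, B, N, x_init):
    """Return (F, f) with F*theta = f encoding x_0 = x_init and
    x_{k+1} = A x_k + B u_k, for theta = [x_0, u_0, ..., x_{N-1}, u_{N-1}, x_N]."""
    n, m = len(A), len(B[0])
    cols = N * (n + m) + n
    rows = (N + 1) * n
    F = [[0.0] * cols for _ in range(rows)]
    f = [0.0] * rows
    # First block row: -I_n x_0 = -x_init
    for i in range(n):
        F[i][i] = -1.0
        f[i] = -x_init[i]
    # Block row k+1: A x_k + B u_k - x_{k+1} = 0
    for k in range(N):
        r0 = (k + 1) * n
        c0 = k * (n + m)
        for i in range(n):
            for j in range(n):
                F[r0 + i][c0 + j] = A[i][j]
            for j in range(m):
                F[r0 + i][c0 + n + j] = B[i][j]
            F[r0 + i][c0 + n + m + i] = -1.0
    return F, f


def simulate(A, B, x_init, inputs):
    """Stack the trajectory generated by the dynamics into theta."""
    theta, x = [], list(x_init)
    for u in inputs:
        theta += x + list(u)
        x = [sum(A[i][j] * x[j] for j in range(len(x)))
             + sum(B[i][j] * u[j] for j in range(len(u)))
             for i in range(len(x))]
    return theta + x


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


# Hypothetical example data: n = 2 states, m = 1 input, horizon N = 3.
A = [[1.0, 0.5], [0.0, 1.0]]
B = [[0.0], [1.0]]
x_init = [1.0, -2.0]
F, f = build_equality_constraints(A, B, 3, x_init)
theta = simulate(A, B, x_init, [[0.5], [-1.0], [0.25]])
residual = [a - b for a, b in zip(matvec(F, theta), f)]
```

Any `theta` produced by `simulate` gives a zero `residual`, which is the sense in which the sparse formulation keeps the states as decision variables rather than eliminating them.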
6 What is an FPGA?
- Reconfigurable logic blocks
- Reconfigurable interconnect
- Other reconfigurable hard blocks
  - On-chip memories
  - Embedded multipliers
Advantages for embedded real-time applications
- Deterministic execution time
- Computational/Energy efficiency
- Much reduced low volume cost compared to ASIC
Disadvantages
- Clock frequency < 350MHz
- Hardware design process
7 Is MPC suitable for FPGA computation?
- Parallelisation opportunities
  - Level 2 BLAS operations
- Deep pipelining is necessary to maintain high clock frequency
10 Is MPC suitable for FPGA computation?
- Cycle accurate completion guarantee
  - No jitter
- Compute-bound application
  - $O((n+m)^3)$ compute operations
  - $O(n+m)$ I/O operations
- Fixed-point computation is faster and uses fewer resources
11 Algorithms for Quadratic Programming
- Active-Set methods
  - Worst-case exponential complexity
  - Varying matrix structure
- Interior-Point methods
  - Polynomial complexity
  - Predictable matrix structure
  - S. Mehrotra: Solves two systems of linear equations every iteration
  - S. Wright [1]: Solves one system of linear equations
[1] Applying new optimization algorithms to model predictive control. In Proc. Int. Conf. Chemical Process Control, Jan.
12 Why iterative linear solvers?
- Small number of division operations
- Matrix vector multiplications
  - Easy to parallelise
- Trade off between computation time and accuracy
- Conserve matrix structure (no fill-in)
  - Allows exploiting fine structure to reduce memory requirements
Examples
- Conjugate Gradient (CG) for SPD matrices
- Minimum Residual (MINRES) for indefinite symmetric matrices
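To illustrate the properties listed above, here is a minimal Conjugate Gradient sketch in pure Python (a textbook formulation, not the authors' hardware design). Each iteration needs one matrix-vector product and two divisions, and the fixed iteration count is the knob that trades computation time against accuracy. The example matrix and the function names are illustrative assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, iterations):
    """Run a fixed number of CG iterations on A x = b, A symmetric positive definite."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # search direction
    rs_old = dot(r, r)
    for _ in range(iterations):
        if rs_old == 0.0:    # already converged exactly
            break
        Ap = matvec(A, p)    # the only matrix-vector product per iteration
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def residual_norm_sq(A, b, x):
    return sum((bi - axi) ** 2 for bi, axi in zip(b, matvec(A, x)))


# Hypothetical 2x2 SPD system; the exact solution is [1/11, 7/11].
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_rough = conjugate_gradient(A, b, 1)   # fewer iterations, larger residual
x_full = conjugate_gradient(A, b, 2)    # n iterations, exact up to rounding
```

The matrix `A` is only ever accessed through `matvec`, so its sparsity pattern is never altered, which is the "no fill-in" property.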
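13 Infeasible Primal-Dual Interior-Point algorithm
Initialization: $(\theta_0, \nu_0, \lambda_0, s_0)$ with $[\lambda_0^T \; s_0^T]^T > 0$
for $k = 0$ to $I_{IP} - 1$ do
  Linearization:
  \[
  \mathcal{A}_k := \begin{bmatrix} H + G^T W_k G & F^T \\ F & 0 \end{bmatrix}, \qquad
  b_k := \begin{bmatrix} -(H + G^T W_k G)\theta_k - F^T \nu_k - G^T(\lambda_k - W_k g + \sigma\mu s_k^{-1}) \\ -F\theta_k + f \end{bmatrix}
  \]
  Solve $\mathcal{A}_k z_k = b_k$ for $z_k =: \begin{bmatrix} \Delta\theta_k \\ \Delta\nu_k \end{bmatrix}$
  Compute:
  \[
  \Delta\lambda_k := W_k\big(G(\theta_k + \Delta\theta_k) - g\big) + \sigma\mu s_k^{-1}, \qquad
  \Delta s_k := -s_k - \big(G(\theta_k + \Delta\theta_k) - g\big)
  \]
  Line Search:
  \[
  \alpha_k := \max_{\alpha \in (0,1]} \alpha \; : \; \begin{bmatrix} \lambda_k + \alpha\Delta\lambda_k \\ s_k + \alpha\Delta s_k \end{bmatrix} > 0
  \]
  Update: $(\theta_{k+1}, \nu_{k+1}, \lambda_{k+1}, s_{k+1}) := (\theta_k, \nu_k, \lambda_k, s_k) + \alpha_k(\Delta\theta_k, \Delta\nu_k, \Delta\lambda_k, \Delta s_k)$
end for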
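The line search step can be sketched with a ratio test: only components with a negative search direction limit the step. Because the positivity condition is strict, the largest admissible step is not attained when it is below 1, so this sketch backs off by a safety factor `tau`. The factor, its default value and the function name are assumptions for illustration and do not appear in the slides.

```python
def line_search(lam, s, d_lam, d_s, tau=0.99):
    """Return a step alpha in (0, 1] keeping lam + alpha*d_lam > 0 and
    s + alpha*d_s > 0, assuming lam > 0 and s > 0 on entry.

    tau < 1 is a hypothetical back-off factor that keeps the iterate
    strictly inside the positive orthant.
    """
    values = list(lam) + list(s)
    steps = list(d_lam) + list(d_s)
    # Distance to the boundary along each component that is decreasing.
    ratios = [-v / dv for v, dv in zip(values, steps) if dv < 0.0]
    if not ratios:
        return 1.0                      # no component can reach zero
    return min(1.0, tau * min(ratios))


# Hypothetical iterate and search direction.
lam, s = [1.0, 2.0], [1.0, 1.0]
d_lam, d_s = [-2.0, 1.0], [0.5, -4.0]
alpha = line_search(lam, s, d_lam, d_s)
new_lam = [v + alpha * dv for v, dv in zip(lam, d_lam)]
new_s = [v + alpha * dv for v, dv in zip(s, d_s)]
```

In this example the binding component is the second slack, which would reach zero at a step of 0.25, so the returned step is 0.99 times that value.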
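14 Coefficient Matrix $\mathcal{A}_k$
After variable re-ordering:
\[
\mathcal{A}_k = \begin{bmatrix}
0 & -I & & & & & & & \\
-I & Q_0 & S & A^T & & & & & \\
 & S^T & R_0 & B^T & & & & & \\
 & A & B & 0 & \ddots & & & & \\
 & & & \ddots & \ddots & \ddots & & & \\
 & & & & \ddots & Q_{N-1} & S & A^T & \\
 & & & & & S^T & R_{N-1} & B^T & \\
 & & & & & A & B & 0 & -I \\
 & & & & & & & -I & Q_N
\end{bmatrix}
\]
- Banded
- Symmetric
- Indefinite
- Size: $Z := N(2n + m) + 2n$
- Halfband: $M := 2n + m$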
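The size and halfband expressions can be checked numerically. The sketch below builds only the sparsity pattern of the re-ordered matrix, treating the blocks $Q_k$, $R_k$, $S$, $A$ and $B$ as dense and assuming the variable ordering $(\nu_0, x_0, u_0, \nu_1, x_1, u_1, \dots, \nu_N, x_N)$, then measures the band. The function names are illustrative, and the halfband is taken to include the main diagonal.

```python
def kkt_pattern(n, m, N):
    """Return (Z, nonzeros) for the re-ordered coefficient matrix,
    where nonzeros is a set of (row, col) index pairs."""
    period = 2 * n + m                 # one (nu_k, x_k, u_k) group
    Z = N * period + 2 * n
    nz = set()

    def block(r0, c0, rows, cols):
        # Dense block and its transpose (the matrix is symmetric).
        for i in range(rows):
            for j in range(cols):
                nz.add((r0 + i, c0 + j))
                nz.add((c0 + j, r0 + i))

    def identity(r0, c0):
        for i in range(n):
            nz.add((r0 + i, c0 + i))
            nz.add((c0 + i, r0 + i))

    for k in range(N + 1):
        nu = k * period
        x = nu + n
        u = x + n
        identity(nu, x)                # -I coupling nu_k and x_k
        block(x, x, n, n)              # Q_k
        if k < N:
            block(x, u, n, m)          # S
            block(u, u, m, m)          # R_k
            block(nu + period, x, n, n)    # A in the nu_{k+1} rows
            block(nu + period, u, n, m)    # B in the nu_{k+1} rows
    return Z, nz


def halfband(nonzeros):
    """Number of diagonals from the main diagonal to the outermost nonzero one."""
    return max(abs(i - j) for i, j in nonzeros) + 1


Z, pattern = kkt_pattern(n=2, m=1, N=3)
M = halfband(pattern)
```

For `n=2, m=1, N=3` this yields `Z = 19` and `M = 5`, matching $N(2n+m)+2n$ and $2n+m$.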
16 Matrix storage
- Columns of the symmetric CDS (compressed diagonal storage) matrix are stored in separate on-chip memories
- In-band zeros and ones do not need to be stored
- Constant columns consist of repeated blocks and are constant for all problems being solved simultaneously
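A minimal software sketch of symmetric compressed diagonal storage: only the main diagonal and the $M-1$ superdiagonals are kept, one list per diagonal (the analogue of one memory per column of the CDS array), and the matrix-vector product uses each stored entry for both triangles. Unlike the scheme described above, this sketch stores every in-band entry, including zeros and ones. Function names and the example matrix are illustrative assumptions.

```python
def to_cds(A, M):
    """Extract diagonals 0..M-1 of a symmetric banded matrix A.
    band[d][i] holds A[i][i + d]."""
    Z = len(A)
    return [[A[i][i + d] for i in range(Z - d)] for d in range(M)]


def cds_symmetric_matvec(band, x):
    """Compute y = A x from the upper diagonals of a symmetric banded A."""
    y = [0.0] * len(x)
    for d, diag in enumerate(band):
        for i, a in enumerate(diag):
            y[i] += a * x[i + d]       # entry in the upper triangle
            if d > 0:
                y[i + d] += a * x[i]   # mirrored entry in the lower triangle
    return y


# Hypothetical symmetric tridiagonal example (halfband M = 2).
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
band = to_cds(A, 2)
y = cds_symmetric_matvec(band, [1.0, 2.0, 3.0])
stored = sum(len(diag) for diag in band)
```

Here five values are stored instead of nine, and each diagonal can be read independently, which is what makes one memory per diagonal attractive for a parallel multiply-accumulate datapath.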
18 Reduction in storage requirements
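19 MINRES implementation
Hardware architecture for computing $Aq_i$
[Figure: M column memories (RAMcolumn1, ..., RAMcolumnM-1, RAMcolumnM), delays $z^{-(M-1)}, z^{-(M-2)}, \dots$ applied to the input vector, $2M-1$ multipliers, and an adder tree of depth $\lceil \log_2(2M-1) \rceil$]
\[
\text{latency} = 2Z + M + k_1 \lceil \log_2(2M-1) \rceil + k_2, \qquad
\text{throughput} = Z, \qquad
\#\text{problems} = \left\lceil \frac{2Z + M + k_1 \lceil \log_2(2M-1) \rceil + k_2}{Z} \right\rceil
\]
MINRES iteration:
$q_1 = b$, $\beta_1 = \|q_1\|_2$
for $i = 1$ to $I_{MR}$ do
  $q_i = q_i / \beta_i$
  $z = A q_i$
  $\alpha = q_i^T z$
  $q_{i+1} = z - \alpha q_i - \beta_i q_{i-1}$
  $\beta_{i+1} = \|q_{i+1}\|_2$
  ...
  $\gamma_{i+1} = \delta / \rho_1$
  $\sigma_{i+1} = \beta_{i+1} / \rho_1$
  $w_i = (q_i - \rho_3 w_{i-2} - \rho_2 w_{i-1}) / \rho_1$
  $x_i = x_{i-1} + \gamma_{i+1} \eta w_i$
  $\eta = -\sigma_{i+1} \eta$
end for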
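A software sketch of the MINRES recurrence, using the same symbols, may help in reading the pseudo-code. The rotation quantities $\delta$, $\rho_1$, $\rho_2$ and $\rho_3$ are elided on the slide; the expressions used here follow a standard textbook formulation of MINRES and are an assumption, as are the initial values ($\eta = \beta_1$, $\gamma_0 = \gamma_1 = 1$, $\sigma_0 = \sigma_1 = 0$) and the function names.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(A, v):
    return [dot(row, v) for row in A]


def minres(A, b, iterations):
    """Run a fixed number of MINRES iterations on A x = b, A symmetric
    (possibly indefinite), starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    q_prev, q = [0.0] * n, list(b)
    beta = math.sqrt(dot(q, q))
    eta = beta
    gamma_prev, gamma = 1.0, 1.0       # rotation cosines
    sigma_prev, sigma = 0.0, 0.0       # rotation sines
    w_prev2, w_prev = [0.0] * n, [0.0] * n
    for _ in range(iterations):
        if beta == 0.0:                # Lanczos breakdown: x is already exact
            break
        # Lanczos step
        q = [qi / beta for qi in q]
        z = matvec(A, q)
        alpha = dot(q, z)
        q_next = [zi - alpha * qi - beta * qp
                  for zi, qi, qp in zip(z, q, q_prev)]
        beta_next = math.sqrt(dot(q_next, q_next))
        # Givens rotation quantities (assumed standard form)
        delta = gamma * alpha - gamma_prev * sigma * beta
        rho1 = math.sqrt(delta * delta + beta_next * beta_next)
        rho2 = sigma * alpha + gamma_prev * gamma * beta
        rho3 = sigma_prev * beta
        gamma_next = delta / rho1
        sigma_next = beta_next / rho1
        # Solution update
        w = [(qi - rho3 * w2 - rho2 * w1) / rho1
             for qi, w2, w1 in zip(q, w_prev2, w_prev)]
        x = [xi + gamma_next * eta * wi for xi, wi in zip(x, w)]
        eta = -sigma_next * eta
        # Shift the recurrences
        q_prev, q, beta = q, q_next, beta_next
        gamma_prev, gamma = gamma, gamma_next
        sigma_prev, sigma = sigma, sigma_next
        w_prev2, w_prev = w_prev, w
    return x


# Hypothetical symmetric indefinite system; exact solution [5/7, -3/7].
A = [[2.0, 1.0], [1.0, -3.0]]
b = [1.0, 2.0]
x1 = minres(A, b, 1)
x2 = minres(A, b, 2)
```

The sketch does not guard against $\rho_1 = 0$, which can only occur for a singular system. After one iteration `x1` is the multiple of `b` that minimises the residual norm; after two iterations the 2-by-2 system is solved up to rounding.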
20 QP solver design overview
- Maximise throughput: $\text{latency}_{IP} = 2 \times \text{latency}_{Stage2}$ (solves $2 \times \#\text{problems}$)
  - For large problems, a sequential implementation of Stage 1 is sufficient for $\text{latency}_{Stage1} < \text{latency}_{Stage2}$
- Minimise latency: $\text{latency}_{IP} = \text{latency}_{Stage1} + \text{latency}_{Stage2}$ (solves 1 problem)
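The cycle counts above can be turned into a small calculator. This is a sketch under stated assumptions: $k_1$ and $k_2$ are hardware-dependent constants whose values are not given in the slides, the number of problems that fit in the pipeline is taken to be the latency divided by $Z$ and rounded up, and all function names and example numbers are hypothetical.

```python
import math


def minres_iteration_latency(n, m, N, k1, k2):
    """Cycles for one MINRES iteration: 2Z + M + k1*ceil(log2(2M-1)) + k2."""
    Z = N * (2 * n + m) + 2 * n
    M = 2 * n + m
    return 2 * Z + M + k1 * math.ceil(math.log2(2 * M - 1)) + k2


def problems_in_pipeline(n, m, N, k1, k2):
    """Independent problems needed to keep the pipeline full (assumed ceil(latency / Z))."""
    Z = N * (2 * n + m) + 2 * n
    return math.ceil(minres_iteration_latency(n, m, N, k1, k2) / Z)


def ip_iteration_modes(latency_stage1, latency_stage2, num_problems):
    """Return (cycles, problems solved) for the two operating modes."""
    max_throughput = (2 * latency_stage2, 2 * num_problems)
    min_latency = (latency_stage1 + latency_stage2, 1)
    return max_throughput, min_latency


# Hypothetical values: n = 2, m = 1, N = 3, k1 = 1, k2 = 0.
latency = minres_iteration_latency(2, 1, 3, k1=1, k2=0)
problems = problems_in_pipeline(2, 1, 3, k1=1, k2=0)
throughput_mode, latency_mode = ip_iteration_modes(100, 400, problems)
```

With these numbers the throughput mode spends 800 cycles on 6 problems, while the latency mode spends 500 cycles on a single problem, which shows why the choice depends on whether independent problems are available to fill the pipeline.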
21 Number of free parallel channels
[Figure: number of parallel channels as a function of the number of states (n) and the number of inputs (m)]
[1] An FPGA Implementation of a Sparse Quadratic Programming Solver for Constrained Predictive Control. In Proc. ACM/SIGDA Symposium on Field Programmable Gate Arrays, Mar.
22 Performance
- Hardware: Xilinx Virtex 6 SX, 250MHz (40nm)
- Software: Intel Core2, 2.5GHz, 3GB RAM, 4MB L2 Cache (45nm)
- Test problem: 3 inputs, 3 outputs, 20 steps, state and input constraints
[Figure: time per interior point iteration (seconds) against number of states, n. Curves: CPU measured; FPGA latency (2 × #problems); FPGA throughput (2 × #problems); FPGA latency (1 problem)]
For small problems there is no performance improvement. For the largest problem, the improvement is:
- Red curve: 14x
- Black curve: 36x
- Blue curve: 85x
23 Filling the pipeline
Parallel Multiplexed MPC [1][2]
- Each thread optimizes over a subset of the m inputs assuming a fixed value for the rest.
- Effect on the size of the problem: $m \to \dfrac{m}{2 \times \#\text{problems}}$
Parallel Move Blocking MPC [3]
- The horizon N is split into blocks
- Each independent thread solves a problem with a different splitting pattern to guarantee recursive feasibility
- Effect on the size of the problem: $N \to \dfrac{N}{2 \times \#\text{problems}}$
[1] MPC for Deeply Pipelined FPGA Implementation: Algorithms and Circuitry. In IET Control Theory and Applications.
[2] Parallel MPC for Real-time FPGA-based Implementation. In Proc. IFAC World Congress, Aug.
[3] Parallel Move Blocking Model Predictive Control. Submitted to Conference on Decision and Control, Dec.
24 Filling the pipeline
Other possible strategies:
- Distributed algorithms
- Sampling faster than the computational delay
- Moving horizon estimation
25 Questions
More informationA Digit-Serial Systolic Multiplier for Finite Fields GF(2 m )
A Digit-Serial Systolic Multiplier for Finite Fields GF( m ) Chang Hoon Kim, Sang Duk Han, and Chun Pyo Hong Department of Computer and Information Engineering Taegu University 5 Naeri, Jinryang, Kyungsan,
More informationECC for NAND Flash. Osso Vahabzadeh. TexasLDPC Inc. Flash Memory Summit 2017 Santa Clara, CA 1
ECC for NAND Flash Osso Vahabzadeh TexasLDPC Inc. 1 Overview Why Is Error Correction Needed in Flash Memories? Error Correction Codes Fundamentals Low-Density Parity-Check (LDPC) Codes LDPC Encoding and
More informationEECS 579: Logic and Fault Simulation. Simulation
EECS 579: Logic and Fault Simulation Simulation: Use of computer software models to verify correctness Fault Simulation: Use of simulation for fault analysis and ATPG Circuit description Input data for
More informationHilbert Transformator IP Cores
Introduction Hilbert Transformator IP Cores Martin Kumm December 27, 28 The Hilbert Transform is an important component in communication systems, e.g. for single sideband modulation/demodulation, amplitude
More informationClassification of Hand-Written Digits Using Scattering Convolutional Network
Mid-year Progress Report Classification of Hand-Written Digits Using Scattering Convolutional Network Dongmian Zou Advisor: Professor Radu Balan Co-Advisor: Dr. Maneesh Singh (SRI) Background Overview
More informationPerformance Metrics for Computer Systems. CASS 2018 Lavanya Ramapantulu
Performance Metrics for Computer Systems CASS 2018 Lavanya Ramapantulu Eight Great Ideas in Computer Architecture Design for Moore s Law Use abstraction to simplify design Make the common case fast Performance
More informationWITH rapid growth of traditional FPGA industry, heterogeneous
INTL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2012, VOL. 58, NO. 1, PP. 15 20 Manuscript received December 31, 2011; revised March 2012. DOI: 10.2478/v10177-012-0002-x Input Variable Partitioning
More informationL15: Custom and ASIC VLSI Integration
L15: Custom and ASIC VLSI Integration Average Cost of one transistor 10 1 0.1 0.01 0.001 0.0001 0.00001 $ 0.000001 Gordon Moore, Keynote Presentation at ISSCC 2003 0.0000001 '68 '70 '72 '74 '76 '78 '80
More informationDesign Exploration of an FPGA-Based Multivariate Gaussian Random Number Generator
Design Exploration of an FPGA-Based Multivariate Gaussian Random Number Generator Chalermpol Saiprasert A thesis submitted for the degree of Doctor of Philosophy in Electrical and Electronic Engineering
More informationParallel Numerics. Scope: Revise standard numerical methods considering parallel computations!
Parallel Numerics Scope: Revise standard numerical methods considering parallel computations! Required knowledge: Numerics Parallel Programming Graphs Literature: Dongarra, Du, Sorensen, van der Vorst:
More informationcsci 210: Data Structures Program Analysis
csci 210: Data Structures Program Analysis Summary Topics commonly used functions analysis of algorithms experimental asymptotic notation asymptotic analysis big-o big-omega big-theta READING: GT textbook
More informationRuntime Model Predictive Verification on Embedded Platforms 1
Runtime Model Predictive Verification on Embedded Platforms 1 Pei Zhang, Jianwen Li, Joseph Zambreno, Phillip H. Jones, Kristin Yvonne Rozier Presenter: Pei Zhang Iowa State University peizhang@iastate.edu
More informationDELFT UNIVERSITY OF TECHNOLOGY
DELFT UNIVERSITY OF TECHNOLOGY REPORT -09 Computational and Sensitivity Aspects of Eigenvalue-Based Methods for the Large-Scale Trust-Region Subproblem Marielba Rojas, Bjørn H. Fotland, and Trond Steihaug
More information