Concurrency models and Modern Processors
- Randall Stephens
- 5 years ago
Transcription
Introduction

The classical model of concurrency is the interleaving model. It corresponds to a memory model called Sequential Consistency (SC). Modern processors do not implement SC but so-called relaxed memory models, such as Total Store Order (TSO).

Memory models

Sequential Consistency (SC): all memory accesses are immediately visible to all processors.

Total Store Order (TSO): all write operations go through a buffer. Each processor reads the most recent value for the address in its own buffer, if there is one; otherwise it reads the value held in memory.

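The TSO description above can be sketched as a small simulator. This is only an illustrative sketch: the class name `TSOMachine` and its method names are hypothetical and not taken from the slides, and processors are numbered from 0 for convenience.

```python
from collections import deque


class TSOMachine:
    """Toy TSO machine: one FIFO store buffer per processor, one shared memory."""

    def __init__(self, n_procs, initial):
        self.memory = dict(initial)
        self.buffers = [deque() for _ in range(n_procs)]

    def store(self, p, addr, value):
        # A store only enters the issuing processor's buffer.
        self.buffers[p].append((addr, value))

    def load(self, p, addr):
        # The most recent buffered value for addr wins; otherwise read memory.
        for a, v in reversed(self.buffers[p]):
            if a == addr:
                return v
        return self.memory[addr]

    def commit(self, p):
        # Move the oldest buffered store of processor p to memory.
        addr, value = self.buffers[p].popleft()
        self.memory[addr] = value


m = TSOMachine(2, {"x": 0, "y": 0})
m.store(0, "x", 1)
own_view = m.load(0, "x")      # processor 0 reads its own buffered store
other_view = m.load(1, "x")    # processor 1 still reads memory
m.commit(0)
after_commit = m.load(1, "x")  # the store is now visible to everyone
```

Under SC the buffer would not exist, so `other_view` would already observe the store.
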
Memory models (continued)

x86-TSO:
- Intended for the programmer
- Based on TSO
- Extended with the concept of a global lock
- Compatible with the tests found in processor documentation
- Does not aim to closely model the internal structure of processors

Memory models (continued 2)

Examples that can be fully executed under TSO (and x86-TSO), but not under SC.

Initially: x = y = 0

Processor 1                Processor 2
store(p1, x, 1)  (s1)      store(p2, y, 1)  (s2)
load(p1, y, 0)   (l1)      load(p2, x, 0)   (l2)

Notation: store(p, m, v) stands for p : [m] ← v, and load(p, m, v) stands for p : ([m] == v), that is, the load completes only if it reads v and blocks otherwise.

Possible interleavings:
- SC: every interleaving blocks at a load, for example s1, l1, s2, then l2 blocks; or s1, s2, then l1 blocks.
- TSO: s1, l1, s2, l2 runs to completion.

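The claim for this first example can be checked by brute force. The sketch below is an assumption-laden model, not the authors' tooling: a load "blocks" when the value it sees differs from the expected one, and the TSO run keeps every store in its buffer until all loads are done. All function names are hypothetical.

```python
from itertools import permutations

# name -> (processor, kind, address, value)
PROGRAM = {
    "s1": (1, "store", "x", 1),
    "l1": (1, "load", "y", 0),
    "s2": (2, "store", "y", 1),
    "l2": (2, "load", "x", 0),
}
PROGRAM_ORDER = [("s1", "l1"), ("s2", "l2")]


def respects_program_order(seq):
    return all(seq.index(a) < seq.index(b) for a, b in PROGRAM_ORDER)


def runs_under_sc(seq):
    """Stores hit memory at once; a load blocks on an unexpected value."""
    memory = {"x": 0, "y": 0}
    for name in seq:
        _, kind, addr, value = PROGRAM[name]
        if kind == "store":
            memory[addr] = value
        elif memory[addr] != value:
            return False  # the load blocks
    return True


def runs_under_tso(seq):
    """Stores stay buffered; all commits are delayed past the last load."""
    memory = {"x": 0, "y": 0}
    buffers = {1: [], 2: []}
    for name in seq:
        proc, kind, addr, value = PROGRAM[name]
        if kind == "store":
            buffers[proc].append((addr, value))
        else:
            buffered = [v for a, v in buffers[proc] if a == addr]
            seen = buffered[-1] if buffered else memory[addr]
            if seen != value:
                return False  # the load blocks
    return True


interleavings = [s for s in permutations(PROGRAM) if respects_program_order(s)]
sc_complete = [s for s in interleavings if runs_under_sc(s)]
tso_ok = runs_under_tso(("s1", "l1", "s2", "l2"))
```

With these definitions, `sc_complete` is empty while `tso_ok` holds, which matches the slide.
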
Memory models (continued 3)

Examples that can be fully executed under TSO (and x86-TSO), but not under SC.

Initially: x = y = 0

Processor 1                Processor 2
store(p1, x, 1)  (s1)      store(p2, y, 1)  (s2)
load(p1, x, 1)   (l1)      load(p2, y, 1)   (l3)
load(p1, y, 0)   (l2)      load(p2, x, 0)   (l4)

Possible interleavings:
- SC: every interleaving blocks at a load, for example s1, l1, l2, s2, l3, then l4 blocks; or s1, s2, l1, l3, then l2 blocks.
- TSO: s1, l1, l2, s2, l3, l4 runs to completion.

SC: formal definition

Order relations:
- program order ($<_p$): partial, per-processor order
- memory order ($<_m$): global

To correspond to the SC model, an execution must satisfy the following condition:

$\forall\, op_i, op_j : op_i <_p op_j \Rightarrow op_i <_m op_j$

where $op_x$ represents a store or load operation. The result of a load is the one compatible with the memory order.

SC (continued): operational definition

[Figure: processors P1, P2, ..., Pn each issue loads and stores through a switch to a single-port memory.]

SC (continued 2): memory orders, first example

Initially: x = y = 0

Processor 1                Processor 2
store(p1, x, 1)  (s1)      store(p2, y, 1)  (s2)
load(p1, y, 0)   (l1)      load(p2, x, 0)   (l2)

where $s_1 <_p l_1$ and $s_2 <_p l_2$. The possible memory orders are all interleavings satisfying the conditions $s_1 <_m l_1$ and $s_2 <_m l_2$.

Example: $s_1 <_m l_1 <_m s_2 <_m l_2$

TSO: formal definition

To correspond to the TSO model, an execution must satisfy the following conditions, where the subscripts a and b denote the addresses accessed:

1. $\forall\, l_a, l_b : l_a <_p l_b \Rightarrow l_a <_m l_b$
2. $\forall\, s_a, s_b : s_a <_p s_b \Rightarrow s_a <_m s_b$
3. $\forall\, l, s : l <_p s \Rightarrow l <_m s$
4. $val(l_a) = val(\max_{<_m} \{ s_a \mid s_a <_m l_a \lor s_a <_p l_a \})$

Note that stores can be delayed, but loads have access to the latest locally written value.

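Condition 4 can be read as a small procedure: collect the stores to the same address that precede the load either in memory order or in program order, then take the one that is latest in memory order. The sketch below assumes that reading; the function name `tso_load_value`, the tuple layout, and the default initial value 0 are illustrative choices, not part of the slides.

```python
def tso_load_value(load, memory_order, ops, initial=0):
    """Value read by `load` according to TSO condition 4.

    ops maps an operation name to
    (processor, kind, address, stored value, position in its program).
    """
    proc, _, addr, _, pos = ops[load]
    candidates = []
    for name in memory_order:
        p, kind, a, _, i = ops[name]
        if kind != "store" or a != addr:
            continue
        before_in_memory = memory_order.index(name) < memory_order.index(load)
        before_in_program = p == proc and i < pos
        if before_in_memory or before_in_program:
            candidates.append(name)
    if not candidates:
        return initial  # no eligible store: the initial value is read
    latest = max(candidates, key=memory_order.index)
    return ops[latest][3]


# Two processors: each stores to one variable, reads it back,
# then reads the other processor's variable.
ops = {
    "s1": (1, "store", "x", 1, 0),
    "l1": (1, "load", "x", None, 1),
    "l2": (1, "load", "y", None, 2),
    "s2": (2, "store", "y", 1, 0),
    "l3": (2, "load", "y", None, 1),
    "l4": (2, "load", "x", None, 2),
}
memory_order = ["l1", "l2", "l3", "l4", "s1", "s2"]
values = {l: tso_load_value(l, memory_order, ops) for l in ("l1", "l2", "l3", "l4")}
```

Here `l1` and `l3` read 1 through the program-order clause even though their stores come later in memory order, while `l2` and `l4` read the initial 0.
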
TSO (continued): operational definition

[Figure: processors P1, P2, ..., Pn; each processor's stores go through a FIFO store buffer, and loads and buffered stores reach a single-port memory through a switch.]

Transferring a store from a buffer to main memory is called a commit.

TSO (continued 2): memory orders, first example

Initially: x = y = 0

Processor 1                Processor 2
store(p1, x, 1)  (s1)      store(p2, y, 1)  (s2)
load(p1, y, 0)   (l1)      load(p2, x, 0)   (l2)

where $s_1 <_p l_1$ and $s_2 <_p l_2$. The compatible memory orders are the interleavings of $s_1$, $l_1$, $s_2$ and $l_2$.

Example: $l_1 <_m s_2 <_m l_2 <_m s_1$ or $l_1 <_m l_2 <_m s_1 <_m s_2$

TSO (continued 3): memory orders, second example

Initially: x = y = 0

Processor 1                Processor 2
store(p1, x, 1)  (s1)      store(p2, y, 1)  (s2)
load(p1, x, 1)   (l1)      load(p2, y, 1)   (l3)
load(p1, y, 0)   (l2)      load(p2, x, 0)   (l4)

where $s_1 <_p l_1$, $l_1 <_p l_2$, $s_2 <_p l_3$ and $l_3 <_p l_4$. The compatible memory orders are all interleavings of $s_1$, $l_1$, $l_2$, $s_2$, $l_3$ and $l_4$ satisfying $l_1 <_m l_2$ and $l_3 <_m l_4$.

Example: $l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1 <_m s_2$

TSO (continued 4): memory orders, second example, details

Sequence of operations     Memory order
store(p1, x, 1)  (s1)      -
load(p1, x, 1)   (l1)      $l_1$
load(p1, y, 0)   (l2)      $l_1 <_m l_2$
store(p2, y, 1)  (s2)      $l_1 <_m l_2$
load(p2, y, 1)   (l3)      $l_1 <_m l_2 <_m l_3$
load(p2, x, 0)   (l4)      $l_1 <_m l_2 <_m l_3 <_m l_4$
(commit(s1))               $l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1$
(commit(s2))               $l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1 <_m s_2$

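The step-by-step construction above can be replayed in a few lines. This is a sketch under the assumption that a load enters the memory order when it executes and a store enters it when it is committed; the helper names `store`, `load` and `commit` are hypothetical.

```python
memory = {"x": 0, "y": 0}
buffers = {1: [], 2: []}   # one FIFO store buffer per processor
memory_order = []          # operations in the order they reach memory


def store(name, proc, addr, value):
    # The store is only buffered; it is not yet part of the memory order.
    buffers[proc].append((name, addr, value))


def load(name, proc, addr):
    memory_order.append(name)
    # Latest locally buffered value first, then main memory.
    for _, a, v in reversed(buffers[proc]):
        if a == addr:
            return v
    return memory[addr]


def commit(proc):
    # The oldest buffered store of proc reaches memory now.
    name, addr, value = buffers[proc].pop(0)
    memory[addr] = value
    memory_order.append(name)


store("s1", 1, "x", 1)
observed = [load("l1", 1, "x"), load("l2", 1, "y")]
store("s2", 2, "y", 1)
observed += [load("l3", 2, "y"), load("l4", 2, "x")]
commit(1)
commit(2)
```

After the run, `observed` holds the four values the loads expect (1, 0, 1, 0) and `memory_order` lists the loads first, followed by the two committed stores.
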
x86-TSO: formal definition

The order constraints on loads and stores are the same as those for TSO.

Extended operations:
- mfence(p): blocks processor p until its buffer is empty.
- lock(p): if the lock is not already held by another processor, p takes the lock and obtains exclusive access to the global memory; the other processors are not allowed to execute commit or load operations.
- unlock(p): p flushes its buffer to global memory and releases the lock.

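To illustrate the effect of mfence, the sketch below reruns the two-processor program in which each processor stores 1 to its own variable and then loads the other one, with and without a fence placed right after each store. It rests on simplifying assumptions that are not in the slides: the fence is modeled as draining the buffer at once (the same memory state as waiting for the commits), and without a fence all commits are delayed past the last load. The function name `run` is hypothetical.

```python
from itertools import permutations

# name -> (processor, kind, address); every store writes the value 1
PROGRAM = {
    "s1": (1, "store", "x"), "l1": (1, "load", "y"),
    "s2": (2, "store", "y"), "l2": (2, "load", "x"),
}


def run(schedule, fenced):
    """Return the values read by (l1, l2) for one schedule."""
    memory = {"x": 0, "y": 0}
    buffers = {1: [], 2: []}
    results = {}
    for name in schedule:
        proc, kind, addr = PROGRAM[name]
        if kind == "store":
            buffers[proc].append((addr, 1))
            if fenced:
                # mfence after the store: the buffer must be empty
                # before the processor continues.
                for a, v in buffers[proc]:
                    memory[a] = v
                buffers[proc].clear()
        else:
            buffered = [v for a, v in buffers[proc] if a == addr]
            results[name] = buffered[-1] if buffered else memory[addr]
    return results["l1"], results["l2"]


schedules = [
    s for s in permutations(PROGRAM)
    if s.index("s1") < s.index("l1") and s.index("s2") < s.index("l2")
]
unfenced = {run(s, fenced=False) for s in schedules}
fenced = {run(s, fenced=True) for s in schedules}
```

In this model the outcome where both loads read 0 is reachable without fences and disappears once each store is followed by a fence.
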
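x86-TSO: operational view

[Figure: processors P1, ..., Pn; each processor's stores go through a FIFO store buffer, and loads and buffered stores reach a single-port memory through a switch; a global lock is attached to the memory system.]
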
Other memory models

- Partial Store Order (PSO)
- Relaxed Memory Order (RMO)
The Optimal Pessimistic Transactional Memory Algorithm arxiv:1605.01361v2 [cs.dc] 20 Nov 2017 Paweł T. Wojciechowski, Konrad Siek {pawel.t.wojciechowski,konrad.siek}@cs.put.edu.pl Institute of Computing
More informationAutomata-Theoretic Model Checking of Reactive Systems
Automata-Theoretic Model Checking of Reactive Systems Radu Iosif Verimag/CNRS (Grenoble, France) Thanks to Tom Henzinger (IST, Austria), Barbara Jobstmann (CNRS, Grenoble) and Doron Peled (Bar-Ilan University,
More informationReview: From problem to parallel algorithm
Review: From problem to parallel algorithm Mathematical formulations of interesting problems abound Poisson s equation Sources: Electrostatics, gravity, fluid flow, image processing (!) Numerical solution:
More informationCuts. Cuts. Consistent cuts and consistent global states. Global states and cuts. A cut C is a subset of the global history of H
Cuts Cuts A cut C is a subset of the global history of H C = h c 1 1 hc 2 2...hc n n A cut C is a subset of the global history of H The frontier of C is the set of events e c 1 1,ec 2 2,...ec n n C = h
More informationLower Bound on the Step Complexity of Anonymous Binary Consensus
Lower Bound on the Step Complexity of Anonymous Binary Consensus Hagit Attiya 1, Ohad Ben-Baruch 2, and Danny Hendler 3 1 Department of Computer Science, Technion, hagit@cs.technion.ac.il 2 Department
More informationECE 571 Advanced Microprocessor-Based Design Lecture 10
ECE 571 Advanced Microprocessor-Based Design Lecture 10 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 23 February 2017 Announcements HW#5 due HW#6 will be posted 1 Oh No, More
More informationDartmouth Computer Science Technical Report TR Efficient Wait-Free Implementation of Multiword LL/SC Variables
Dartmouth Computer Science Technical Report TR2004-523 Efficient Wait-Free Implementation of Multiword LL/SC Variables Prasad Jayanti and Srdjan Petrovic Department of Computer Science Dartmouth College
More information