Concurrency models and Modern Processors

1 Concurrency models and Modern Processors

2 Introduction
The classical model of concurrency is the interleaving model. It corresponds to a memory model called Sequential Consistency (SC). Modern processors do not implement SC; they implement so-called relaxed memory models, such as Total Store Order (TSO).

3 Memory models
Sequential Consistency (SC): all memory accesses are immediately visible to all processors.
Total Store Order (TSO): all write operations go through a buffer. Each processor reads the most recent value for the accessed location in its own buffer, if there is one; otherwise, it reads the value held in memory.

4 Memory models (continued)
x86-TSO
- Intended for the programmer
- Based on TSO
- Extended with the concept of a global lock
- Compatible with the tests found in processor documentation
- Does not aim at closely modeling the internal structure of processors

5 Memory models (continued 2)
Examples that can be fully executed under TSO (and x86-TSO), but not under SC

initially: x = y = 0

Processor 1                    Processor 2
store(p_1, x, 1)   (s_1)       store(p_2, y, 1)   (s_2)
load(p_1, y, 0)    (l_1)       load(p_2, x, 0)    (l_2)

Notation:
store(p, m, v):  $p : [m] \leftarrow v$
load(p, m, v):   $p : ([m] == v)$

Possible interleavings (SC, TSO), one per column:

s_1           s_1           s_1    s_2    s_1
l_1           s_2           s_2    l_2    l_1
s_2           l_1 blocks    l_1    s_1    s_2
l_2 blocks                  l_2    l_1    l_2 blocks

6 Memory models (continued 3)
Examples that can be fully executed under TSO (and x86-TSO), but not under SC

initially: x = y = 0

Processor 1                    Processor 2
store(p_1, x, 1)   (s_1)       store(p_2, y, 1)   (s_2)
load(p_1, x, 1)    (l_1)       load(p_2, y, 1)    (l_3)
load(p_1, y, 0)    (l_2)       load(p_2, x, 0)    (l_4)

Possible interleavings (SC, TSO), one per column:

s_1           s_1           s_1    s_2    s_1
l_1           s_2           s_2    l_3    l_1
l_2           l_1           l_1    s_1    l_2
s_2           l_3           l_2    l_1    s_2
l_3           l_2 blocks    l_3    l_2    l_3
l_4 blocks                  l_4    l_4    l_4 blocks

7 SC: formal definition
Order relations:
- program order ($<_p$): partial, per-processor order
- memory order ($<_m$): global
To correspond to the SC model, an execution must satisfy the following conditions:
$\forall op_i, op_j : op_i <_p op_j \Rightarrow op_i <_m op_j$
where $op_x$ represents a store or load operation.
The result of a load is the one compatible with the memory order: the value of the latest store to the same location that precedes the load in $<_m$, or the initial value if there is no such store.

8 SC (continued)
Operational definition
[Figure: processors P_1, P_2, ..., P_n each issue their loads and stores through a switch to a single-port memory.]

9 SC (continued 2)
Memory orders, first example

initially: x = y = 0

Processor 1                    Processor 2
store(p_1, x, 1)   (s_1)       store(p_2, y, 1)   (s_2)
load(p_1, y, 0)    (l_1)       load(p_2, x, 0)    (l_2)

where $s_1 <_p l_1$ and $s_2 <_p l_2$. The possible memory orders are all interleavings satisfying the conditions $s_1 <_m l_1$ and $s_2 <_m l_2$.
Example: $s_1 <_m l_1 <_m s_2 <_m l_2$
In every such order at least one load follows the other processor's store, so the two loads cannot both return 0 under SC.
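
The SC condition on this example can be checked by brute force. The following Python sketch is an illustration only: the tuple encoding of operations and the names `respects_program_order`, `load_values` and `sc_outcomes` are assumptions, not part of the slides. It enumerates every memory order that respects program order, replays it against a single memory, and collects the values the two loads would return.

```python
from itertools import permutations

# Each operation: (name, processor, kind, address, value written or None).
# This encoding is an illustrative assumption.
OPS = [
    ("s1", 1, "store", "x", 1),
    ("l1", 1, "load", "y", None),
    ("s2", 2, "store", "y", 1),
    ("l2", 2, "load", "x", None),
]
PROGRAM_ORDER = [("s1", "l1"), ("s2", "l2")]  # pairs (a, b) with a <_p b


def respects_program_order(order):
    """SC condition: a <_p b implies a <_m b."""
    pos = {op[0]: i for i, op in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in PROGRAM_ORDER)


def load_values(order):
    """Replay the memory order against one shared memory, as in SC."""
    memory = {"x": 0, "y": 0}
    seen = {}
    for name, _proc, kind, addr, value in order:
        if kind == "store":
            memory[addr] = value
        else:
            seen[name] = memory[addr]
    return seen


def sc_outcomes():
    """Set of (value read by l1, value read by l2) over all SC memory orders."""
    outcomes = set()
    for order in permutations(OPS):
        if respects_program_order(order):
            seen = load_values(order)
            outcomes.add((seen["l1"], seen["l2"]))
    return outcomes


if __name__ == "__main__":
    print(sorted(sc_outcomes()))
```

The pair (0, 0), which is what the two loads of the example expect, does not appear among the outcomes, which matches the claim that the example cannot be fully executed under SC.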

10 TSO: formal definition
To correspond to the TSO model, an execution must satisfy the following conditions:
1. $\forall l_a, l_b : l_a <_p l_b \Rightarrow l_a <_m l_b$
2. $\forall s_a, s_b : s_a <_p s_b \Rightarrow s_a <_m s_b$
3. $\forall l, s : l <_p s \Rightarrow l <_m s$
4. $val(l_a) = val(\max_{<_m} \{ s_a \mid s_a <_m l_a \lor s_a <_p l_a \})$, where $s_a$ ranges over the stores to the location read by $l_a$.
Note that stores can be delayed, but loads have access to the latest locally written value.
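
The four conditions can be exercised on the first example with a small brute-force check. This is a sketch under stated assumptions: the tuple encoding and the helper names (`before_in_program`, `satisfies_rules_1_to_3`, `load_value`, `tso_outcomes`) are hypothetical. Rules 1 to 3 filter the candidate memory orders, and rule 4 determines what each load returns.

```python
from itertools import permutations

# (name, processor, index in program, kind, address, value written or None)
# This encoding is an illustrative assumption.
OPS = [
    ("s1", 1, 0, "store", "x", 1),
    ("l1", 1, 1, "load", "y", None),
    ("s2", 2, 0, "store", "y", 1),
    ("l2", 2, 1, "load", "x", None),
]
INITIAL = {"x": 0, "y": 0}


def before_in_program(a, b):
    """a <_p b: same processor and a comes earlier in its program."""
    return a[1] == b[1] and a[2] < b[2]


def satisfies_rules_1_to_3(order):
    """Program order is kept in memory order, except from a store to a later load."""
    pos = {op[0]: i for i, op in enumerate(order)}
    for a in OPS:
        for b in OPS:
            if not before_in_program(a, b):
                continue
            relaxed = a[3] == "store" and b[3] == "load"
            if not relaxed and pos[a[0]] > pos[b[0]]:
                return False
    return True


def load_value(load, order):
    """Rule 4: latest store in <_m among those before the load in <_m or in <_p."""
    pos = {op[0]: i for i, op in enumerate(order)}
    candidates = [
        s for s in OPS
        if s[3] == "store" and s[4] == load[4]
        and (pos[s[0]] < pos[load[0]] or before_in_program(s, load))
    ]
    if not candidates:
        return INITIAL[load[4]]
    return max(candidates, key=lambda s: pos[s[0]])[5]


def tso_outcomes():
    """Set of (value read by l1, value read by l2) over all TSO memory orders."""
    outcomes = set()
    for order in permutations(OPS):
        if satisfies_rules_1_to_3(order):
            values = {op[0]: load_value(op, order) for op in OPS if op[3] == "load"}
            outcomes.add((values["l1"], values["l2"]))
    return outcomes
```

Here the pair (0, 0) is among the outcomes, for instance with the memory order l1, l2, s1, s2, because a store may be ordered after a later load of the same processor.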

11 TSO (continued)
Operational definition
[Figure: each processor P_1, P_2, ..., P_n sends its stores through its own FIFO store buffer and issues its loads; buffers and loads are connected through a switch to a single-port memory.]
Transferring a store from a buffer to main memory is called a commit.
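
The operational view can be sketched in a few lines of Python. This is a minimal model, not the slides' own formalization: the class name `TSOProcessor` and its methods are hypothetical, and the scheduling of commits is left to the caller.

```python
from collections import deque


class TSOProcessor:
    """One processor with a private FIFO store buffer (illustrative model)."""

    def __init__(self, memory):
        self.memory = memory      # shared dict: address -> value
        self.buffer = deque()     # pending (address, value) stores, oldest first

    def store(self, addr, value):
        # A store never writes memory directly; it is queued in the buffer.
        self.buffer.append((addr, value))

    def load(self, addr):
        # The newest buffered store to addr wins; otherwise read memory.
        for a, v in reversed(self.buffer):
            if a == addr:
                return v
        return self.memory[addr]

    def commit(self):
        # Transfer the oldest buffered store to main memory.
        addr, value = self.buffer.popleft()
        self.memory[addr] = value


memory = {"x": 0, "y": 0}
p1, p2 = TSOProcessor(memory), TSOProcessor(memory)
p1.store("x", 1)      # s1: buffered, not yet in memory
p2.store("y", 1)      # s2: buffered, not yet in memory
l1 = p1.load("y")     # reads memory, since s2 is still in p2's buffer
l2 = p2.load("x")     # reads memory, since s1 is still in p1's buffer
own = p1.load("x")    # p1 sees its own buffered store
p1.commit()
p2.commit()
```

With both commits delayed until after the loads, each load misses the other processor's store, which is the behavior that SC forbids.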

12 TSO (continued 2)
Memory orders, first example

initially: x = y = 0

Processor 1                    Processor 2
store(p_1, x, 1)   (s_1)       store(p_2, y, 1)   (s_2)
load(p_1, y, 0)    (l_1)       load(p_2, x, 0)    (l_2)

where $s_1 <_p l_1$ and $s_2 <_p l_2$. The compatible memory orders are the interleavings of $s_1$, $l_1$, $s_2$ and $l_2$.
Example: $l_1 <_m s_2 <_m l_2 <_m s_1$ or $l_1 <_m l_2 <_m s_1 <_m s_2$

13 TSO (continued 3)
Memory orders, second example

initially: x = y = 0

Processor 1                    Processor 2
store(p_1, x, 1)   (s_1)       store(p_2, y, 1)   (s_2)
load(p_1, x, 1)    (l_1)       load(p_2, y, 1)    (l_3)
load(p_1, y, 0)    (l_2)       load(p_2, x, 0)    (l_4)

where $s_1 <_p l_1$, $l_1 <_p l_2$, $s_2 <_p l_3$ and $l_3 <_p l_4$. The compatible memory orders are all interleavings of $s_1$, $l_1$, $l_2$, $s_2$, $l_3$ and $l_4$ satisfying $l_1 <_m l_2$ and $l_3 <_m l_4$.
Example: $l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1 <_m s_2$

21 TSO (continued 4)
Memory orders, second example - Details

Sequence of operations         Memory order
store(p_1, x, 1)   (s_1)       -
load(p_1, x, 1)    (l_1)       l_1
load(p_1, y, 0)    (l_2)       l_1 <_m l_2
store(p_2, y, 1)   (s_2)       l_1 <_m l_2
load(p_2, y, 1)    (l_3)       l_1 <_m l_2 <_m l_3
load(p_2, x, 0)    (l_4)       l_1 <_m l_2 <_m l_3 <_m l_4
(commit(s_1))                  l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1
(commit(s_2))                  l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1 <_m s_2
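
The way the memory order grows step by step can be reproduced with a short replay function. This is a sketch, assuming a hypothetical step encoding and a hypothetical function name `replay`: a load enters the memory order when it executes, while a store enters it only when it is committed.

```python
from collections import deque


def replay(trace, initial):
    """Run a list of steps and return (memory order, values read by loads).

    Steps are ("store", p, addr, value, name), ("load", p, addr, name)
    or ("commit", p). This step encoding is an illustrative assumption.
    """
    memory = dict(initial)
    buffers = {}                 # processor -> deque of (addr, value, name)
    memory_order, reads = [], {}
    for step in trace:
        kind, proc = step[0], step[1]
        buf = buffers.setdefault(proc, deque())
        if kind == "store":
            _, _, addr, value, name = step
            buf.append((addr, value, name))      # not yet in the memory order
        elif kind == "load":
            _, _, addr, name = step
            pending = [v for a, v, _ in buf if a == addr]
            reads[name] = pending[-1] if pending else memory[addr]
            memory_order.append(name)            # loads take effect at once
        else:  # commit
            addr, value, name = buf.popleft()
            memory[addr] = value
            memory_order.append(name)            # stores enter the order here
    return memory_order, reads


TRACE = [
    ("store", 1, "x", 1, "s1"),
    ("load", 1, "x", "l1"),
    ("load", 1, "y", "l2"),
    ("store", 2, "y", 1, "s2"),
    ("load", 2, "y", "l3"),
    ("load", 2, "x", "l4"),
    ("commit", 1),
    ("commit", 2),
]

order, reads = replay(TRACE, {"x": 0, "y": 0})
```

For this trace, `order` lists the four loads first and the two stores last, and each load returns the value written in the example program.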

22 x86-TSO: formal definition
The order constraints on the loads and stores are the same as those for TSO.
Extended operations:
- mfence(p): blocks processor p until its buffer is empty.
- lock(p): if the lock is not already held by another processor, p takes the lock and obtains exclusive access to the global memory; the other processors are not allowed to execute the operations commit or load.
- unlock(p): p flushes its buffer to global memory and releases the lock.
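
The effect of mfence on the first example can be checked by exploring every reachable state of the buffer model. The sketch below is an assumption-laden illustration: the function names `explore` and `results`, the instruction encoding, and the program constants `PLAIN` and `FENCED` are hypothetical, and lock and unlock are not modeled.

```python
def explore(programs, initial):
    """Return the sets of (load name, value) pairs reachable at termination.

    programs: one tuple of instructions per processor, each instruction being
    ("store", addr, value), ("load", addr, name) or ("mfence",).
    """
    n = len(programs)
    # State: program counters, store buffers, memory, recorded reads.
    start = ((0,) * n, ((),) * n, tuple(sorted(initial.items())), ())
    seen, stack, outcomes = {start}, [start], set()
    while stack:
        pcs, bufs, mem, reads = stack.pop()
        memory = dict(mem)
        successors = []
        for p in range(n):
            if bufs[p]:
                # Commit: move the oldest buffered store of p to memory.
                (addr, value), rest = bufs[p][0], bufs[p][1:]
                committed = dict(memory)
                committed[addr] = value
                successors.append((pcs, bufs[:p] + (rest,) + bufs[p + 1:],
                                   tuple(sorted(committed.items())), reads))
            if pcs[p] == len(programs[p]):
                continue
            instr = programs[p][pcs[p]]
            step = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
            if instr[0] == "store":
                grown = bufs[p] + ((instr[1], instr[2]),)
                successors.append((step, bufs[:p] + (grown,) + bufs[p + 1:], mem, reads))
            elif instr[0] == "load":
                pending = [v for a, v in bufs[p] if a == instr[1]]
                value = pending[-1] if pending else memory[instr[1]]
                successors.append((step, bufs, mem, reads + ((instr[2], value),)))
            elif instr[0] == "mfence" and not bufs[p]:
                # mfence may only proceed once p's buffer is empty.
                successors.append((step, bufs, mem, reads))
        if not successors:
            outcomes.add(frozenset(reads))
        for state in successors:
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return outcomes


def results(programs):
    """Project the outcomes onto (value read by l1, value read by l2)."""
    return {(dict(r)["l1"], dict(r)["l2"])
            for r in explore(programs, {"x": 0, "y": 0})}


PLAIN = ((("store", "x", 1), ("load", "y", "l1")),
         (("store", "y", 1), ("load", "x", "l2")))
FENCED = ((("store", "x", 1), ("mfence",), ("load", "y", "l1")),
          (("store", "y", 1), ("mfence",), ("load", "x", "l2")))
```

Without fences the outcome (0, 0) is reachable; with an mfence between each store and the following load it is not, because each store must be committed before the load of the same processor executes.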

23 x86-TSO: operational view
[Figure: processors P_1, ..., P_n, each with its loads and a FIFO store buffer for its stores, connected through a switch to a single-port memory; a global lock is added.]

24 Other memory models
- Partial Store Order (PSO)
- Relaxed Memory Order (RMO)


More information

ICS 233 Computer Architecture & Assembly Language

ICS 233 Computer Architecture & Assembly Language ICS 233 Computer Architecture & Assembly Language Assignment 6 Solution 1. Identify all of the RAW data dependencies in the following code. Which dependencies are data hazards that will be resolved by

More information

Eventual Consistency for CRDTs

Eventual Consistency for CRDTs Eventual Consistency for CRDTs Radha Jagadeesan DePaul University Chicago, USA ESOP 2018 James Riely 1/22 CRDTs? 2/22 CRDTs? C = blah blah R = mumble DT = Data Type 2/22 Data Type An abstract data type

More information

UMBC. At the system level, DFT includes boundary scan and analog test bus. The DFT techniques discussed focus on improving testability of SAFs.

UMBC. At the system level, DFT includes boundary scan and analog test bus. The DFT techniques discussed focus on improving testability of SAFs. Overview Design for testability(dft) makes it possible to: Assure the detection of all faults in a circuit. Reduce the cost and time associated with test development. Reduce the execution time of performing

More information

DES. 4. Petri Nets. Introduction. Different Classes of Petri Net. Petri net properties. Analysis of Petri net models

DES. 4. Petri Nets. Introduction. Different Classes of Petri Net. Petri net properties. Analysis of Petri net models 4. Petri Nets Introduction Different Classes of Petri Net Petri net properties Analysis of Petri net models 1 Petri Nets C.A Petri, TU Darmstadt, 1962 A mathematical and graphical modeling method. Describe

More information

Compositional System Security with Interface-Confined Adversaries

Compositional System Security with Interface-Confined Adversaries MFPS 2010 Compositional System Security with Interface-Confined Adversaries Deepak Garg, Jason Franklin, Dilsun Kaynar, Anupam Datta CyLab, Carnegie Mellon University Pittsburgh PA, USA Abstract This paper

More information

CPU SCHEDULING RONG ZHENG

CPU SCHEDULING RONG ZHENG CPU SCHEDULING RONG ZHENG OVERVIEW Why scheduling? Non-preemptive vs Preemptive policies FCFS, SJF, Round robin, multilevel queues with feedback, guaranteed scheduling 2 SHORT-TERM, MID-TERM, LONG- TERM

More information

INF Models of concurrency

INF Models of concurrency INF4140 - Models of concurrency RPC and Rendezvous INF4140 Lecture 15. Nov. 2017 RPC and Rendezvous Outline More on asynchronous message passing interacting processes with different patterns of communication

More information

Leveraging Transactional Memory for a Predictable Execution of Applications Composed of Hard Real-Time and Best-Effort Tasks

Leveraging Transactional Memory for a Predictable Execution of Applications Composed of Hard Real-Time and Best-Effort Tasks Leveraging Transactional Memory for a Predictable Execution of Applications Composed of Hard Real-Time and Best-Effort Tasks Stefan Metzlaff, Sebastian Weis, and Theo Ungerer Department of Computer Science,

More information

Combinational Logic Design Combinational Functions and Circuits

Combinational Logic Design Combinational Functions and Circuits Combinational Logic Design Combinational Functions and Circuits Overview Combinational Circuits Design Procedure Generic Example Example with don t cares: BCD-to-SevenSegment converter Binary Decoders

More information

INF 4140: Models of Concurrency Series 3

INF 4140: Models of Concurrency Series 3 Universitetet i Oslo Institutt for Informatikk PMA Olaf Owe, Martin Steffen, Toktam Ramezani INF 4140: Models of Concurrency Høst 2016 Series 3 14. 9. 2016 Topic: Semaphores (Exercises with hints for solution)

More information

Lecture 7: Sequential Networks

Lecture 7: Sequential Networks CSE 140: Components and Design Techniques for Digital Systems Lecture 7: Sequential Networks CK Cheng Dept. of Computer Science and Engineering University of California, San Diego 1 Part II: Sequential

More information

CMP N 301 Computer Architecture. Appendix C

CMP N 301 Computer Architecture. Appendix C CMP N 301 Computer Architecture Appendix C Outline Introduction Pipelining Hazards Pipelining Implementation Exception Handling Advanced Issues (Dynamic Scheduling, Out of order Issue, Superscalar, etc)

More information

This Unit: Scheduling (Static + Dynamic) CIS 501 Computer Architecture. Readings. Review Example

This Unit: Scheduling (Static + Dynamic) CIS 501 Computer Architecture. Readings. Review Example This Unit: Scheduling (Static + Dnamic) CIS 50 Computer Architecture Unit 8: Static and Dnamic Scheduling Application OS Compiler Firmware CPU I/O Memor Digital Circuits Gates & Transistors! Previousl:!

More information

I reduce synchronization costs in parallel computations by

I reduce synchronization costs in parallel computations by IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 8, NO. 4, APRIL 1997 337 Isotach Networks Paul F. Reynolds, Jr., Member, /E Computer Society Craig Williams, and Raymond R. Wagner, Jr. Abstract-We

More information

14:332:231 DIGITAL LOGIC DESIGN

14:332:231 DIGITAL LOGIC DESIGN 14:332:231 IGITL LOGI ESIGN Ivan Marsic, Rutgers University Electrical & omputer Engineering all 2013 Lecture #17: locked Synchronous -Machine nalysis locked Synchronous Sequential ircuits lso known as

More information

Unit 9. Multiplexers, Decoders, and Programmable Logic Devices. Unit 9 1

Unit 9. Multiplexers, Decoders, and Programmable Logic Devices. Unit 9 1 Unit 9 Multiplexers, ecoders, and Programmable Logic evices Unit 9 Outline Multiplexers Three state buffers ecoders Encoders Read Only Memories (ROMs) Programmable logic devices ield Programmable Gate

More information

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan,

More information

Parallel Programming in C with MPI and OpenMP

Parallel Programming in C with MPI and OpenMP Parallel Programming in C with MPI and OpenMP Michael J. Quinn Chapter 13 Finite Difference Methods Outline n Ordinary and partial differential equations n Finite difference methods n Vibrating string

More information

Robustness of Information Systems and Technologies

Robustness of Information Systems and Technologies Robustness of Information Systems and Technologies MARK BURGIN Department of Mathematics University of California, Los Angeles Los Angeles, CA 90095 USA mburgin@math.ucla.edu Abstract: - Robustness of

More information

arxiv: v2 [cs.dc] 20 Nov 2017

arxiv: v2 [cs.dc] 20 Nov 2017 The Optimal Pessimistic Transactional Memory Algorithm arxiv:1605.01361v2 [cs.dc] 20 Nov 2017 Paweł T. Wojciechowski, Konrad Siek {pawel.t.wojciechowski,konrad.siek}@cs.put.edu.pl Institute of Computing

More information

Automata-Theoretic Model Checking of Reactive Systems

Automata-Theoretic Model Checking of Reactive Systems Automata-Theoretic Model Checking of Reactive Systems Radu Iosif Verimag/CNRS (Grenoble, France) Thanks to Tom Henzinger (IST, Austria), Barbara Jobstmann (CNRS, Grenoble) and Doron Peled (Bar-Ilan University,

More information

Review: From problem to parallel algorithm

Review: From problem to parallel algorithm Review: From problem to parallel algorithm Mathematical formulations of interesting problems abound Poisson s equation Sources: Electrostatics, gravity, fluid flow, image processing (!) Numerical solution:

More information

Cuts. Cuts. Consistent cuts and consistent global states. Global states and cuts. A cut C is a subset of the global history of H

Cuts. Cuts. Consistent cuts and consistent global states. Global states and cuts. A cut C is a subset of the global history of H Cuts Cuts A cut C is a subset of the global history of H C = h c 1 1 hc 2 2...hc n n A cut C is a subset of the global history of H The frontier of C is the set of events e c 1 1,ec 2 2,...ec n n C = h

More information

Lower Bound on the Step Complexity of Anonymous Binary Consensus

Lower Bound on the Step Complexity of Anonymous Binary Consensus Lower Bound on the Step Complexity of Anonymous Binary Consensus Hagit Attiya 1, Ohad Ben-Baruch 2, and Danny Hendler 3 1 Department of Computer Science, Technion, hagit@cs.technion.ac.il 2 Department

More information

ECE 571 Advanced Microprocessor-Based Design Lecture 10

ECE 571 Advanced Microprocessor-Based Design Lecture 10 ECE 571 Advanced Microprocessor-Based Design Lecture 10 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 23 February 2017 Announcements HW#5 due HW#6 will be posted 1 Oh No, More

More information

Dartmouth Computer Science Technical Report TR Efficient Wait-Free Implementation of Multiword LL/SC Variables

Dartmouth Computer Science Technical Report TR Efficient Wait-Free Implementation of Multiword LL/SC Variables Dartmouth Computer Science Technical Report TR2004-523 Efficient Wait-Free Implementation of Multiword LL/SC Variables Prasad Jayanti and Srdjan Petrovic Department of Computer Science Dartmouth College

More information