Loop Scheduling and Software Pipelining \course\cpeg421-08s\topic-7.ppt 1

Size: px
Start display at page:

Download "Loop Scheduling and Software Pipelining \course\cpeg421-08s\topic-7.ppt 1"

Transcription

1 Loop Scheduling and Software Pipelining \course\cpeg421-08s\topic-7.ppt 1

2 Reading List Slides: Topic 7 and 7a Other papers as assigned in class or homework: \course\cpeg421-08s\topic-7.ppt 2

3 ABET Outcome Ability to apply knowledge of basic code generation techniques, e.g. Loop scheduling e.g. software pipelining techniques to solve code generation problems. An ability to identify, formulate and solve loops scheduling problems using software pipelining techniques Ability to analyze the basic algorithms on the above techniques and conduct experiments to show their effectiveness. Ability to use a modern compiler development platform and tools for the practice of above. A Knowledge on contemporary issues on this topic \course\cpeg421-08s\topic-7.ppt 3

4 Outline Brief overview Problem formulation of the modulo scheduling problem Solution methods Summary \course\cpeg421-08s\topic-7.ppt 4

5 General Compiler Framework Source ME Inter-Procedural Optimization (IPA) Loop Nest Optimization (LNO) Global Optimization (OPT) Innermost Loop scheduling BE/CG Global inst scheduling Reg alloc Local inst scheduling Arch Models Good IPO Good LNO Good global optimization Good integration of IPO/LNO/OPT Smooth information passing between FE and CG Complete and flexible support of inner-loop scheduling (SWP), instruction scheduling and register allocation Executable \course\cpeg421-08s\topic-7.ppt 5

6 Questions? How to formulate the loop scheduling problem? How to model it? How to solve it? \course\cpeg421-08s\topic-7.ppt 6

7 Questions (cont d) Instruction scheduling for code without loops a review Dependence graphs may become cyclic! So, critical path length is less obvious! Is it becoming harder!(?) What new insights are required to formulate and solve it? \course\cpeg421-08s\topic-7.ppt 7

8 Challenges of Loop Scheduling A DDG With Cycles Right strategy? \course\cpeg421-08s\topic-7.ppt 8

9 Observations Execution of good loops tend to be regular and repetitive - a pattern may appear This gives cyclic scheduling problem a new twist! How to efficiently derive a pattern? \course\cpeg421-08s\topic-7.ppt 9

10 Problem Formulation (I) Given a weighted dependence graph, derive a schedule which is time-optimal under a machine model M. Def: A schedule S of a loop L is time-optimal if among all legal schedules of L, no other schedule that is faster than S. Note: There may be more than one time-optimal schedule \course\cpeg421-08s\topic-7.ppt 10

11 A Short Tour on Data Dependence Graphs for Loops \course\cpeg421-08s\topic-7.ppt 11

12 Basic Concept and Motivation Data dependence between 2 accesses The same memory location Exist an execution path between them One of them is a write Three types of data dependence Dependence graphs Things are not simple when dealing with loops \course\cpeg421-08s\topic-7.ppt 12

13 Types of Data Dependence Flow dependence X = = X... δ Anti-dependence X = = X δ -1 Output dependence X = X =... 0 δ \course\cpeg421-08s\topic-7.ppt 13

14 Data Dependence Example 1: S1: A = 0 S2: B = A S3: C = A + D S4: D = 2 S1 S2 S3 S4 S x δ S y S y depends on S x \course\cpeg421-08s\topic-7.ppt 14

15 Data Dependence Con d Example 2: S1: A = 0 S2: B = A S3: A = B + 1 S4: C = A S1 S2 S3 S4 S 1 S 2 δ δ 0 S 3 : Output-dep -1 S 3 : anti-dep \course\cpeg421-08s\topic-7.ppt 15

16 Should we consider input dependence? = X = X Is the reading of the same X important? Well, it may be! (if we intend to group the 2 reads together for cache optimization!) \course\cpeg421-08s\topic-7.ppt 16

17 Subscript Variables Extension of def-use chains to employ a more precise treatment of arrays, especially in iterative loops. DO I = 1, N A(I + 1) = X(I + 1) + B(I) X(I) = A(I) * 5 ENDDO \course\cpeg421-08s\topic-7.ppt 17

18 Dependence Graph Con d Applications - register allocation - instruction scheduling - loop scheduling - vectorization - parallelization - memory hierarchy optimization \course\cpeg421-08s\topic-7.ppt 18

19 Data Dependence in Loops An Example Find the dependence relations due to the array X in the program below: (1) for I = 2 to 9 do (2) X[I] = Y[I] + Z[I] (3) A[I] = X[I-1] + 1 (4) end for Solution To find the data dependence relations in a simple loop, we can unroll the loop and see which statement instances depend on which others: I = 2 I = 3 I = 4 (2) X[2]=Y[2]+Z[2] (3) A[2]=X[1]+1 X[3] =Y[3]+Z[3] A[3] =X[2]+1 X[4]=Y[4]+Z[4] A[4]=X[3] \course\cpeg421-08s\topic-7.ppt 19

20 Data Dependence in Loops Con d In our example, there is a loop-carried, lexically forward flow dependence relation. S 2 S 3 (1) Dependence distance = 1 Data dependence graph for statements in a loop. - Loop-carried vs loop-independent - Lexical-forward vs lexical backward \course\cpeg421-08s\topic-7.ppt 20

21 An Example for i = 0 to N - 1 do a: a[i] = a[i - 1] + R[i]; b: b[i] = a[i] + c[i - 1]; c: c[i] = b[i] + 1; a b time i = 0 i = 1 i = 2 0 a end; 1 b Note: We use a token here to represent a flow dependence of distance = 1 II time 2 c a 3 b 4 c a So, iteration interval II = 2 iterations i = 3... c Assume each operation takes 1 cycle and there is only one addition unit! \course\cpeg421-08s\topic-7.ppt 21

22 Software Pipeline: Concept Software Pipeline is a technique that reduces the execution time of important loops by interweaving operations from many iterations to optimize the use of resources. stf fadds ldf cmp bg sub time \course\cpeg421-08s\topic-7.ppt 22

23 The Structrure of the SWP code % prologue a[0] = a[-1] + R[0]; % pattern for i = 0 to N-2 do b[i] = a[i] + c[i -1]; a[i + 1] = a[i] + R[i + 1]; c[i] = b[i] + 1; end; % epilogue b[n - 1] = a[n - 1] + c[n - 2]; c[n - 1] = b[n - 1] + 1 prologue b i, c i, a i+1 epilogue \course\cpeg421-08s\topic-7.ppt 23

24 Software Pipeline (Cont d) What limits the speed of a loop? Data dependencies: recurrence initiation interval (rec_mii) Processor resources: resource initiation interval (res_mii) Initiation interval stf fadds ldf cmp bg sub time \course\cpeg421-08s\topic-7.ppt 24

25 Previous Approaches Approach I (Operational): Emulate the loop execution under the machine model and a pattern will eventually occur [AikenNic88, EbciogluNic89, GaoEtAl91] Approach II (Periodic scheduling): Specify the scheduling problem into a periodical scheduling problem and find optimal solution [Lam88, RauEtAl81,GovindAltmanGao94] \course\cpeg421-08s\topic-7.ppt 25

26 Periodic Schedule (Modulo Scheduling) The time (cycle) when the I-th instance of the operation v is scheduled : t(i, v) = T * i+ A[v] where T = II so t(i + 1, v) -t(i, v) = T*(i + 1) T*(i) = T For our example: t(i, v) = 2i + A[v] where A(a) = 1 A(b) = 0 A(c) = 1 Question: Is this an optimal schedule? \course\cpeg421-08s\topic-7.ppt 26

27 Periodic Schedule (cont d) Yes, the schedule t(i, v) = 2i + A[v] is time-optimal! With II = \course\cpeg421-08s\topic-7.ppt 27

28 Restate the problem Given a DDG of a loop L, how to determine the fastest computation rate of L -- also called minimum initiation interval (MII)? Hint: Consider the Critical Cycles as well as critical resource usage in L \course\cpeg421-08s\topic-7.ppt 28

29 Recurrence MII -- RecMII An Example \course\cpeg421-08s\topic-7.ppt 29

30 An Example (Revisit) : RecMII for i = 0 to N - 1 do a: a[i] = a[i - 1] + R[i]; b: b[i] = a[i] + c[i - 1]; c: c[i] = b[i] + 1; end; a b II time time i = 0 i = 1 i = 2 0 a 1 b 2 c a 3 b 4 c a So, RecMII = 2 iterations i = 3... c Assume each operation takes 1 cycle and there is only one addition unit! \course\cpeg421-08s\topic-7.ppt 30

31 Hint: now one must think about deadlines. Maximum Computation Rate Theorem: The maximum computation rate of a loop is bounded by the following ratio r opt (C) = min{ Dc/Wc} where C is a dependence cycle, Dc is the total dependence distance along C and Wc is total execution time of C. i.e. Dc Wc = c = c di wi (d i is the dependence distance along the edge i in C) (w i is the edge weight along the edge i in C) \course\cpeg421-08s\topic-7.ppt 31

32 RecMII (Contn d) And the optimal period MII = T opt = 1/r opt Def: A cycle is critical if the period of the cycle equal to MII (We should write it as RecMII) Note: a loop may have multiple critical cycles! \course\cpeg421-08s\topic-7.ppt 32

33 An Example (Revisit) : RecMII for i = 0 to N - 1 do a: a[i] = a[i - 1] + R[i]; b: b[i] = a[i] + c[i - 1]; c: c[i] = b[i] + 1; end; a b II time time i = 0 i = 1 i = 2 0 a 1 b 2 c a 3 b 4 c a So, RecMII = 2 iterations i = 3... c Assume each operation takes 1 cycle and there is only one addition unit! \course\cpeg421-08s\topic-7.ppt 33

34 How About Machine Resource and Register Constraints? Numbers and types of FUs, etc, (ResMII) Number of registers \course\cpeg421-08s\topic-7.ppt 34

35 Software Pipelining Review of previous work [RauFisher93] [Rau94] Minimum initiation interval [MII] MII = max {RecMII, ResMII} where RecMII: determined by critical recurrence cycles in DDG ResMII: determined by resource constraints (#s and types of FUs) \course\cpeg421-08s\topic-7.ppt 35

36 Consider a simple example of n nodes and 1 FU -- what is ResMII? The Resource Constraint MII (ResMII) Can be calculated by totaling, for each resource, the usage requirement imposed by one iteration of the loop can be done by bin-packing a resource reservation table (expensive) usually derive a lower-bound is enough as a starting point to begin the search process More price with complex hardware pipelines? \course\cpeg421-08s\topic-7.ppt 36

37 Example 1: Reservation Table - An Example 1 Stage X X X X X 2 3 X (a) a Pipeline M (b) The Reservation Table of M \course\cpeg421-08s\topic-7.ppt 37

38 A Quiz? What is the RT of a fully pipelined adder with 3 stages? How about an unpipelined adder? How to computer ResMII for each case? \course\cpeg421-08s\topic-7.ppt 38

39 Modulo Reservation Table The modulo reservation table only has length = II Each entry in the table records a sequence of reservations corresponds a sequence of slots every II cycles Hints: think about your weekly calendar \course\cpeg421-08s\topic-7.ppt 39

40 Modulo Scheduling MII II = II + 1 Choose Next Operation X failed try to place x in a time slot i in II A legal schedule is found succeed Remove from L is L =<>? No \course\cpeg421-08s\topic-7.ppt 40

41 Heuristic Method for Modulo Scheduling Why a simple variant list scheduling may not work? Hint: consider the deadline constraints of operations in a cycle \course\cpeg421-08s\topic-7.ppt 41

42 Counter Example I: List Scheduling May Fail! (a) [1] A 4 C 2 B 2 D 4 (b) MII = RecMII = 4 Note: if simple list scheduling is used B cannot be scheduled due to the deadline set by scheduling C: deadlock! A B C D (c) MII = RecMII = 4 Note: we cannot fire C as early as possible! \course\cpeg421-08s\topic-7.ppt A B C D

43 Example I (Cont d) In previous figure, We show an example demonstrating the problem with greedy scheduling in the presence of recurrences. (a) The data dependence graph with a cycle. (b) The resulting partial schedule when C is scheduled greedily. B cannot be scheduled. (c) The resulting valid schedule when C is not scheduled greedyly delayed to two cycles later \course\cpeg421-08s\topic-7.ppt 43

44 Example 2: Problems with Greedy schedule due to ResMII A1 C2 A3 A4 M M Adder Mult Bus A1 M6 0 1 A4 C2 2 A Adder Mult Bus A1 M6 C2 A4 A3 M5 ResMII = 6 ResMII = 6 (a) DDG A: non-pipelined adders M: non-pipelined multipliers (b) A greedy schedule which (c) a non-greedy schedule cannot schedule A4 with which achieves ResMII ResMII \course\cpeg421-08s\topic-7.ppt 44

45 Example 2 (Cont d) In previous figure, We show an example demonstrating the problem with greedy scheduling in the presence of complex reservation tables. (a) The data dependence graph without cycle. (b) The resulting partial schedule when A1, M6, C2, and A3 are scheduled greedily. A4 cannot be scheduled. (c) The resulting valid schedule when A3 is scheduled one cycle later \course\cpeg421-08s\topic-7.ppt 45

46 Example 3: infeasibility of MII Con d A1 2 M2 3 MA3 5 Adder Mult Bus Adder Mult Bus Adder Mult Bus Note: ResMII = 2. But is there a legal schedule under II = 2? Note: The presence of complex reservation tables \course\cpeg421-08s\topic-7.ppt 46 (a)

47 Example 3 (cont d) Adder Mult Bus A1 M2 M2 Adder Mult Bus A1 M2 A1 MA3 A1 MA3 M2 MII = 2 II = 2 (b) cannot fit MA3 under II=2 MII = 2 II = 3 (c) A feasible schedule for II=3 Note: It is possible that there is no valid schedule at MII! \course\cpeg421-08s\topic-7.ppt 47

48 Infeasibility of MII Con d The previous slide shows an example demonstrating the infeasibility of the MII in the presence of complex reservation tables. (a) The three operations and their reservation tables. (b) The MRT corresponding to the dead-end partial schedule for an II of 2 after A1 and M2 have been scheduled. (c) The MRT corresponding to a valid schedule for an II of \course\cpeg421-08s\topic-7.ppt 48

49 Example 4: Infeasibility of MII Assume: fully pipelined adders A1 A2 2 2 A3 2 Adder Mult Bus 2 A3 2 [2] A4 2 2 A4 2 ResMII = 4 RecMII = 4 (a) Note: It is possible that there is no valid schedule at MII! And, the reservation table here is simple! \course\cpeg421-08s\topic-7.ppt 49 (b)

50 Example 4 (cont d) The previous slides shows an example demonstrating the infeasibility of the MII due to the interaction between the recurrence constraints and the resource usage constraints. (a) The data dependence graph. (b) The MRT corresponding to the dead-end partial schedule for an II of 4 after A1 and A2 have been scheduled \course\cpeg421-08s\topic-7.ppt 50

51 How to derive a best feasible schedule? It is possible do so via exhaustive search. But, it is expensive! \course\cpeg421-08s\topic-7.ppt 51

52 Software Pipelining A Taxonomy of Software Pipelining Operational Approach Periodic Scheduling (Modulo Scheduling) FSA Based Method (Model hardware pipeline with sharing and hazards) Heuristic (Aiken88, AikenNic88, Ebcioglu89, etc.) Formal Model (GaoWonNin91) PLDI'91 In-Exact Method (Heuristic) (RauGla81, Lam88, RauEtAI92, Huff93, DehnertTow93, Rau94, WanEis93, StouchininEtAI00) Basic Formulation (DongenGao92) CONPAR'92 Register Optimal (Ninggao91, NingGao93, Ning93) POPL'93 Exact Method ILP based Resource Constrained (GovindAITGao94) Micro27 "Showdown" (RuttenbergGao StouchininWoody96) PLDI'96 Exhaustive Search EuroPar'96 FSA Co-Scheduling formulation (GovindAltmanGao'96) FSA Construction Method (GovindAltmanGao98) FSA Heuristic/optimization (ZhangGovindRyanGao99) Theory of Co-Scheduling (GovindAltmanGao00) Resource & Register (GovindAITGao95, Altman95, PLDI'95 EichenbergerDav95) (Altman95) \course\cpeg421-08s\topic-7.ppt 52

53 Advanced Topics Consider register constraints Realistic pipeline architecture constraints (with structural hazards) Loop body with conditionals Multi-Dimensional Loops \course\cpeg421-08s\topic-7.ppt 53

Announcements PA2 due Friday Midterm is Wednesday next week, in class, one week from today

Announcements PA2 due Friday Midterm is Wednesday next week, in class, one week from today Loop Transformations Announcements PA2 due Friday Midterm is Wednesday next week, in class, one week from today Today Recall stencil computations Intro to loop transformations Data dependencies between

More information

Loop Interchange. Loop Transformations. Taxonomy. do I = 1, N do J = 1, N S 1 A(I,J) = A(I-1,J) + 1 enddo enddo. Loop unrolling.

Loop Interchange. Loop Transformations. Taxonomy. do I = 1, N do J = 1, N S 1 A(I,J) = A(I-1,J) + 1 enddo enddo. Loop unrolling. Advanced Topics Which Loops are Parallel? review Optimization for parallel machines and memory hierarchies Last Time Dependence analysis Today Loop transformations An example - McKinley, Carr, Tseng loop

More information

Fortran program + Partial data layout specifications Data Layout Assistant.. regular problems. dynamic remapping allowed Invoked only a few times Not part of the compiler Can use expensive techniques HPF

More information

Advanced Restructuring Compilers. Advanced Topics Spring 2009 Prof. Robert van Engelen

Advanced Restructuring Compilers. Advanced Topics Spring 2009 Prof. Robert van Engelen Advanced Restructuring Compilers Advanced Topics Spring 2009 Prof. Robert van Engelen Overview Data and control dependences The theory and practice of data dependence analysis K-level loop-carried dependences

More information

Saturday, April 23, Dependence Analysis

Saturday, April 23, Dependence Analysis Dependence Analysis Motivating question Can the loops on the right be run in parallel? i.e., can different processors run different iterations in parallel? What needs to be true for a loop to be parallelizable?

More information

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 Multi-processor vs. Multi-computer architecture µp vs. DSP RISC vs. DSP RISC Reduced-instruction-set Register-to-register operation Higher throughput by using

More information

DSP Design Lecture 7. Unfolding cont. & Folding. Dr. Fredrik Edman.

DSP Design Lecture 7. Unfolding cont. & Folding. Dr. Fredrik Edman. SP esign Lecture 7 Unfolding cont. & Folding r. Fredrik Edman fredrik.edman@eit.lth.se Unfolding Unfolding creates a program with more than one iteration, J=unfolding factor Unfolding is a structured way

More information

Topic-I-C Dataflow Analysis

Topic-I-C Dataflow Analysis Topic-I-C Dataflow Analysis 2012/3/2 \course\cpeg421-08s\topic4-a.ppt 1 Global Dataflow Analysis Motivation We need to know variable def and use information between basic blocks for: constant folding dead-code

More information

Dependence Analysis. Dependence Examples. Last Time: Brief introduction to interprocedural analysis. do I = 2, 100 A(I) = A(I-1) + 1 enddo

Dependence Analysis. Dependence Examples. Last Time: Brief introduction to interprocedural analysis. do I = 2, 100 A(I) = A(I-1) + 1 enddo Dependence Analysis Dependence Examples Last Time: Brief introduction to interprocedural analysis Today: Optimization for parallel machines and memory hierarchies Dependence analysis Loop transformations

More information

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2) INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder

More information

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

Worst-Case Execution Time Analysis. LS 12, TU Dortmund Worst-Case Execution Time Analysis Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 02, 03 May 2016 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 53 Most Essential Assumptions for Real-Time Systems Upper

More information

Embedded Systems Design: Optimization Challenges. Paul Pop Embedded Systems Lab (ESLAB) Linköping University, Sweden

Embedded Systems Design: Optimization Challenges. Paul Pop Embedded Systems Lab (ESLAB) Linköping University, Sweden of /4 4 Embedded Systems Design: Optimization Challenges Paul Pop Embedded Systems Lab (ESLAB) Linköping University, Sweden Outline! Embedded systems " Example area: automotive electronics " Embedded systems

More information

Pipelining. Traditional Execution. CS 365 Lecture 12 Prof. Yih Huang. add ld beq CS CS 365 2

Pipelining. Traditional Execution. CS 365 Lecture 12 Prof. Yih Huang. add ld beq CS CS 365 2 Pipelining CS 365 Lecture 12 Prof. Yih Huang CS 365 1 Traditional Execution 1 2 3 4 1 2 3 4 5 1 2 3 add ld beq CS 365 2 1 Pipelined Execution 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5

More information

Loop Parallelization Techniques and dependence analysis

Loop Parallelization Techniques and dependence analysis Loop Parallelization Techniques and dependence analysis Data-Dependence Analysis Dependence-Removing Techniques Parallelizing Transformations Performance-enchancing Techniques 1 When can we run code in

More information

Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units

Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units Anoop Bhagyanath and Klaus Schneider Embedded Systems Chair University of Kaiserslautern

More information

Energy-efficient scheduling

Energy-efficient scheduling Energy-efficient scheduling Guillaume Aupy 1, Anne Benoit 1,2, Paul Renaud-Goud 1 and Yves Robert 1,2,3 1. Ecole Normale Supérieure de Lyon, France 2. Institut Universitaire de France 3. University of

More information

CSCI-564 Advanced Computer Architecture

CSCI-564 Advanced Computer Architecture CSCI-564 Advanced Computer Architecture Lecture 8: Handling Exceptions and Interrupts / Superscalar Bo Wu Colorado School of Mines Branch Delay Slots (expose control hazard to software) Change the ISA

More information

Compiler Optimisation

Compiler Optimisation Compiler Optimisation 8 Dependence Analysis Hugh Leather IF 1.18a hleather@inf.ed.ac.uk Institute for Computing Systems Architecture School of Informatics University of Edinburgh 2018 Introduction This

More information

The Data-Dependence graph of Adjoint Codes

The Data-Dependence graph of Adjoint Codes The Data-Dependence graph of Adjoint Codes Laurent Hascoët INRIA Sophia-Antipolis November 19th, 2012 (INRIA Sophia-Antipolis) Data-Deps of Adjoints 19/11/2012 1 / 14 Data-Dependence graph of this talk

More information

Multiprocessor Scheduling I: Partitioned Scheduling. LS 12, TU Dortmund

Multiprocessor Scheduling I: Partitioned Scheduling. LS 12, TU Dortmund Multiprocessor Scheduling I: Partitioned Scheduling Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 22/23, June, 2015 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 47 Outline Introduction to Multiprocessor

More information

ENEE350 Lecture Notes-Weeks 14 and 15

ENEE350 Lecture Notes-Weeks 14 and 15 Pipelining & Amdahl s Law ENEE350 Lecture Notes-Weeks 14 and 15 Pipelining is a method of processing in which a problem is divided into a number of sub problems and solved and the solu8ons of the sub problems

More information

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1> Chapter 5 Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Chapter 5 Chapter 5 :: Topics Introduction Arithmetic Circuits umber Systems Sequential Building

More information

Special Nodes for Interface

Special Nodes for Interface fi fi Special Nodes for Interface SW on processors Chip-level HW Board-level HW fi fi C code VHDL VHDL code retargetable compilation high-level synthesis SW costs HW costs partitioning (solve ILP) cluster

More information

Performance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So

Performance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Performance, Power & Energy ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Recall: Goal of this class Performance Reconfiguration Power/ Energy H. So, Sp10 Lecture 3 - ELEC8106/6102 2 PERFORMANCE EVALUATION

More information

Data Structures in Java

Data Structures in Java Data Structures in Java Lecture 21: Introduction to NP-Completeness 12/9/2015 Daniel Bauer Algorithms and Problem Solving Purpose of algorithms: find solutions to problems. Data Structures provide ways

More information

CS 700: Quantitative Methods & Experimental Design in Computer Science

CS 700: Quantitative Methods & Experimental Design in Computer Science CS 700: Quantitative Methods & Experimental Design in Computer Science Sanjeev Setia Dept of Computer Science George Mason University Logistics Grade: 35% project, 25% Homework assignments 20% midterm,

More information

Evaluation and Validation

Evaluation and Validation Evaluation and Validation Jian-Jia Chen (Slides are based on Peter Marwedel) TU Dortmund, Informatik 12 Germany Springer, 2010 2016 年 01 月 05 日 These slides use Microsoft clip arts. Microsoft copyright

More information

Introduction to Computer Science and Programming for Astronomers

Introduction to Computer Science and Programming for Astronomers Introduction to Computer Science and Programming for Astronomers Lecture 8. István Szapudi Institute for Astronomy University of Hawaii March 7, 2018 Outline Reminder 1 Reminder 2 3 4 Reminder We have

More information

Outline. policies for the first part. with some potential answers... MCS 260 Lecture 10.0 Introduction to Computer Science Jan Verschelde, 9 July 2014

Outline. policies for the first part. with some potential answers... MCS 260 Lecture 10.0 Introduction to Computer Science Jan Verschelde, 9 July 2014 Outline 1 midterm exam on Friday 11 July 2014 policies for the first part 2 questions with some potential answers... MCS 260 Lecture 10.0 Introduction to Computer Science Jan Verschelde, 9 July 2014 Intro

More information

COVER SHEET: Problem#: Points

COVER SHEET: Problem#: Points EEL 4712 Midterm 3 Spring 2017 VERSION 1 Name: UFID: Sign here to give permission for your test to be returned in class, where others might see your score: IMPORTANT: Please be neat and write (or draw)

More information

Lecture 10: Big-Oh. Doina Precup With many thanks to Prakash Panagaden and Mathieu Blanchette. January 27, 2014

Lecture 10: Big-Oh. Doina Precup With many thanks to Prakash Panagaden and Mathieu Blanchette. January 27, 2014 Lecture 10: Big-Oh Doina Precup With many thanks to Prakash Panagaden and Mathieu Blanchette January 27, 2014 So far we have talked about O() informally, as a way of capturing the worst-case computation

More information

Register Allocation. Maryam Siahbani CMPT 379 4/5/2016 1

Register Allocation. Maryam Siahbani CMPT 379 4/5/2016 1 Register Allocation Maryam Siahbani CMPT 379 4/5/2016 1 Register Allocation Intermediate code uses unlimited temporaries Simplifying code generation and optimization Complicates final translation to assembly

More information

VLSI Signal Processing

VLSI Signal Processing VLSI Signal Processing Lecture 1 Pipelining & Retiming ADSP Lecture1 - Pipelining & Retiming (cwliu@twins.ee.nctu.edu.tw) 1-1 Introduction DSP System Real time requirement Data driven synchronized by data

More information

Classes of data dependence. Dependence analysis. Flow dependence (True dependence)

Classes of data dependence. Dependence analysis. Flow dependence (True dependence) Dependence analysis Classes of data dependence Pattern matching and replacement is all that is needed to apply many source-to-source transformations. For example, pattern matching can be used to determine

More information

Dependence analysis. However, it is often necessary to gather additional information to determine the correctness of a particular transformation.

Dependence analysis. However, it is often necessary to gather additional information to determine the correctness of a particular transformation. Dependence analysis Pattern matching and replacement is all that is needed to apply many source-to-source transformations. For example, pattern matching can be used to determine that the recursion removal

More information

Advanced Compiler Construction

Advanced Compiler Construction CS 526 Advanced Compiler Construction http://misailo.cs.illinois.edu/courses/cs526 DEPENDENCE ANALYSIS The slides adapted from Vikram Adve and David Padua Kinds of Data Dependence Direct Dependence X =

More information

Topic 17. Analysis of Algorithms

Topic 17. Analysis of Algorithms Topic 17 Analysis of Algorithms Analysis of Algorithms- Review Efficiency of an algorithm can be measured in terms of : Time complexity: a measure of the amount of time required to execute an algorithm

More information

EE382V: System-on-a-Chip (SoC) Design

EE382V: System-on-a-Chip (SoC) Design EE82V: SystemonChip (SoC) Design Lecture EE82V: SystemonaChip (SoC) Design Lecture Operation Scheduling Source: G. De Micheli, Integrated Systems Center, EPFL Synthesis and Optimization of Digital Circuits,

More information

Divisible Load Scheduling

Divisible Load Scheduling Divisible Load Scheduling Henri Casanova 1,2 1 Associate Professor Department of Information and Computer Science University of Hawai i at Manoa, U.S.A. 2 Visiting Associate Professor National Institute

More information

Real-Time Workload Models with Efficient Analysis

Real-Time Workload Models with Efficient Analysis Real-Time Workload Models with Efficient Analysis Advanced Course, 3 Lectures, September 2014 Martin Stigge Uppsala University, Sweden Fahrplan 1 DRT Tasks in the Model Hierarchy Liu and Layland and Sporadic

More information

EE382V-ICS: System-on-a-Chip (SoC) Design

EE382V-ICS: System-on-a-Chip (SoC) Design EE8VICS: SystemonChip (SoC) EE8VICS: SystemonaChip (SoC) Scheduling Source: G. De Micheli, Integrated Systems Center, EPFL Synthesis and Optimization of Digital Circuits, McGraw Hill, 00. Additional sources:

More information

Algorithms. NP -Complete Problems. Dong Kyue Kim Hanyang University

Algorithms. NP -Complete Problems. Dong Kyue Kim Hanyang University Algorithms NP -Complete Problems Dong Kyue Kim Hanyang University dqkim@hanyang.ac.kr The Class P Definition 13.2 Polynomially bounded An algorithm is said to be polynomially bounded if its worst-case

More information

ICS 233 Computer Architecture & Assembly Language

ICS 233 Computer Architecture & Assembly Language ICS 233 Computer Architecture & Assembly Language Assignment 6 Solution 1. Identify all of the RAW data dependencies in the following code. Which dependencies are data hazards that will be resolved by

More information

Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College

Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College Why analysis? We want to predict how the algorithm will behave (e.g. running time) on arbitrary inputs, and how it will

More information

' $ Dependence Analysis & % 1

' $ Dependence Analysis & % 1 Dependence Analysis 1 Goals - determine what operations can be done in parallel - determine whether the order of execution of operations can be altered Basic idea - determine a partial order on operations

More information

8-1: Backpropagation Prof. J.C. Kao, UCLA. Backpropagation. Chain rule for the derivatives Backpropagation graphs Examples

8-1: Backpropagation Prof. J.C. Kao, UCLA. Backpropagation. Chain rule for the derivatives Backpropagation graphs Examples 8-1: Backpropagation Prof. J.C. Kao, UCLA Backpropagation Chain rule for the derivatives Backpropagation graphs Examples 8-2: Backpropagation Prof. J.C. Kao, UCLA Motivation for backpropagation To do gradient

More information

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

Worst-Case Execution Time Analysis. LS 12, TU Dortmund Worst-Case Execution Time Analysis Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 09/10, Jan., 2018 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 43 Most Essential Assumptions for Real-Time Systems Upper

More information

Enumerate all possible assignments and take the An algorithm is a well-defined computational

Enumerate all possible assignments and take the An algorithm is a well-defined computational EMIS 8374 [Algorithm Design and Analysis] 1 EMIS 8374 [Algorithm Design and Analysis] 2 Designing and Evaluating Algorithms A (Bad) Algorithm for the Assignment Problem Enumerate all possible assignments

More information

A Simple Architectural Enhancement for Fast and Flexible Elliptic Curve Cryptography over Binary Finite Fields GF(2 m )

A Simple Architectural Enhancement for Fast and Flexible Elliptic Curve Cryptography over Binary Finite Fields GF(2 m ) A Simple Architectural Enhancement for Fast and Flexible Elliptic Curve Cryptography over Binary Finite Fields GF(2 m ) Stefan Tillich, Johann Großschädl Institute for Applied Information Processing and

More information

CS 4407 Algorithms Lecture 3: Iterative and Divide and Conquer Algorithms

CS 4407 Algorithms Lecture 3: Iterative and Divide and Conquer Algorithms CS 4407 Algorithms Lecture 3: Iterative and Divide and Conquer Algorithms Prof. Gregory Provan Department of Computer Science University College Cork 1 Lecture Outline CS 4407, Algorithms Growth Functions

More information

Data Dependences and Parallelization. Stanford University CS243 Winter 2006 Wei Li 1

Data Dependences and Parallelization. Stanford University CS243 Winter 2006 Wei Li 1 Data Dependences and Parallelization Wei Li 1 Agenda Introduction Single Loop Nested Loops Data Dependence Analysis 2 Motivation DOALL loops: loops whose iterations can execute in parallel for i = 11,

More information

Automatic Loop Interchange

Automatic Loop Interchange RETROSPECTIVE: Automatic Loop Interchange Randy Allen Catalytic Compilers 1900 Embarcadero Rd, #206 Palo Alto, CA. 94043 jra@catacomp.com Ken Kennedy Department of Computer Science Rice University Houston,

More information

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEMORY INPUT-OUTPUT CONTROL DATAPATH

More information

Clock-driven scheduling

Clock-driven scheduling Clock-driven scheduling Also known as static or off-line scheduling Michal Sojka Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Control Engineering November 8, 2017

More information

Algorithms. Outline! Approximation Algorithms. The class APX. The intelligence behind the hardware. ! Based on

Algorithms. Outline! Approximation Algorithms. The class APX. The intelligence behind the hardware. ! Based on 6117CIT - Adv Topics in Computing Sci at Nathan 1 Algorithms The intelligence behind the hardware Outline! Approximation Algorithms The class APX! Some complexity classes, like PTAS and FPTAS! Illustration

More information

Fall 2008 CSE Qualifying Exam. September 13, 2008

Fall 2008 CSE Qualifying Exam. September 13, 2008 Fall 2008 CSE Qualifying Exam September 13, 2008 1 Architecture 1. (Quan, Fall 2008) Your company has just bought a new dual Pentium processor, and you have been tasked with optimizing your software for

More information

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEM ORY INPUT-OUTPUT CONTROL DATAPATH

More information

Digital Design. Register Transfer Specification And Design

Digital Design. Register Transfer Specification And Design Principles Of Digital Design Chapter 8 Register Transfer Specification And Design Chapter preview Boolean algebra 3 Logic gates and flip-flops 3 Finite-state machine 6 Logic design techniques 4 Sequential

More information

Algorithm Design CS 515 Fall 2015 Sample Final Exam Solutions

Algorithm Design CS 515 Fall 2015 Sample Final Exam Solutions Algorithm Design CS 515 Fall 2015 Sample Final Exam Solutions Copyright c 2015 Andrew Klapper. All rights reserved. 1. For the functions satisfying the following three recurrences, determine which is the

More information

Andrew Morton University of Waterloo Canada

Andrew Morton University of Waterloo Canada EDF Feasibility and Hardware Accelerators Andrew Morton University of Waterloo Canada Outline 1) Introduction and motivation 2) Review of EDF and feasibility analysis 3) Hardware accelerators and scheduling

More information

Lecture: Pipelining Basics

Lecture: Pipelining Basics Lecture: Pipelining Basics Topics: Performance equations wrap-up, Basic pipelining implementation Video 1: What is pipelining? Video 2: Clocks and latches Video 3: An example 5-stage pipeline Video 4:

More information

Professor Fearing EECS150/Problem Set Solution Fall 2013 Due at 10 am, Thu. Oct. 3 (homework box under stairs)

Professor Fearing EECS150/Problem Set Solution Fall 2013 Due at 10 am, Thu. Oct. 3 (homework box under stairs) Professor Fearing EECS50/Problem Set Solution Fall 203 Due at 0 am, Thu. Oct. 3 (homework box under stairs). (25 pts) List Processor Timing. The list processor as discussed in lecture is described in RT

More information

A point p is said to be dominated by point q if p.x=q.x&p.y=q.y 2 true. In RAM computation model, RAM stands for Random Access Model.

A point p is said to be dominated by point q if p.x=q.x&p.y=q.y 2 true. In RAM computation model, RAM stands for Random Access Model. In analysis the upper bound means the function grows asymptotically no faster than its largest term. 1 true A point p is said to be dominated by point q if p.x=q.x&p.y=q.y 2 true In RAM computation model,

More information

Datapath Component Tradeoffs

Datapath Component Tradeoffs Datapath Component Tradeoffs Faster Adders Previously we studied the ripple-carry adder. This design isn t feasible for larger adders due to the ripple delay. ʽ There are several methods that we could

More information

A Novel Congruent Organizational Design Methodology Using Group Technology and a Nested Genetic Algorithm

A Novel Congruent Organizational Design Methodology Using Group Technology and a Nested Genetic Algorithm A Novel Congruent Organizational Design Methodology Using Group Technology and a Nested Genetic Algorithm Feili Yu Georgiy M. Levchuk Candra Meirina Sui Ruan Krishna Pattipati 9-th International CCRTS,

More information

Compiling Techniques

Compiling Techniques Lecture 11: Introduction to 13 November 2015 Table of contents 1 Introduction Overview The Backend The Big Picture 2 Code Shape Overview Introduction Overview The Backend The Big Picture Source code FrontEnd

More information

Optimization Techniques for Parallel Code 1. Parallel programming models

Optimization Techniques for Parallel Code 1. Parallel programming models Optimization Techniques for Parallel Code 1. Parallel programming models Sylvain Collange Inria Rennes Bretagne Atlantique http://www.irisa.fr/alf/collange/ sylvain.collange@inria.fr OPT - 2017 Goals of

More information

CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms

CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms CS 4407 Algorithms Lecture 2: Iterative and Divide and Conquer Algorithms Prof. Gregory Provan Department of Computer Science University College Cork 1 Lecture Outline CS 4407, Algorithms Growth Functions

More information

CS 161 Summer 2009 Homework #2 Sample Solutions

CS 161 Summer 2009 Homework #2 Sample Solutions CS 161 Summer 2009 Homework #2 Sample Solutions Regrade Policy: If you believe an error has been made in the grading of your homework, you may resubmit it for a regrade. If the error consists of more than

More information

CSE. 1. In following code. addi. r1, skip1 xor in r2. r3, skip2. counter r4, top. taken): PC1: PC2: PC3: TTTTTT TTTTTT

CSE. 1. In following code. addi. r1, skip1 xor in r2. r3, skip2. counter r4, top. taken): PC1: PC2: PC3: TTTTTT TTTTTT CSE 560 Practice Problem Set 4 Solution 1. In this question, you will examine several different schemes for branch prediction, using the following code sequence for a simple load store ISA with no branch

More information

Asymptotic Running Time of Algorithms

Asymptotic Running Time of Algorithms Asymptotic Complexity: leading term analysis Asymptotic Running Time of Algorithms Comparing searching and sorting algorithms so far: Count worst-case of comparisons as function of array size. Drop lower-order

More information

CSE 431/531: Analysis of Algorithms. Dynamic Programming. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo

CSE 431/531: Analysis of Algorithms. Dynamic Programming. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo CSE 431/531: Analysis of Algorithms Dynamic Programming Lecturer: Shi Li Department of Computer Science and Engineering University at Buffalo Paradigms for Designing Algorithms Greedy algorithm Make a

More information

Homework #2: assignments/ps2/ps2.htm Due Thursday, March 7th.

Homework #2:   assignments/ps2/ps2.htm Due Thursday, March 7th. Homework #2: http://www.cs.cornell.edu/courses/cs612/2002sp/ assignments/ps2/ps2.htm Due Thursday, March 7th. 1 Transformations and Dependences 2 Recall: Polyhedral algebra tools for determining emptiness

More information

CIS 4930/6930: Principles of Cyber-Physical Systems

CIS 4930/6930: Principles of Cyber-Physical Systems CIS 4930/6930: Principles of Cyber-Physical Systems Chapter 11 Scheduling Hao Zheng Department of Computer Science and Engineering University of South Florida H. Zheng (CSE USF) CIS 4930/6930: Principles

More information

/ : Computer Architecture and Design

/ : Computer Architecture and Design 16.482 / 16.561: Computer Architecture and Design Summer 2015 Homework #5 Solution 1. Dynamic scheduling (30 points) Given the loop below: DADDI R3, R0, #4 outer: DADDI R2, R1, #32 inner: L.D F0, 0(R1)

More information

LESSON 7.2 FACTORING POLYNOMIALS II

LESSON 7.2 FACTORING POLYNOMIALS II LESSON 7.2 FACTORING POLYNOMIALS II LESSON 7.2 FACTORING POLYNOMIALS II 305 OVERVIEW Here s what you ll learn in this lesson: Trinomials I a. Factoring trinomials of the form x 2 + bx + c; x 2 + bxy +

More information

Fall 2011 Prof. Hyesoon Kim

Fall 2011 Prof. Hyesoon Kim Fall 2011 Prof. Hyesoon Kim Add: 2 cycles FE_stage add r1, r2, r3 FE L ID L EX L MEM L WB L add add sub r4, r1, r3 sub sub add add mul r5, r2, r3 mul sub sub add add mul sub sub add add mul sub sub add

More information

Digital Logic Design: a rigorous approach c

Digital Logic Design: a rigorous approach c Digital Logic Design: a rigorous approach c Chapter 13: Decoders and Encoders Guy Even Moti Medina School of Electrical Engineering Tel-Aviv Univ. May 17, 2018 Book Homepage: http://www.eng.tau.ac.il/~guy/even-medina

More information

Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3

Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3 Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3 Lecture Outline The rate monotonic algorithm (cont d) Maximum utilisation test The deadline monotonic algorithm The earliest

More information

Bin packing and scheduling

Bin packing and scheduling Sanders/van Stee: Approximations- und Online-Algorithmen 1 Bin packing and scheduling Overview Bin packing: problem definition Simple 2-approximation (Next Fit) Better than 3/2 is not possible Asymptotic

More information

Static Program Analysis

Static Program Analysis Static Program Analysis Xiangyu Zhang The slides are compiled from Alex Aiken s Michael D. Ernst s Sorin Lerner s A Scary Outline Type-based analysis Data-flow analysis Abstract interpretation Theorem

More information

Timing Driven Power Gating in High-Level Synthesis

Timing Driven Power Gating in High-Level Synthesis Timing Driven Power Gating in High-Level Synthesis Shih-Hsu Huang and Chun-Hua Cheng Department of Electronic Engineering Chung Yuan Christian University, Taiwan Outline Introduction Motivation Our Approach

More information

Performance, Power & Energy

Performance, Power & Energy Recall: Goal of this class Performance, Power & Energy ELE8106/ELE6102 Performance Reconfiguration Power/ Energy Spring 2010 Hayden Kwok-Hay So H. So, Sp10 Lecture 3 - ELE8106/6102 2 What is good performance?

More information

ECE 5775 (Fall 17) High-Level Digital Design Automation. Scheduling: Exact Methods

ECE 5775 (Fall 17) High-Level Digital Design Automation. Scheduling: Exact Methods ECE 5775 (Fall 17) High-Level Digital Design Automation Scheduling: Exact Methods Announcements Sign up for the first student-led discussions today One slot remaining Presenters for the 1st session will

More information

Module 1: Analyzing the Efficiency of Algorithms

Module 1: Analyzing the Efficiency of Algorithms Module 1: Analyzing the Efficiency of Algorithms Dr. Natarajan Meghanathan Associate Professor of Computer Science Jackson State University Jackson, MS 39217 E-mail: natarajan.meghanathan@jsums.edu Based

More information

Analysis of clocked sequential networks

Analysis of clocked sequential networks Analysis of clocked sequential networks keywords: Mealy, Moore Consider : a sequential parity checker an 8th bit is added to each group of 7 bits such that the total # of 1 bits is odd for odd parity if

More information

Analysis of Algorithm Efficiency. Dr. Yingwu Zhu

Analysis of Algorithm Efficiency. Dr. Yingwu Zhu Analysis of Algorithm Efficiency Dr. Yingwu Zhu Measure Algorithm Efficiency Time efficiency How fast the algorithm runs; amount of time required to accomplish the task Our focus! Space efficiency Amount

More information

COMP 515: Advanced Compilation for Vector and Parallel Processors. Vivek Sarkar Department of Computer Science Rice University

COMP 515: Advanced Compilation for Vector and Parallel Processors. Vivek Sarkar Department of Computer Science Rice University COMP 515: Advanced Compilation for Vector and Parallel Processors Vivek Sarkar Department of Computer Science Rice University vsarkar@rice.edu COMP 515 Lecture 10 10 February 2009 Announcement Feb 17 th

More information

Algorithms and Programming I. Lecture#1 Spring 2015

Algorithms and Programming I. Lecture#1 Spring 2015 Algorithms and Programming I Lecture#1 Spring 2015 CS 61002 Algorithms and Programming I Instructor : Maha Ali Allouzi Office: 272 MSB Office Hours: T TH 2:30:3:30 PM Email: mallouzi@kent.edu The Course

More information

Exam Spring Embedded Systems. Prof. L. Thiele

Exam Spring Embedded Systems. Prof. L. Thiele Exam Spring 20 Embedded Systems Prof. L. Thiele NOTE: The given solution is only a proposal. For correctness, completeness, or understandability no responsibility is taken. Sommer 20 Eingebettete Systeme

More information

CMP N 301 Computer Architecture. Appendix C

CMP N 301 Computer Architecture. Appendix C CMP N 301 Computer Architecture Appendix C Outline Introduction Pipelining Hazards Pipelining Implementation Exception Handling Advanced Issues (Dynamic Scheduling, Out of order Issue, Superscalar, etc)

More information

Data-Flow Analysis. Compiler Design CSE 504. Preliminaries

Data-Flow Analysis. Compiler Design CSE 504. Preliminaries Data-Flow Analysis Compiler Design CSE 504 1 Preliminaries 2 Live Variables 3 Data Flow Equations 4 Other Analyses Last modifled: Thu May 02 2013 at 09:01:11 EDT Version: 1.3 15:28:44 2015/01/25 Compiled

More information

Quantum Computing Approach to V&V of Complex Systems Overview

Quantum Computing Approach to V&V of Complex Systems Overview Quantum Computing Approach to V&V of Complex Systems Overview Summary of Quantum Enabled V&V Technology June, 04 Todd Belote Chris Elliott Flight Controls / VMS Integration Discussion Layout I. Quantum

More information

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2 Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2 Lecture Outline Scheduling periodic tasks The rate monotonic algorithm Definition Non-optimality Time-demand analysis...!2

More information

Math Models of OR: Branch-and-Bound

Math Models of OR: Branch-and-Bound Math Models of OR: Branch-and-Bound John E. Mitchell Department of Mathematical Sciences RPI, Troy, NY 12180 USA November 2018 Mitchell Branch-and-Bound 1 / 15 Branch-and-Bound Outline 1 Branch-and-Bound

More information

Hardware Acceleration of the Tate Pairing in Characteristic Three

Hardware Acceleration of the Tate Pairing in Characteristic Three Hardware Acceleration of the Tate Pairing in Characteristic Three CHES 2005 Hardware Acceleration of the Tate Pairing in Characteristic Three Slide 1 Introduction Pairing based cryptography is a (fairly)

More information

Section Summary. Sequences. Recurrence Relations. Summations Special Integer Sequences (optional)

Section Summary. Sequences. Recurrence Relations. Summations Special Integer Sequences (optional) Section 2.4 Section Summary Sequences. o Examples: Geometric Progression, Arithmetic Progression Recurrence Relations o Example: Fibonacci Sequence Summations Special Integer Sequences (optional) Sequences

More information

Hardware Design I Chap. 4 Representative combinational logic

Hardware Design I Chap. 4 Representative combinational logic Hardware Design I Chap. 4 Representative combinational logic E-mail: shimada@is.naist.jp Already optimized circuits There are many optimized circuits which are well used You can reduce your design workload

More information

Real-time Systems: Scheduling Periodic Tasks

Real-time Systems: Scheduling Periodic Tasks Real-time Systems: Scheduling Periodic Tasks Advanced Operating Systems Lecture 15 This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of

More information

Discrete-Time Systems

Discrete-Time Systems FIR Filters With this chapter we turn to systems as opposed to signals. The systems discussed in this chapter are finite impulse response (FIR) digital filters. The term digital filter arises because these

More information