Shedding the Shackles of Time-Division Multiplexing

Similar documents
Efficient TDM-based Arbitration for Mixed-Criticality Systems on Multi-Cores

Andrew Morton University of Waterloo Canada

Scheduling Slack Time in Fixed Priority Pre-emptive Systems

Non-Work-Conserving Non-Preemptive Scheduling: Motivations, Challenges, and Potential Solutions

Task Models and Scheduling

Real-time operating systems course. 6 Definitions Non real-time scheduling algorithms Real-time scheduling algorithm

Scheduling. Uwe R. Zimmer & Alistair Rendell The Australian National University

Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3

CycleTandem: Energy-Saving Scheduling for Real-Time Systems with Hardware Accelerators

Lecture 6. Real-Time Systems. Dynamic Priority Scheduling

There are three priority driven approaches that we will look at

Priority-driven Scheduling of Periodic Tasks (1) Advanced Operating Systems (M) Lecture 4

Process Scheduling for RTS. RTS Scheduling Approach. Cyclic Executive Approach

Real-Time and Embedded Systems (M) Lecture 5

Lecture 13. Real-Time Scheduling. Daniel Kästner AbsInt GmbH 2013

CPU SCHEDULING RONG ZHENG

Embedded Systems 14. Overview of embedded systems design

Online Energy-Aware I/O Device Scheduling for Hard Real-Time Systems with Shared Resources

Schedule Table Generation for Time-Triggered Mixed Criticality Systems

Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems

Real-Time Scheduling. Real Time Operating Systems and Middleware. Luca Abeni

Real-Time Systems. Event-Driven Scheduling

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2

System Model. Real-Time systems. Giuseppe Lipari. Scuola Superiore Sant Anna Pisa -Italy

Real-Time Systems. Lecture #14. Risat Pathan. Department of Computer Science and Engineering Chalmers University of Technology

CEC 450 Real-Time Systems

EDF Feasibility and Hardware Accelerators

Embedded Systems Development

A Utilization Bound for Aperiodic Tasks and Priority Driven Scheduling

3. Scheduling issues. Common approaches 3. Common approaches 1. Preemption vs. non preemption. Common approaches 2. Further definitions

Design and Analysis of Time-Critical Systems Response-time Analysis with a Focus on Shared Resources

Networked Embedded Systems WS 2016/17

Non-Preemptive and Limited Preemptive Scheduling. LS 12, TU Dortmund

Aperiodic Task Scheduling

Leveraging Transactional Memory for a Predictable Execution of Applications Composed of Hard Real-Time and Best-Effort Tasks

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Load Regulating Algorithm for Static-Priority Task Scheduling on Multiprocessors

Timing analysis and predictability of architectures

Clock-driven scheduling

Embedded Systems 15. REVIEW: Aperiodic scheduling. C i J i 0 a i s i f i d i

On-line scheduling of periodic tasks in RT OS

Component-based Analysis of Worst-case Temperatures

Replica-Aware Co-Scheduling for Mixed-Criticality

Chapter 6: CPU Scheduling

EECS 571 Principles of Real-Time Embedded Systems. Lecture Note #7: More on Uniprocessor Scheduling

RUN-TIME EFFICIENT FEASIBILITY ANALYSIS OF UNI-PROCESSOR SYSTEMS WITH STATIC PRIORITIES

The Concurrent Consideration of Uncertainty in WCETs and Processor Speeds in Mixed Criticality Systems

CMSC 451: Lecture 7 Greedy Algorithms for Scheduling Tuesday, Sep 19, 2017

Embedded Systems 23 BF - ES

CHAPTER 5 - PROCESS SCHEDULING

TDDI04, K. Arvidsson, IDA, Linköpings universitet CPU Scheduling. Overview: CPU Scheduling. [SGG7] Chapter 5. Basic Concepts.

Scheduling Lecture 1: Scheduling on One Machine

UC Santa Barbara. Operating Systems. Christopher Kruegel Department of Computer Science UC Santa Barbara

Real-Time Systems. Event-Driven Scheduling

CPU Scheduling. CPU Scheduler

Non-Work-Conserving Scheduling of Non-Preemptive Hard Real-Time Tasks Based on Fixed Priorities

Exam Spring Embedded Systems. Prof. L. Thiele

Semantic Importance Dual-Priority Server: Properties

CSE 380 Computer Operating Systems

LSN 15 Processor Scheduling

Efficient Power Management Schemes for Dual-Processor Fault-Tolerant Systems

Embedded Systems Design: Optimization Challenges. Paul Pop Embedded Systems Lab (ESLAB) Linköping University, Sweden

CIS 4930/6930: Principles of Cyber-Physical Systems

Temperature-Aware Analysis and Scheduling

A Response-Time Analysis for Non-preemptive Job Sets under Global Scheduling

Real-Time Calculus. LS 12, TU Dortmund

MIRROR: Symmetric Timing Analysis for Real-Time Tasks on Multicore Platforms with Shared Resources

Simulation of Process Scheduling Algorithms

Mixed Criticality in Safety-Critical Systems. LS 12, TU Dortmund

arxiv: v1 [cs.os] 6 Jun 2013

Energy-Constrained Scheduling for Weakly-Hard Real-Time Systems

Module 5: CPU Scheduling

Conference Paper. A Response-Time Analysis for Non-Preemptive Job Sets under Global Scheduling. Mitra Nasri Geoffrey Nelissen Björn B.

The FMLP + : An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis

Cyclic Schedules: General Structure. Frame Size Constraints

Real-Time Systems. LS 12, TU Dortmund

Real-Time Scheduling

Non-preemptive Fixed Priority Scheduling of Hard Real-Time Periodic Tasks

A Dynamic Real-time Scheduling Algorithm for Reduced Energy Consumption

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

CPU Scheduling. Heechul Yun

Real-Time Scheduling and Resource Management

Task Reweighting under Global Scheduling on Multiprocessors

Real-time Systems: Scheduling Periodic Tasks

Feedback EDF Scheduling of Real-Time Tasks Exploiting Dynamic Voltage Scaling

Journal of Global Research in Computer Science

Resource Sharing in an Enhanced Rate-Based Execution Model

TDDB68 Concurrent programming and operating systems. Lecture: CPU Scheduling II

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

Paper Presentation. Amo Guangmo Tong. University of Taxes at Dallas January 24, 2014

Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm

An Exact and Sustainable Analysis of Non-Preemptive Scheduling

EDF Scheduling. Giuseppe Lipari CRIStAL - Université de Lille 1. October 4, 2015

Coscheduling of Real-Time Tasks and PCI Bus Transactions

Spare CASH: Reclaiming Holes to Minimize Aperiodic Response Times in a Firm Real-Time Environment

Toward Precise PLRU Cache Analysis

Proportional Share Resource Allocation Outline. Proportional Share Resource Allocation Concept

Analysis and Implementation of Global Preemptive Fixed-Priority Scheduling with Dynamic Cache Allocation*

A Response-Time Analysis for Non-Preemptive Job Sets under Global Scheduling

Necessary and Sufficient Conditions for Non-Preemptive Robustness

Transcription:

Shedding the Shackles of Time-Division Multiplexing Farouk Hebbache with Florian Brandner, 2 Mathieu Jan, Laurent Pautet 2 CEA List, LS 2 LTCI, Télécom ParisTech, Université Paris-Saclay

Multi-core Architectures Complex WCET analysis: Pessimistic WCET Tasks rarely use their full time budget Low resource utilization How to improve the use of hardware resources? "Mixed-Criticality" scheduling: Timing guarantees depends on criticality level Non-critical tasks with best effort processing Improves resource utilization, but only at scheduling level Worst-Case Execution Time 2/29

Motivation Memory arbitration scheme in multi-core architectures Time-Division Multiplexing (TDM) Fixed time windows Exclusive memory access Isolation between cores More precises analysis techniques compared to Round-Robin 2 Non-work-conserving Goal: improve memory utilization while keeping TDM guarantees. 2 H. Rihani et al, WCET Analysis in Shared Resources Real-Time Systems with TDMA Buses, RTNS 25 /29

System Model m processing cores Each core executes a single task (critical or non-critical) = No tasks scheduling needed = Cores emit critical/non-critical memory requests Memory arbitration Time-Division Multiplexing (TDM) Dedicated slot for critical cores No slots for non-critical cores 4/29

Example: Strict Time-Division Multiplexing T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 Critical tasks T and T2 with dedicated TDM slots as well as non-critical tasks t and t4. 5/29

Example: Strict Time-Division Multiplexing T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c T 2 T2 T2 c T2 c t 2 t4 Critical tasks T and T2 with dedicated TDM slots as well as non-critical tasks t and t4. Rows: different tasks as they execute 5/29

Example: Strict Time-Division Multiplexing T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 Critical tasks T and T2 with dedicated TDM slots as well as non-critical tasks t and t4. Green bars: requests being processed by the memory (only a single active request at a time) 5/29

Example: Strict Time-Division Multiplexing T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 Critical tasks T and T2 with dedicated TDM slots as well as non-critical tasks t and t4. Arcs: memory wait time, from issuing to start of processing (depends on arbitration) 5/29

Example: Strict Time-Division Multiplexing T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T 2 c T c T c 2 42 T2 c 9 T2 c 9 5 2 2 t2 nc 4 Critical tasks T and T2 with dedicated TDM slots as well as non-critical tasks t and t4. Gaps: computation time between requests (these values remain unchanged) 5/29

Example: Strict Time-Division Multiplexing T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 Critical tasks T and T2 with dedicated TDM slots as well as non-critical tasks t and t4. Non-critical requests can only reclaim unused TDM slots. 5/29

x Issues with TDM

Issue with TDM () T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 Issue Delay: Number of cycles during which the memory is idle, despite pending requests at the arbiter. 7/29

Issue with TDM (2) T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 Release Delay: Number of cycles that memory requests finish earlier than the TDM slot length. 8/29

Total Memory Idling T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 Number of cycles where the memory is idle Issue & release delays = non-work-conserving Goal: Eliminate issue & release delays. 9/29

x Dynamic TDM-based Arbitration

Dynamic TDM-based Arbitration (TDMds) Goal: Reallocating free slots to other pending requests F. Hebbache et al, Dynamic arbitration of memory requests with tdm-like guarantees, (Workshop CRTS 7) How? Each critical request is associated with a deadline Deadline is at end of a TDM slot of the request owner Scheduled using Earliest-Deadline-First strategy (EDF) Best-effort for non-critical requests Scheduled using First-In First-Out strategy (FIFO) (other alternatives possible, e.g., fixed priorities) Track slack when critical requests complete before deadline /29

TDM-based Arbitration without slots (TDMer) Goal: Eliminate issue/release delays within slots... How? Requests handled independently from TDM slots Operates at the granularity of clock cycles Request can complete within the next slot... = Owner of the next slot can miss its deadline! How to guarantee that critical tasks respects their deadlines? 2/29

Example: Dynamic TDM without slots (TDMer) T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 T c, T c, T c,2 2 T2 c, T2 c,6 2 Same task set using dynamic TDM-based arbitration (TDMer). /29

Strict TDM vs. TDMer T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 T c, T c, T c,2 2 T2 c, T2 c,6 2 Critical requests complete earlier than under strict TDM. 4/29

x Making deadlines match

Slack Counters T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 T T c, T T c, 2 T c,2 2 T2 c, T2 c,6 2 Critical tasks accumulate slack whenever a request finishes before its deadline (i.e., earlier than strict TDM). 6/29

Deadlines computation T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 T c, T T c, 2 T c,2 2 T2 c, T2 c,6 2 Deadlines are derived by finding the next TDM slot after the delayed issue date (now + slack). 7/29

x Scheduling requests at any moment

Early requests processing () T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 T c, T c, T c,2 2 T2 T2 c, T2 c,6 2 Critical requests may be processed when the upcoming slot belongs to the request owner. 9/29

Early request processing (2) T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 T c, T c, T c,2 2 T2 c, T2 c,6 2 Requests are processed any time, independent from TDM slots, if critical tasks have enough slack (here T). 2/29

Memory idling considerably reduced (TDMer) T T2 T T2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 4 5 6 T c T c T c 2 T2 c T2 c 2 T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 T c, T c, T c,2 2 T2 c, T2 c,6 2 Remaining Issue delay... Unauthorized overflow on slot T2 = Insufficient slack? The total memory idling is considerably improved 2/29

Initial Slack (TDMeri) T T2 T T2 T T2 T T2 T T2 T T2 2 4 5 6 7 8 9 2 T c,8 T c,8 T c,28 2 T2 c,8 T2 c,2 2 Providing initial slack of a single TDM slot (8 ) eliminates (almost) all issue delays... 22/29

x Experiments

Benchmark Setup Up to 24 Patmos cores simulated TDM slot length set to 4 cycles Access latencies between [2, 4] cycles Various TDM-based arbitration schemes (TDMfs, TDMds, TDMer, and TDMeri) Overall 42 simulation runs Based on randomized memory traces (calibrated from actual traces of MiBench on Patmos) Ratio of critical/non-critical tasks: / and / Evaluation metric: memory utilization (issue & release delays, memory idling) 24/29

Results: reduced issue delay (a) Strict TDM (TDMfs) (b) Dynamic TDM respecting slots (TDMds) Dynamic TDM arbitration largely eliminates issue delays However: Up to 2% of release delay... 25/29

Results: release delay vanishes (b) Dynamic TDM respecting slots (TDMds) (a) Dynamic TDM without slots (TDMer) TDMer arbitration noticeably better under high load but little change in total memory idling... 26/29

Results: work-conserving? (a) Dynamic TDM without slots (TDMer) (b) Dynamic TDM without slots with Initial Slack (TDMeri) Initial slack (4 cycles) results in work-conserving arbitration total memory idling improves considerably under high load 27/29

Conclusion Dynamic TDM-based arbitration Improve resource utilization (work-conserving) Preserves TDM s guarantees (proof in the paper) Preserves precise analysis techniques Future work Multiple tasks on a single core = preemption Hardware implementation Other forms of slack to be accumulated 28/29

TDMeri under high load (a) Strict TDM (TDMfs) (b) Dynamic TDM without slots With initial slack (4 cycles) (TDMeri) Memory idle time considering 24 tasks 6 critical and 8 non-critical tasks 29/29