Dependency Graph Approach for Multiprocessor Real-Time Synchronization. TU Dortmund, Germany

Dependency Graph Approach for Multiprocessor Real-Time Synchronization
Jian-Jia Chen, Georg von der Brüggen, Junjie Shi, and Niklas Ueter
TU Dortmund, Germany
14.12.2018 at RTSS
Jian-Jia Chen 1 / 21

Multiprocessor Scheduling
[Figures: ready-queue layouts for processors $P_1, P_2, \ldots, P_m$, as drawn on the slides]
- Partitioned multiprocessor scheduling: one ready queue per processor
- Global multiprocessor scheduling: one ready queue shared by all processors
- Semi-partitioned multiprocessor scheduling: one ready queue per processor

Resource Sharing
- Shared resources: data structures, variables, main memory areas, files, sets of registers, I/O units
- Mutual exclusion, critical sections
- Uniprocessor systems:
  - Priority Inheritance Protocol (PIP)
  - Priority Ceiling Protocol (PCP)
  - Stack Resource Policy (SRP)

Existing Multiprocessor Locking Protocols
- Partitioned scheduling: MPCP (1990), MSRP (2001), MrsP (2013), FMLP (2007)
- Semi-partitioned scheduling: DPCP (1988), ROP (2016), and ROP-Enforce (2017)
- Global scheduling: FMLP (2007), FMLP+ (2014), DFLP (2014), gedf-vpr (2014), etc.

Two Correlated Problems
- Scheduler design problem:
  - Design locking protocols to synchronize the critical sections.
  - Design scheduling policies to schedule the synchronized tasks.
  - Partition the tasks onto processors if the protocol is restricted to partitioned or semi-partitioned scheduling.
- Schedulability test problem: validate the schedulability of a scheduling algorithm.

Open Problems for Multiprocessor Locking Protocols
The performance of these protocols highly depends on:
- How the tasks are partitioned
  - Lakshmanan et al. (RTSS 2009), Nemati et al. (OPODIS 2010): grouping strategies
  - Hsiu et al. (EMSOFT 2013): distributed execution mechanism, priority-based
  - Huang et al. (RTSS 2016): resource-oriented partitioning (ROP)
- How the tasks are prioritized
  - von der Brüggen et al. (RTNS 2017): different priority-assignment strategies under ROP
  - Afshar et al. (RTCSA 2017): optimal priority assignment for spin-based protocols
- Whether a blocked job/task should spin or suspend itself
  - Yang et al. (RTSS 2016) for global scheduling
- How the resources are shared locally and globally
  - Huang et al. (RTSS 2016): resource-oriented partitioning (ROP)

Research Questions
- What is the fundamental difficulty?
- What is the performance gap between partitioned, semi-partitioned, and global scheduling?
- Is it always beneficial to prioritize critical sections?

A Simple Task/Synchronization Model
[Figure: tasks $\tau_1, \ldots, \tau_5$, each drawn as a chain $C_{i,1} \rightarrow A_{i,1} \rightarrow C_{i,2}$; the critical sections $A_{i,1}$ are guarded by two mutex locks, S1 and S2.]
- A set of tasks that arrive at the same time
- Task $\tau_i$ has one critical section using one shared resource: $A_{i,1}$
- Task $\tau_i$ has two non-critical sections: $C_{i,1}$ and $C_{i,2}$
- Objective: minimize the makespan, i.e., the maximum completion time

Abundant Processors?
[Figure: Gantt chart over $t = 0$ to $18$ with one processor per task ($P_1$: $\tau_1$, ..., $P_6$: $\tau_6$) and a virtual processor $P_0$ (sync) that executes the critical sections in the order $\tau_2, \tau_5, \tau_1, \tau_6, \tau_3, \tau_4$. Legend: normal execution, critical section, critical-section release.]
If the tasks share one resource, imagine a virtual processor $P_0$:
- the critical section of task $\tau_i$ is released at time $C_{i,1}$
- the critical section of task $\tau_i$ has a deadline of $H - C_{i,2}$
- $H$ is the target deadline

Strongly NP-Hard
Scheduling problem $1 \mid r_j \mid L_{\max}$, strongly NP-complete:
- $1$: uniprocessor system
- $r_j$: a set of jobs $\{J_j\}$ with different arrival times $r_j$ and deadlines $d_j$
- The default is non-preemptive scheduling; the execution time of job $J_j$ is $p_j$
- (Decision version) $L_{\max}$: whether there is a feasible schedule
Reduction from $1 \mid r_j \mid L_{\max}$:
- The makespan $H$ is a positive integer with $H > d_j$ for every job $J_j$
- For each job $J_j$, we construct a task $\tau_j$ with one critical section:
  1. $C_{j,1}$ is set to $r_j$
  2. $C_{j,2}$ is set to $H - d_j$
  3. $A_{j,1}$ is set to $p_j$
- All constructed tasks share one resource in their critical sections
There is a feasible non-preemptive uniprocessor schedule for the jobs $\{J_j\}$ iff there is a feasible schedule for the constructed tasks $\{\tau_j\}$ on a sufficient number of processors.
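
The construction above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper: the names `Job`, `Task`, `reduce_jobs_to_tasks`, `sequence_is_feasible`, and `makespan_with_abundant_processors` are hypothetical, and the makespan helper assumes that every task runs on its own processor while the critical sections execute back to back in a given order.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Job:
    """A job of 1|r_j|L_max: release time r, processing time p, deadline d."""
    r: float
    p: float
    d: float


@dataclass(frozen=True)
class Task:
    """A task with sections C_{j,1} (c1), A_{j,1} (a), and C_{j,2} (c2)."""
    c1: float
    a: float
    c2: float


def reduce_jobs_to_tasks(jobs: Sequence[Job], H: float) -> List[Task]:
    """Map each job J_j to a task: C_{j,1} = r_j, A_{j,1} = p_j, C_{j,2} = H - d_j."""
    if any(H <= job.d for job in jobs):
        raise ValueError("H must exceed every deadline d_j")
    return [Task(c1=job.r, a=job.p, c2=H - job.d) for job in jobs]


def sequence_is_feasible(jobs: Sequence[Job], order: Sequence[int]) -> bool:
    """Check a non-preemptive uniprocessor sequence against the job deadlines."""
    t = 0.0
    for j in order:
        t = max(t, jobs[j].r) + jobs[j].p  # wait for the release, then run
        if t > jobs[j].d:
            return False
    return True


def makespan_with_abundant_processors(tasks: Sequence[Task], order: Sequence[int]) -> float:
    """Makespan when critical sections run in `order` and each task has its own processor."""
    t = 0.0
    makespan = 0.0
    for j in order:
        t = max(t, tasks[j].c1) + tasks[j].a       # critical section finishes at t
        makespan = max(makespan, t + tasks[j].c2)  # second non-critical section follows
    return makespan


# Hypothetical two-job instance, used only for illustration.
jobs = [Job(r=0, p=2, d=3), Job(r=1, p=2, d=4)]
H = 10
tasks = reduce_jobs_to_tasks(jobs, H)
for order in ([0, 1], [1, 0]):
    print(order, sequence_is_feasible(jobs, order),
          makespan_with_abundant_processors(tasks, order) <= H)
```

For each order in this small instance, the job sequence meets its deadlines exactly when the corresponding task schedule finishes by $H$, which is the equivalence the reduction relies on.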

Computational Complexity and Implications
Assume that the task set needs synchronization.
Makespan:
- Theorem: The makespan problem on $M$ processors is NP-hard in the strong sense, even if $M$ is sufficiently large, under any scheduling paradigm.
Bin packing:
- Theorem: Minimizing the number of processors for a given common deadline $T$ is NP-hard in the strong sense under any scheduling paradigm.
- Theorem: There is no polynomial-time (approximation) algorithm to minimize the number of processors for a given common deadline $T$ under any scheduling paradigm unless P = NP.

Dependency-Graph Approach (DGA)
A two-step approach:
- First step: create a dependency graph $G = (V, E)$
  - A task $\tau_i$ has three vertices, for $C_{i,1}$, $A_{i,1}$, and $C_{i,2}$, in a chain
  - If $\tau_i$ and $\tau_j$ share the same binary mutex lock, the order of $A_{i,1}$ and $A_{j,1}$ must be fixed by a precedence constraint
- Second step: generate a schedule of $G$ on $M$ processors
  - either preemptive or non-preemptive schedules
  - either global, semi-partitioned, or partitioned schedules
[Figure: tasks $\tau_1, \ldots, \tau_5$ drawn as chains $C_{i,1} \rightarrow A_{i,1} \rightarrow C_{i,2}$, with the critical sections guarded by mutex locks S1 and S2.]
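
The first step can be sketched as follows, assuming the per-lock order of critical sections has already been chosen. The function name `build_dependency_graph`, the vertex encoding, and the lock-to-task assignment in the example are all hypothetical; the slide's figure does not let us recover which tasks use S1 and which use S2.

```python
from typing import Dict, List, Tuple

Vertex = Tuple[str, int]  # (section kind, task index), e.g. ("A", 3)


def build_dependency_graph(lock_orders: Dict[str, List[int]]) -> Dict[Vertex, List[Vertex]]:
    """Return the adjacency lists of the dependency graph.

    `lock_orders` maps each lock to the chosen execution order of the tasks
    that use it. Every task is assumed to use exactly one lock, as in the
    simple task model.
    """
    graph: Dict[Vertex, List[Vertex]] = {}
    for order in lock_orders.values():
        # Each task contributes the chain C_{i,1} -> A_{i,1} -> C_{i,2}.
        for i in order:
            graph[("C1", i)] = [("A", i)]
            graph[("A", i)] = [("C2", i)]
            graph[("C2", i)] = []
        # Consecutive critical sections of the same lock are serialized.
        for pred, succ in zip(order, order[1:]):
            graph[("A", pred)].append(("A", succ))
    return graph


# Hypothetical assignment: tasks 2, 1, 3 use S1 in that order; tasks 5, 4 use S2.
graph = build_dependency_graph({"S1": [2, 1, 3], "S2": [5, 4]})
for vertex, successors in graph.items():
    print(vertex, "->", successors)
```

Because each lock contributes a single chain of critical sections, the resulting graph is acyclic by construction, so any DAG scheduler can be applied in the second step.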

Generate A Dependency Graph
Jackson's rule (non-preemptive EDF) for the sync. processor: the deadline of task $\tau_i$ is $H - C_{i,2}$.

Task       $C_{i,1}$   $A_{i,1}$   $C_{i,2}$
$\tau_1$   3           3           3
$\tau_2$   1           2           5
$\tau_3$   4           3           2
$\tau_4$   9           1           4
$\tau_5$   2           2           4
$\tau_6$   3           2           0.5

[Figure: Gantt chart over $t = 0$ to $18$ with processors $P_1$: $\tau_1$, ..., $P_6$: $\tau_6$ and the sync processor $P_0$; legend: normal execution, critical section, critical-section release. On $P_0$ the critical sections are scheduled in the order $\tau_2, \tau_5, \tau_1, \tau_3, \tau_4, \tau_6$.]
[Figure: resulting dependency graph with the task chains $C_{i,1} \rightarrow A_{i,1} \rightarrow C_{i,2}$ arranged in the order $\tau_2, \tau_5, \tau_1, \tau_3, \tau_4, \tau_6$ of their critical sections.]
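
The ordering on the sync processor can be reproduced with a short sketch of non-preemptive EDF with release times, using the execution times tabulated on the slide. The function name `jackson_order` is hypothetical, and the reported makespan assumes abundant processors, i.e., every task has its own processor and only the critical sections contend.

```python
import heapq

# (C_{i,1}, A_{i,1}, C_{i,2}) per task, as tabulated on the slide.
TASKS = {1: (3, 3, 3), 2: (1, 2, 5), 3: (4, 3, 2),
         4: (9, 1, 4), 5: (2, 2, 4), 6: (3, 2, 0.5)}


def jackson_order(tasks):
    """Non-preemptive EDF on the virtual sync processor.

    The critical section of task i is released at C_{i,1} and has deadline
    H - C_{i,2}, so among the released ones the largest C_{i,2} runs first.
    Returns the order of critical sections and the resulting makespan.
    """
    pending = sorted(tasks, key=lambda i: tasks[i][0])  # by release time
    ready = []  # min-heap keyed by -C_{i,2}, i.e. earliest deadline first
    order, t, makespan, k = [], 0.0, 0.0, 0
    while k < len(pending) or ready:
        if not ready and tasks[pending[k]][0] > t:
            t = tasks[pending[k]][0]  # idle until the next release
        while k < len(pending) and tasks[pending[k]][0] <= t:
            i = pending[k]
            heapq.heappush(ready, (-tasks[i][2], i))
            k += 1
        _, i = heapq.heappop(ready)
        t += tasks[i][1]                            # run the critical section
        order.append(i)
        makespan = max(makespan, t + tasks[i][2])   # then the second section
    return order, makespan


order, makespan = jackson_order(TASKS)
print(order, makespan)
```

On this instance the sketch yields the order $\tau_2, \tau_5, \tau_1, \tau_3, \tau_4, \tau_6$ shown on the slide, with the last task finishing at time 16.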

Schedules for the DAG tasks (List Scheduling)
[Figure: dependency graph of tasks $\tau_1, \ldots, \tau_6$ whose critical sections are guarded by Mutex-Lock S1 and Semaphore S2, and its list schedule on two processors $P_1$ and $P_2$ over $t = 0$ to $20$. Legend: normal execution (1st/2nd non-critical section), critical section (S1/S2).]
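
A minimal sketch of non-preemptive list scheduling for a weighted DAG is given below. It is not the implementation behind the figure: the function name `list_schedule`, the use of insertion order as the priority list, and the three-vertex example are assumptions made for illustration, and any vertex may run on any processor (no partitioned mapping is modeled).

```python
import heapq
from typing import Dict, Hashable, List, Tuple


def list_schedule(weights: Dict[Hashable, float],
                  succs: Dict[Hashable, List[Hashable]],
                  m: int) -> Dict[Hashable, Tuple[float, float]]:
    """Non-preemptive list scheduling of a DAG on m >= 1 identical processors.

    Returns {vertex: (start, finish)}. Ready vertices are started in the
    order in which they appear in `weights` (a hypothetical priority list).
    """
    indeg = {v: 0 for v in weights}
    for v in weights:
        for w in succs.get(v, []):
            indeg[w] += 1
    rank = {v: k for k, v in enumerate(weights)}
    ready = [(rank[v], v) for v in weights if indeg[v] == 0]
    heapq.heapify(ready)
    running: List[Tuple[float, int, Hashable]] = []  # (finish time, rank, vertex)
    times: Dict[Hashable, Tuple[float, float]] = {}
    t = 0.0
    while ready or running:
        # Work-conserving step: start ready vertices while a processor is free.
        while ready and len(running) < m:
            _, v = heapq.heappop(ready)
            times[v] = (t, t + weights[v])
            heapq.heappush(running, (t + weights[v], rank[v], v))
        # Advance time to the next completion and release its successors.
        t, _, v = heapq.heappop(running)
        for w in succs.get(v, []):
            indeg[w] -= 1
            if indeg[w] == 0:
                heapq.heappush(ready, (rank[w], w))
    return times


# Tiny illustrative DAG: a and b must both finish before c starts.
weights = {"a": 2, "b": 3, "c": 1}
succs = {"a": ["c"], "b": ["c"]}
schedule = list_schedule(weights, succs, m=2)
print(schedule, max(finish for _, finish in schedule.values()))
```

To schedule a dependency graph from the first step, each vertex weight would be the execution time of the corresponding section ($C_{i,1}$, $A_{i,1}$, or $C_{i,2}$).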

Makespan Analysis
Algorithms used in the two steps:
- First step: an $\alpha$-approximation algorithm for the problem $1 \mid r_j \mid L_{\max}$ under the delivery-time model ($d_j$ is negative)
  - Extended Jackson's rule: $\alpha = 2$
  - Potts' iterative process: $\alpha = 1.5$
  - Hall and Shmoys' bidirectional iterative process: $\alpha = 4/3$
  - Hall and Shmoys' polynomial-time approximation scheme: $\alpha = 1 + \epsilon$ for any $\epsilon > 0$
- Second step: list scheduling (semi-partitioned) to schedule a DAG for the problem $P \mid prec \mid C_{\max}$
The approximation ratio for minimizing the makespan for task synchronization of $N$ tasks on $M$ processors is
$$\begin{cases} 1 + \alpha - \frac{\alpha}{M} & \text{if } M < N \\ \alpha & \text{if } M \ge N \end{cases}$$
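
One way to see where a ratio of this shape can come from is Graham's classical guarantee for list scheduling. The following is a sketch under that assumption, not necessarily the proof given in the paper. The symbols $W$ (total execution time of all vertices of $G$), $L$ (length of the longest path in $G$), $C_{\max}$ (makespan of the list schedule), and $C^{*}$ (optimal makespan) are introduced here for illustration. The sketch further assumes $W/M \le C^{*}$ and $L \le \alpha\, C^{*}$, the latter because the optimal sequence on the virtual sync processor is a lower bound on any schedule.

```latex
\begin{align*}
C_{\max} &\le \frac{W}{M} + \left(1 - \frac{1}{M}\right) L
  && \text{(Graham's bound for list scheduling)} \\
         &\le C^{*} + \left(1 - \frac{1}{M}\right) \alpha\, C^{*}
  && \text{(using } W/M \le C^{*} \text{ and } L \le \alpha\, C^{*}\text{)} \\
         &= \left(1 + \alpha - \frac{\alpha}{M}\right) C^{*}.
\end{align*}
```

When $M \ge N$, every task can be given its own processor, so the makespan equals the longest path $L$ and the ratio reduces to $\alpha$.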

Limitation of DGA
- The two-step approach in DGA results in non-optimal solutions even if both steps are solved optimally, regardless of complexity (proof based on a concrete example).
- For minimizing the makespan, the approximation ratio when both steps are solved optimally is at least
$$\begin{cases} 2 - \frac{2}{M} + \frac{1}{M^2} & \text{under any scheduling paradigm} \\ 2 - \frac{1}{M} & \text{under partitioned or semi-partitioned scheduling} \end{cases}$$

Evaluation Setup
- Evaluations with $M = 4$, $8$, and $16$ processors
- 1000 task sets, each with $10M$ tasks
- A modified version of RandomFixedSum by Emberson and Davis was used
- Number of shared resources: $z \in \{4, 8, 16\}$
- The length $A_{i,1}$ is a fraction of the total execution time $C_{i,1} + C_{i,2} + A_{i,1}$ of task $\tau_i$, depending on $\beta$, which ranges from 5% to 50%
Metrics:
- Makespan performance
- Schedulability analysis for frame-based periodic tasks

Makespan Performance
[Figure (a): $M = 8$, $z = 8$, $\beta$ = 10%-40%. x-axis: D/LB from 1.0 to 1.8; y-axis: acceptance ratio (%) from 0 to 100. Curves: JKS-SP-NP, JKS-P-NP, JKS-SP-P, JKS-P-P, POTTS-SP-NP, POTTS-P-NP, POTTS-SP-P, POTTS-P-P.]
- LB is the lower bound (detailed in the paper)
- D is the target deadline
- D/LB: performance gap between the approximation and the optimal solutions
- Acceptance ratio: percentage of task sets meeting their deadlines
- Curve names: JKS or POTTS (algorithm used to generate $G$), then partitioned (P) or semi-partitioned (SP), then preemptive (P) or non-preemptive (NP)
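
The acceptance-ratio metric can be sketched as follows. The function name `acceptance_ratio` and the sample numbers are hypothetical; the sketch assumes that a task set counts as accepted at a given D/LB value when its achieved makespan does not exceed the target deadline D.

```python
from typing import Dict, Sequence


def acceptance_ratio(makespans: Sequence[float],
                     lower_bounds: Sequence[float],
                     ratios: Sequence[float]) -> Dict[float, float]:
    """Percentage of task sets whose makespan meets D = ratio * LB, per D/LB value."""
    if len(makespans) != len(lower_bounds) or not makespans:
        raise ValueError("need one lower bound per task set and at least one task set")
    result = {}
    for ratio in ratios:
        accepted = sum(1 for c, lb in zip(makespans, lower_bounds) if c <= ratio * lb)
        result[ratio] = 100.0 * accepted / len(makespans)
    return result


# Hypothetical makespans and lower bounds of four task sets.
print(acceptance_ratio([10, 12, 15, 18], [10, 10, 10, 10], [1.0, 1.25, 1.5, 2.0]))
```

Each point of a curve in the figure corresponds to one such percentage, computed over all generated task sets for one algorithm configuration.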

Experimental Results: Schedulability
[Figure: acceptance ratio (%) from 0 to 100 versus Utilization (%) / M from 40 to 100, for (a) $M = 8$, $z = 8$, $\beta$ = 5%-10% and (b) $M = 8$, $z = 8$, $\beta$ = 40%-50%. Curves: JKS-SP-P, POTTS-SP-P, ROP-EDF, ROP-FP.]
- ROP-FP: ROP under fixed-priority scheduling and release enforcement
- ROP-EDF: ROP under EDF and release enforcement
- POTTS-SP-P: our approach, applying algorithm Potts for generating $G$, semi-partitioned list scheduling, and preemptive scheduling for the second non-critical section
- JKS-SP-P: our approach, applying algorithm JKS for generating $G$, semi-partitioned list scheduling, and preemptive scheduling for the second non-critical section

Conclusion
- The difficulty is mainly due to sequencing the mutually exclusive accesses to shared resources. None of the following tricks helps:
  - adding more processors
  - removing periodicity and job recurrence
  - introducing task migration
  - allowing preemption
- The performance gap between partitioned and semi-partitioned scheduling is mainly due to scheduling the dependency graph
  - Partitioned scheduling, $P \mid prec, tied \mid C_{\max}$, is less understood
- A prototype implementation on LITMUS^RT for frame-based tasks is detailed in the paper

Open Problems
- Work-conserving is not the best
  - Existing protocols assume work-conserving behavior for critical sections
  - DGA reveals the potential of non-work-conserving protocols
- Extension to periodic task sets
  - One special case is directly applicable: a mutex lock is only shared among tasks that have the same period
  - For each of the $z$ mutex locks, a DAG is constructed
  - The $z$ resulting DAGs can be scheduled using any approach for multiprocessor DAG scheduling
  - Other cases remain open
- Multiple critical sections
  - How to generate the dependency graphs?
  - Does the approach still work well?
Thank you.
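
The special case for periodic task sets can be checked with a small sketch. The names `PeriodicTask` and `group_by_lock_if_applicable` are hypothetical, and the sketch assumes that every task uses exactly one mutex lock, as in the simple task model.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Sequence


class PeriodicTask(NamedTuple):
    name: str
    period: float
    lock: str  # the single mutex lock used by the task


def group_by_lock_if_applicable(tasks: Sequence[PeriodicTask]) -> Optional[Dict[str, List[str]]]:
    """Group task names by lock, one group per DAG to be constructed.

    Returns None if some lock is shared among tasks with different periods,
    i.e. if the directly applicable special case does not hold.
    """
    groups = defaultdict(list)
    for task in tasks:
        groups[task.lock].append(task)
    for members in groups.values():
        if len({t.period for t in members}) > 1:
            return None
    return {lock: [t.name for t in members] for lock, members in groups.items()}


# Hypothetical task set: S1 is shared by two tasks with period 10, S2 by one with period 20.
tasks = [PeriodicTask("t1", 10, "S1"), PeriodicTask("t2", 10, "S1"),
         PeriodicTask("t3", 20, "S2")]
print(group_by_lock_if_applicable(tasks))
```

Each returned group would then be turned into one DAG and handed to a multiprocessor DAG scheduler of choice.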