Online Energy-Aware I/O Device Scheduling for Hard Real-Time Systems with Shared Resources


Abstract

The challenge in conserving energy in embedded real-time systems is to reduce power consumption while preserving temporal correctness. Previous research has focused on power conservation for the processor, while power conservation for I/O devices has received little attention. In this paper, we analyze the problem of online energy-aware I/O scheduling for hard real-time systems based on the preemptive periodic task model with non-preemptive shared resources. We propose two online energy-aware I/O device scheduling algorithms: Conservative Energy-Aware EDF (CEA-EDF) and Enhanced Aggressive Shut Down (EASD). The CEA-EDF algorithm makes conservative predictions of device usage and guarantees that a device is in the active state at or before the time the job that requires it is released. The EASD algorithm utilizes device slack to perform device power state transitions to save energy, without jeopardizing temporal correctness. Both algorithms are preemptive but support non-preemptive shared critical regions. An evaluation of the two approaches shows that both yield significant energy savings compared to using no Dynamic Power Management (DPM) techniques. The actual savings depend on the task set, the shared devices, and the power requirements of the devices. EASD provides better power savings than CEA-EDF. However, CEA-EDF has very low overhead and performs comparably to EASD when the system workload is low.

1 Introduction

In recent years, many embedded real-time systems have emerged with energy conservation requirements. Most of these systems consist of a microprocessor with I/O devices and batteries with limited power capacity. Therefore, aggressive energy conservation techniques are needed to extend their lifetimes.
Traditionally, the research community has focused on processor-based power management techniques, with many articles published on processor energy conservation. On the other hand, research on energy conservation for I/O devices has received little

attention. In practice, however, I/O devices are also important power consumers, yet they typically support fewer power states than processors. At any given instant, most devices can only be in one of two states: active or idle. To increase energy savings, the time for which a device is idle must be increased, but I/O devices take much longer than processors to perform power state transitions; a DSP device can take as long as 500 ms to switch states [19]. Furthermore, energy consumption during state transitions is not negligible, so too many state switches may increase power consumption rather than decrease it. The problem of saving energy for I/O devices in hard real-time systems thus poses a dilemma: we want to shut down a device whenever it is not being used, but at the risk of turning devices back on too late, which makes some jobs miss their deadlines, or of causing unnecessary state switches that waste energy. We discuss this in detail in Section 3.

In this paper, we analyze the problem of energy-aware I/O scheduling for hard real-time systems based on the preemptive periodic task model with non-preemptive shared resources. Without knowing the actual job execution times a priori, an optimal solution is not possible for either online or offline scheduling algorithms; here we define optimal as achieving the maximum energy savings for a task set. The actual savings depend on the task set, the actual execution times, the shared devices, and the power requirements of the devices. Two online scheduling algorithms that support shared resources are proposed: Conservative Energy-Aware EDF (CEA-EDF) and Enhanced Aggressive Shut Down (EASD). Both algorithms use Earliest Deadline First (EDF) [6] to schedule jobs and the Stack Resource Policy (SRP) [2] to control access to shared resources, which are typically granted to jobs on a non-preemptive basis and used in a mutually exclusive manner.
When performing preemptive scheduling with I/O devices, the devices become important shared resources whose access must be carefully managed. For example, a job that performs an uninterruptible I/O operation can block the execution of all jobs with higher priorities; thus the time for the uninterruptible I/O operation needs to be treated as a non-preemptive resource access. Other resources besides I/O devices include critical sections of code, reader/writer buffers, etc. The rest of this paper is organized as follows. Section 2 discusses related work. The problem of energy-aware I/O device scheduling is analyzed in Section 3. Section 4 describes the proposed algorithms. Section 5 describes how we evaluated our system and presents the results. Section 6 presents our conclusions and describes future work.

2 Related Work

In the past decade, much research has been conducted on low-power design methodologies for real-time embedded systems. For hard real-time systems, the research has focused primarily on reducing the power consumption of the processor; research on power conservation technologies for I/O devices, though important, has received little attention. Most Dynamic Power Management (DPM) techniques for devices are based on switching a device to a low-power state (or shutting it down) during an idle interval. DPM techniques for I/O devices in non-real-time systems focus on switching the devices into low-power states based on various policies (e.g., [9, 10, 8, 5, 23]). These strategies cannot be directly applied to real-time systems because of their non-deterministic nature. Some energy-aware I/O scheduling algorithms [18, 19, 20, 21] have been developed for hard real-time systems. Among them, [18, 19, 20] are non-preemptive methods, which are known to have limitations. With non-preemptive scheduling, a higher-priority task that has been released might have to wait a long time to run (until the current task gives up the CPU). This reduces the set of tasks that the scheduler can support with hard temporal guarantees. For example, non-preemptive scheduling algorithms cannot support any task set in which one task has a period shorter than or equal to the Worst Case Execution Time (WCET) of another task. For this reason, most commercial real-time kernels support preemptive task scheduling. In [18], Swaminathan et al. presented the Low Energy Device Scheduler (LEDES) for energy-aware I/O device scheduling in hard real-time systems. LEDES takes as input a pre-determined task schedule and a device-usage list for each task and generates a sequence of sleep/working states for each device. LEDES determines this sequence such that the energy consumed by the devices is minimized while guaranteeing that no task misses its deadline.
However, LEDES differs from our work in that it assumes that scheduling points always occur at task start or completion times; in other words, it can only support non-preemptive task scheduling. Another assumption is that the execution times of all tasks are greater than the transition times of the required devices. This assumption may not hold if some required devices have relatively large transition delays, e.g., disks. An extension of LEDES to handle I/O devices with multiple power states is presented in [20] by Swaminathan and Chakrabarty. The Multi-state Constrained Low Energy Scheduler (MUSCLES) takes as input a pre-computed task schedule and a per-task device-usage list and generates a sequence of power state switching times for I/O devices while guaranteeing that real-time constraints are not violated. MUSCLES is also a non-preemptive method. The pruning-based scheduling algorithm, Energy-optimal Device Scheduler (EDS), is an offline method in

which jobs are rearranged to find the minimum-energy task schedule [19]. EDS generates a schedule tree by selectively pruning its branches; pruning is done based on both temporal and energy constraints. Like LEDES and MUSCLES, EDS can only support non-preemptive scheduling. The only known published energy-aware algorithm for preemptive schedules, Maximum Device Overlap (MDO), is an offline method proposed by the same authors in [21]. The MDO algorithm uses a real-time scheduling algorithm, e.g., EDF or RM, to generate a feasible real-time job schedule, and then iteratively swaps job segments to reduce the energy consumed in device power state transitions. After the heuristic-based job schedule is generated, the device schedule is extracted; that is, device power state transition actions and times are recorded prior to runtime and used at runtime. A deficiency of the MDO algorithm is that it does not explicitly address resource blocking. It is usually impossible to estimate in the offline phase when resource blocking will happen, so it is hard to integrate a resource access policy into MDO. Without considering resource blocking, a feasible offline heuristic job schedule may result in an invalid online schedule, especially with swapping of job segments. Another problem with MDO is that it does not consider the situation in which actual job execution times are less than the WCET; the schedule is generated using each job's WCET. Even without resource blocking, the actual job executions can be very different from the pre-generated job schedule, and a fixed device schedule cannot effectively adapt to actual job executions. This problem is further discussed in Section 5. The CEA-EDF and EASD algorithms proposed in this paper remove these drawbacks by providing energy-saving scheduling for periodic task sets that have feasible preemptive schedules with blocking for shared resources.
To the best of our knowledge, no previous publication has addressed this problem. Another advantage of CEA-EDF and EASD over existing algorithms is that they support actual execution times less than the WCET: unused WCET is dynamically reclaimed to increase energy savings.

3 Problem Description

Modern I/O devices usually have at least two power states: active and idle. To save energy, a device can be switched to the idle state when it is not in use. In a real-time system, in order to guarantee that jobs meet their deadlines, a device cannot be made idle without knowing when it will next be requested by a job; however, the precise time at which an application requests a device from the operating system is usually not known. Even without knowing the exact request times, we can safely assume that devices are requested within the execution of the job making the request. Throughout the paper, we assume that task scheduling is based on EDF and resource access is based on SRP.

The EDF algorithm is a well-known optimal scheduling algorithm. SRP has two advantages over other resource access policies: (1) it has low context switch overhead; no job is ever blocked once its execution starts, and no job ever suffers more than two context switches, whereas for other policies, such as the Priority-Ceiling Protocol (PCP) [14], four context switches can occur if a job requires one or more resources. (2) A job can be blocked for at most the duration of one critical section, so the blocking time is bounded. The non-preemptable segment of a job is called a critical section.

3.1 Preliminaries

Suppose that the set of devices and resources required by each task during its execution is specified along with the temporal parameters of a periodic task set. More formally, given a periodic task set with deadlines equal to periods, τ = {T_1, T_2, ..., T_n}, let task T_i be specified by the four-tuple (P(T_i), wcet(T_i), Dev(T_i), Res(T_i)), where P(T_i) is the period, wcet(T_i) is the WCET, Dev(T_i) = {λ_1, λ_2, ..., λ_m} is the set of devices required by task T_i, and Res(T_i) = {r_1, r_2, ..., r_n} is the set of resources required by the task. Note that Dev(T_i) specifies the physical devices required by a task T_i, while Res(T_i) specifies how these devices appear as shared resources to task T_i. A non-preemptive device may appear as a shared resource with different access times to different tasks, and a preemptive device may be included in Dev(T_i) but not in Res(T_i). Furthermore, if the I/O operation of a device λ_i ∈ Dev(T_i) is non-interruptible, then a shared resource representing λ_i should be put in the resource set of T_i and of all tasks with higher priorities, to prevent a possible preemption.
For example, suppose that task T_i performs a non-interruptible I/O operation for 10 ms during its execution. Then a resource with an access time of 10 ms should be put in the resource set of T_i, and the same resource with an access time of 0 should be put in the resource set of all tasks that may preempt T_i (e.g., tasks with shorter periods under EDF). T_i can be preempted by higher priority tasks at any time before or after this I/O operation. In summary, a shared resource is a general concept in this model: if a section of task T_i is non-preemptive with respect to a subset of tasks α, then this section should be treated as a shared resource by T_i and all tasks in α. A task is an infinite sequence of jobs released every P(T_i) time units. We refer to the j-th job of a task T_i as J_i,j. We let Dev(J_i,j) denote the set of devices required by J_i,j; throughout this paper, Dev(J_i,j) = Dev(T_i). We let et(J_i,j, [t, t']) denote the execution time of job J_i,j during the interval [t, t']. It follows that et(J_i,j, [0, t]) is the actual execution time of J_i,j if t is equal to or larger than the time at which J_i,j finishes its execution. Furthermore, the priorities of all jobs are assigned according to EDF. For any two jobs, the job with the earlier deadline has the higher priority. If two jobs have equal deadlines, the job with the earlier release time has
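As a concrete illustration of this task model, the four-tuple and the resource-set convention for non-interruptible I/O can be sketched as follows. This is a minimal sketch: the class and field names, and the example device name "disk", are our own illustrative choices, not from the paper.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Task:
    period: int                                    # P(T_i); deadlines equal periods
    wcet: int                                      # wcet(T_i), worst-case execution time
    devices: frozenset = frozenset()               # Dev(T_i): physical devices used
    resources: dict = field(default_factory=dict)  # Res(T_i): resource -> access time

# T_i performs a non-interruptible I/O operation on "disk" for 10 ms, so a
# resource standing for that operation gets access time 10 in Res(T_i) ...
t_i = Task(period=30, wcet=12,
           devices=frozenset({"disk"}),
           resources={"disk_io": 10})

# ... and access time 0 in the resource set of every task that may preempt
# T_i (under EDF, tasks with shorter periods).
t_high = Task(period=12, wcet=2, resources={"disk_io": 0})
```

The zero access time in `t_high` records only that the resource participates in preemption-ceiling computations for the higher-priority task, without contributing blocking time.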

a higher priority. The original assigned priority of a job J_i,j is denoted Org_Prio(J_i,j). Note that Org_Prio(J_i,j) is not changed during execution, though the actual priority may change due to priority inheritance under SRP. Associated with each device λ_i are the following parameters: the transition time from the idle state to the active state, t_wu(λ_i); the transition time from the active state to the idle state, t_sd(λ_i); the energy consumed per unit time in the active state, P_active(λ_i); the energy consumed per unit time in the idle state, P_idle(λ_i); the energy consumed per unit time during the transition from the active state to the idle state, P_sd(λ_i); and the energy consumed per unit time during the transition from the idle state to the active state, P_wu(λ_i). We assume that for any device, a state switch can only be initiated when the device is in a stable state, i.e., the idle state or the active state. We use these parameters in the problem discussion and algorithm descriptions.

3.2 Motivation

The generalized problem that we aim to solve in this paper can be stated as: given a periodic task set τ = {T_1, T_2, ..., T_n}, with T_i = (P(T_i), wcet(T_i), Dev(T_i), Res(T_i)), is there a preemptive schedule that meets all deadlines and also reduces the energy consumed by the I/O system? The total energy consumed by a device λ_i over the hyperperiod H is given by

    E_λi = E_active + E_idle + E_sw    (1)

where E_active is the energy consumed when λ_i is in the active state, E_idle is the energy consumed when it is in the idle state, and E_sw is the energy consumed during state transitions. Let T_active(λ_i), T_idle(λ_i), T_wu(λ_i), and T_sd(λ_i) denote the time that device λ_i spends active, idle, and in wake-up/shut-down state transitions, respectively.
Then we have E_active = P_active(λ_i) · T_active(λ_i), E_idle = P_idle(λ_i) · T_idle(λ_i), and E_sw = P_wu(λ_i) · T_wu(λ_i) + P_sd(λ_i) · T_sd(λ_i). In addition, for most devices,

    P_active(λ_i), P_wu(λ_i), P_sd(λ_i) > P_idle(λ_i)    (2)

From Equations (1) and (2), it can be seen that to increase energy savings, an energy-aware scheduler should make it its first priority to decrease T_active(λ_i) as well as the number of power state transitions. However, it is usually hard to decrease both at the same time without affecting temporal correctness. For example, consider the obvious approach of aggressively shutting down devices whenever they are not needed, which we call the Aggressive Shut Down (ASD) algorithm. ASD reduces T_active as much as possible, but may increase energy consumption because it may introduce too many device state switches.
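The energy accounting of Equations (1) and (2) can be sketched directly; the function and parameter names, and the numbers, are our own illustrative choices. The example shows why aggressive shutdown can backfire when transition power is high:

```python
def device_energy(p_active, p_idle, p_wu, p_sd,
                  t_active, t_idle, t_wu, t_sd):
    """E = E_active + E_idle + E_sw, per Equation (1)."""
    e_active = p_active * t_active
    e_idle = p_idle * t_idle
    e_sw = p_wu * t_wu + p_sd * t_sd   # wake-up plus shut-down energy
    return e_active + e_idle + e_sw

# A 20-unit gap with an expensive-to-switch device (illustrative numbers):
keep_active = device_energy(2.0, 0.1, 5.0, 5.0,
                            t_active=20, t_idle=0, t_wu=0, t_sd=0)
shut_down = device_energy(2.0, 0.1, 5.0, 5.0,
                          t_active=0, t_idle=4, t_wu=8, t_sd=8)
print(keep_active, shut_down)   # 40.0 80.4: shutting down wastes energy here
```

Here the two transitions alone cost twice what staying active would, which is exactly the second constraint of ASD discussed below.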

Figure 1. The device state transition delay can cause system failure even when the system utilization is low. T_1 = {12, 2, ∅, ∅}; T_2 = {30, 5, {λ_k}, ∅}; t_wu(λ_k) = t_sd(λ_k) = 8. The system utilization is 33.3%.

The optimal solution to this problem would arrange job executions so that energy consumption is minimized while all tasks are still guaranteed to meet their deadlines. However, this is an NP-hard problem, and an efficient optimal solution is not possible for online scheduling due to its huge overhead. At first thought, an offline approach seems better because it can use pre-calculated task schedules. However, it is difficult to integrate a resource access policy into an offline approach because it is hard to predict, in the offline phase, the exact points at which jobs access resources. It is possible that a seemingly feasible offline job schedule causes jobs to miss their deadlines at runtime. Moreover, an offline approach alone can be inefficient, since it can only use the worst-case execution time of each task and cannot easily adapt to actual job executions. Compared to offline methods, it is simple to integrate a resource access policy into an online algorithm. Moreover, online algorithms can better adapt to run-time conditions: unused worst-case execution time can be exploited to increase T_idle for devices. In this paper, we seek online solutions to this problem. As discussed before, the ASD algorithm is a straightforward online method. However, it cannot be directly applied to hard real-time systems due to two inherent constraints:

1. To ensure timing constraints are met with ASD, device switch times must be included in each task's WCET, which may compromise the system's schedulability. Figure 1 shows an example where a system is not schedulable with the ASD algorithm, even though the CPU utilization is only 33.3%.
2. The ASD algorithm does not consider the energy consumption associated with device state transitions. Since the energy consumed by a state switch can be high, ASD may waste energy. Consider a device with very high switch power costs: it is easy to find scenarios where the ASD algorithm consumes more energy than keeping the device active all the time.

3.3 Approach

Despite these two constraints, the ASD algorithm can achieve excellent energy savings in many systems, since it reduces the T_active of devices as much as possible. The starting point of this work is to overcome the two constraints of the ASD algorithm and make it applicable to hard real-time systems. To this end, our solutions have two objectives: (1) the algorithms are applicable to all task sets schedulable with EDF and SRP; (2) the algorithms account for the energy consumed by device state switches. Therefore, our algorithms can guarantee energy savings in any feasible situation. Two online preemptive scheduling algorithms that support shared resources are proposed: Conservative Energy-Aware EDF (CEA-EDF) and Enhanced Aggressive Shut Down (EASD). For the first problem, CEA-EDF guarantees that a device is in the active state when a job requiring the device is released. Although this seems overly conservative, the algorithm achieves significant energy savings and can be implemented with very little overhead, yielding a good performance/cost ratio. EASD employs a different approach: it keeps track of the amount of time a device can be kept inactive without causing a job to miss its deadline, called device slack in this paper. A device is allowed to switch its state only when the device slack is large enough. A detailed discussion is presented in Section 4. Both algorithms utilize the concept of break-even time [3], which represents the minimum inactivity time required to compensate for the cost of entering and exiting the idle state. For example, suppose a device λ_k is active at time t and it is known that no job requires it during [t, t + Δt]. To save energy, λ_k can be switched to the idle state at time t and switched back to the active state at time t + Δt, provided Δt is larger than the transition time t_sd(λ_k) + t_wu(λ_k).
The energy expended by the device during this period is the sum of the energy expended during the transitions, E_wu and E_sd, and the energy expended in the idle state, E_idle. However, for a device λ_k that expends considerable energy during state transitions, it is possible that the device consumes less energy if it is kept active during this period; that is, E_wu + E_sd + E_idle > P_active(λ_k) · Δt. Therefore, λ_k needs to stay in the idle state long enough to save energy. Let BE(λ_i) denote the break-even time of device λ_i. Knowing the energy expended during transitions, E_wu(λ_k) = P_wu(λ_k) · t_wu(λ_k) and E_sd(λ_k) = P_sd(λ_k) · t_sd(λ_k), as well as the transition delay t_sw(λ_k) = t_wu(λ_k) + t_sd(λ_k), we can calculate the break-even time BE(λ_k) as

follows. Setting the energy consumed while staying active equal to the energy consumed by shutting down, idling, and waking up gives

    P_active(λ_k) · BE(λ_k) = E_wu(λ_k) + E_sd(λ_k) + P_idle(λ_k) · (BE(λ_k) - t_sw(λ_k))

which yields

    BE(λ_k) = (E_wu(λ_k) + E_sd(λ_k) - P_idle(λ_k) · t_sw(λ_k)) / (P_active(λ_k) - P_idle(λ_k))

Note that the break-even time also has to be at least the transition delay t_sw(λ_k). So the break-even time is given by

    BE(λ_k) = max(t_sw(λ_k), (E_wu(λ_k) + E_sd(λ_k) - P_idle(λ_k) · t_sw(λ_k)) / (P_active(λ_k) - P_idle(λ_k)))    (3)

It is clear that if a device is idle for less than the break-even time, it is not worth performing the state switch. Therefore, our approach bases device state transition decisions on the break-even time rather than on the device state transition delay alone. At this point, an obvious improvement to the ASD algorithm can be made so that it uses the break-even time to overcome the second constraint. The Switch-Aware Aggressive Shut Down (SA-ASD) algorithm, which makes this enhancement to ASD, and a sufficient schedulability condition are presented in Appendix D. Systems that satisfy the schedulability condition for SA-ASD should use SA-ASD rather than the proposed CEA-EDF and EASD algorithms, because of its lower scheduling overhead. Note, however, that SA-ASD still suffers from the first constraint, which is overcome by both CEA-EDF and EASD.

4 Algorithms

This section introduces two OS-directed, real-time DPM techniques applicable to I/O devices, based on EDF [6] and SRP [2]: CEA-EDF and EASD. We first briefly review how SRP works with EDF.

4.1 Review of EDF and SRP

Each task T_i is assigned a preemption level PL(T_i), which is the reciprocal of the task's period. The preemption ceiling of any resource r_i is the highest preemption level of all the tasks that require r_i. We use Π(t) to denote the current ceiling of the system, which is the highest preemption-level ceiling of all the resources in use at time t.
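Equation (3) can be sketched as follows; this is a minimal sketch with illustrative parameter values, and the function name is our own:

```python
def break_even(p_active, p_idle, p_wu, p_sd, t_wu, t_sd):
    """Break-even time BE(λ_k) per Equation (3)."""
    t_sw = t_wu + t_sd                 # total transition delay
    e_wu = p_wu * t_wu                 # wake-up energy
    e_sd = p_sd * t_sd                 # shut-down energy
    be = (e_wu + e_sd - p_idle * t_sw) / (p_active - p_idle)
    return max(t_sw, be)               # never shorter than the transition delay

# With P_active = 2, P_idle = 1, P_wu = P_sd = 4 and 1-unit transitions:
# (4 + 4 - 1*2) / (2 - 1) = 6, which exceeds t_sw = 2.
print(break_even(2.0, 1.0, 4.0, 4.0, 1.0, 1.0))   # 6.0
```

When transitions are cheap, the `max` makes the transition delay itself the binding constraint, matching the note above that the break-even time cannot be smaller than t_sw(λ_k).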
Φ is a nonexistent preemption level that is lower than the lowest preemption level of all tasks. The rules can be stated as follows [7].

1. Update of the Current Ceiling: Whenever all resources are free, the preemption ceiling of the system is Φ; otherwise, the preemption ceiling Π(t) is the highest preemption-level ceiling of all the resources

in use at time t. The preemption ceiling of the system is thus updated whenever a resource is allocated or freed (at the end of a critical section).

2. Scheduling Rule: After a job is released, it is blocked from starting execution until its preemption level is higher than both the current ceiling Π(t) of the system and the preemption level of the executing job. At any time t, jobs that are not blocked are scheduled on the processor according to their deadlines.

3. Allocation Rule: Whenever a job requests a resource, the resource is allocated to it.

4. Priority-Inheritance Rule: When some job is blocked from starting, the blocking job inherits the highest priority of all blocked jobs.

When scheduling with EDF and SRP, a job can be blocked for at most the duration of one critical section, which includes regions of shared resource access. The calculation of the maximal blocking duration can be found in [7]; the computation is done offline and the result is used at runtime. We use B(T_i) to denote the blocking duration for task T_i. In addition, the maximal blocking duration B(J_i,j) of each job J_i,j of task T_i is equal to B(T_i).

4.2 CEA-EDF

CEA-EDF is a simple, low-overhead energy-aware scheduling algorithm for hard real-time systems. All devices that a job needs are active at or before the job's release. Thus devices can be safely shut down without affecting the schedulability of tasks. Before describing CEA-EDF, we define the next device request time and the time to next device request, which are used to keep track of the earliest time at which a device is required.

Definition 4.1. Next Device Request Time. The next device request time, denoted NextDevReqTime(λ_k, t), is the earliest time at which device λ_k is requested by an uncompleted job.
Since a job can only use a device after the job is released, the next device request time of a device λ_k is given by

    NextDevReqTime(λ_k, t) = min{ R(J_i,j) : λ_k ∈ Dev(J_i,j) and J_i,j is not completed at time t }

where R(J_i,j) is the release time of job J_i,j and Dev(J_i,j) is the set of devices required by J_i,j.

Definition 4.2. Time To Next Device Request. The time to next device request for device λ_k at time t, denoted TimeToNextDevReq(λ_k, t), is the time from the current time t to the next device request time of λ_k:

    TimeToNextDevReq(λ_k, t) = NextDevReqTime(λ_k, t) - t
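Definitions 4.1 and 4.2 translate directly into code. This is a minimal sketch; the tuple representation of jobs is our own assumption:

```python
def next_dev_req_time(device, jobs):
    """NextDevReqTime(λ_k, t): earliest release time among uncompleted jobs
    that require `device` (Definition 4.1). `jobs` is an iterable of
    (release_time, required_devices, completed) tuples."""
    releases = [r for r, devs, done in jobs if device in devs and not done]
    return min(releases) if releases else None

def time_to_next_dev_req(device, jobs, t):
    """TimeToNextDevReq(λ_k, t) = NextDevReqTime(λ_k, t) - t (Definition 4.2)."""
    nxt = next_dev_req_time(device, jobs)
    return None if nxt is None else nxt - t

# Current jobs requiring λ_1 with release times 160, 200, 220 (cf. Table 1):
jobs = [(160, {"λ1", "λ2"}, False), (200, {"λ1"}, False), (220, {"λ1"}, False)]
print(next_dev_req_time("λ1", jobs))          # 160
print(time_to_next_dev_req("λ1", jobs, 150))  # 10
```

Returning `None` for a device no uncompleted job requires is our own convention for the case where the minimum is taken over an empty set.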

Device | Tasks requiring λ_k | Current jobs | R(J_i,j) | NextDevReqTime(λ_k, t) | Power-up time Up(λ_k)
λ_1 | {T_1, T_3, T_5} | {J_1,20, J_3,15, J_5,78} | {160, 200, 220} | 160 | -
λ_2 | {T_1, T_2} | {J_1,20, J_2,25} | {160, 210} | 160 | -
λ_3 | {T_3, T_4, T_6} | {J_3,15, J_4,25, J_6,18} | {200, 215, 207} | 200 | -

Table 1. Device usage table. The CEA-EDF scheduler uses this table to keep track of the next device request time for each device, and to power up an idle device λ_k based on the maintained power-up time Up(λ_k).

Function TimeToNextDevReq()
Input: current system time t and the currently executing job J_i,j
Output: updated time to next device request for all devices
If (t is the instant when job J_i,j is completed)
    α ← α - J_i,j + J_i,j+1                                   // α is the set of current jobs requiring λ_k
    For each λ_k ∈ Dev(J_i,j):
        NextDevReqTime(λ_k, t) ← min(R(J_m,n)), J_m,n ∈ α     // update the next device request time
        TimeToNextDevReq(λ_k, t) ← NextDevReqTime(λ_k, t) - t
Else
    // do nothing
End

Figure 2. Pseudocode for the TimeToNextDevReq algorithm, which updates the time to next device request for devices.

The CEA-EDF scheduler maintains a table for each device, as shown in Table 1. The current job of a task is the uncompleted job with the earliest deadline among all jobs of the task. For example, suppose J_1,1 is the first job of a task T_1, released at time 0 and completed at time 10. Then the current job of T_1 is J_1,1 during [0, 10). Suppose the second job of T_1, J_1,2, is released at time 40 and finishes at time 50. Then J_1,2 is the current job of T_1 during [10, 50). By Definition 4.1, the next device request time NextDevReqTime(λ_k, t) is the minimal release time of all current jobs that require device λ_k. Under CEA-EDF, a device λ_i is switched to the low-power state at time t when TimeToNextDevReq(λ_i, t) > BE(λ_i), where BE(λ_i) is the break-even time for device λ_i, computed using Equation (3).
CEA-EDF sets a power-up time, Up(λ_i), for device λ_i when λ_i is switched to the idle state. Any idle device is switched back to the active state when the power-up time Up(λ_i) equals the current time t. The CEA-EDF scheduling algorithm is described in Figure 3; it is invoked at scheduling points and when a power-up time is reached. We define scheduling points as the time instants at which jobs are released, complete, or exit critical sections. An example of CEA-EDF scheduling is illustrated in Figure 4. Our experiments in Section 5 show that CEA-EDF scheduling is effective at saving energy, especially when the system workload is low. Meanwhile, the implementation of CEA-EDF is simple.
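The device-management part of this policy (shut a device down only when the time to its next request exceeds its break-even time, and arm a power-up timer so it is active again by the release) can be sketched as follows. This is a minimal sketch; the dictionary-based device state is our own representation, not the paper's data structures:

```python
def cea_edf_device_step(t, states, up_timers, ttn, ndr, be, t_wu):
    """One CEA-EDF device-scheduling step at time t.
    states: device -> "active" | "idle"; up_timers: device -> power-up time;
    ttn/ndr: TimeToNextDevReq / NextDevReqTime per device;
    be/t_wu: break-even time and wake-up delay per device."""
    for dev in states:
        # Shut down only if the device can stay idle past its break-even time.
        if states[dev] == "active" and ttn[dev] > be[dev]:
            states[dev] = "idle"
            up_timers[dev] = ndr[dev] - t_wu[dev]   # active again by the release
        # Wake a device whose power-up time has arrived.
        elif states[dev] == "idle" and up_timers.get(dev) == t:
            states[dev] = "active"
            up_timers[dev] = None                   # clear the power-up timer

states = {"λ1": "active"}
up = {}
cea_edf_device_step(100, states, up, ttn={"λ1": 60}, ndr={"λ1": 160},
                    be={"λ1": 18}, t_wu={"λ1": 8})
print(states["λ1"], up["λ1"])   # idle 152
```

Because the timer is set to NextDevReqTime minus the wake-up delay, the device finishes its transition exactly when the requesting job is released, which is the guarantee CEA-EDF relies on.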

Preprocessing:
    Compute the break-even time BE(λ_k) (1 ≤ k ≤ m) for each device, as shown in Equation (3).
    Initialize the next device request time NextDevReqTime(λ_k, 0) (1 ≤ k ≤ m) for each device, as defined in Definition 4.1.
Device scheduling at time t:
    If (t is the instant when job J_i,j is completed)
        For each active device λ_k with TimeToNextDevReq(λ_k, t) > BE(λ_k):
            λ_k ← idle
            Up(λ_k) ← NextDevReqTime(λ_k, t) - t_wu(λ_k)   // set the power-up timer for λ_k
    End
    For each idle device λ_k with Up(λ_k) = t:             // switch λ_k to active when the current time is its power-up time
        λ_k ← active
        Up(λ_k) ← -1                                       // clear the power-up timer for λ_k
    End
    Schedule jobs by EDF (with SRP).

Figure 3. The CEA-EDF algorithm. BE(λ_k) is the break-even time for λ_k; t_wu(λ_k) is the transition delay from the idle state to the active state; Up(λ_k) is the power-up time set for λ_k, at which the device will be powered up.

Figure 4. CEA-EDF scheduling example: (a) the job schedule from EDF. J_1,1 is released at 6 and uses device λ_1; J_2,1 is released at 2 and uses device λ_2; J_1,1 has a higher priority than J_2,1. (b) The device state transitions under the CEA-EDF algorithm.

4.3 EASD

As discussed in Section 3, an energy-aware I/O device scheduler should reduce T_active(λ_i), the time that a device λ_i is in the active state, since a device usually has the highest energy consumption rate in the active state. The ASD algorithm reduces T_active(λ_i) for all λ_i. However, some task sets cannot use ASD because they cannot satisfy the schedulability conditions with WCETs that include device transition times. On the other hand, the CEA-EDF algorithm can be applied to any schedulable task set, but it is not as efficient as possible, since it conservatively keeps some devices active while the jobs requiring those devices are not currently executing.
The EASD algorithm, which is based on the ASD algorithm, addresses these limitations by keeping track of device slack, which is defined as follows.

Figure 5. Device slack examples. T_1 = {10, 4, ∅, ∅}; T_2 = {30, 4, {λ_k}, ∅}; that is, λ_k ∈ Dev(T_2). For device λ_k, t_sd(λ_k) = t_wu(λ_k) = 8 and BE(λ_k) = 18. (a) Job schedule and device schedule. (b) The device access delay for λ_k. (c) The device-dependent system slack for λ_k. (d) The device slack for λ_k, which is the sum of the device access delay shown in (b) and the device-dependent system slack shown in (c).

Definition 4.3. Device slack. The device slack is the maximal length of time that a device λ_i can be inactive without causing any job to miss its deadline. We let DevSlack(λ_i, t) denote the device slack of device λ_i at time t.

The energy savings provided by EASD are closely related to the amount of available device slack: the more device slack is exploited, the more opportunities there are to put devices in the idle state to save energy. Thus exploiting available device slack is the focus of EASD. Device slack for a device λ_k comes from different sources. As discussed for the CEA-EDF algorithm, the time to next device request is one source of available device slack, and CEA-EDF uses it to keep devices idle to save energy. However, other sources of device slack exist. For example, another source of device slack for a device λ_k is the execution of higher-priority jobs that do not require λ_k. In this case, jobs requiring λ_k cannot execute, which creates slack for the device. As shown in Figure 5, job J_1,1 occupies the CPU during [0, 4], and the interval [0, 4] becomes part of the device slack for device λ_k. This

kind of slack does not introduce idle intervals and thus does not jeopardize the temporal correctness of the system. In a system whose utilization is less than 1, the execution of a job can be postponed without jeopardizing temporal correctness, thus creating additional device slack. As shown in Figure 5, idle intervals are inserted in the interval [0, 22] because J_{2,1} is delayed by the state transition of device λ_k. However, this kind of slack must be carefully managed to maintain system schedulability. The EASD algorithm utilizes this kind of device slack while still guaranteeing that every job meets its deadline. In summary, three sources of device slack are identified in this paper: device access delay, device dependent system slack, and time to next device request. The time to next device request is defined in Definition 4.2.

Definition 4.4. Device access delay. The device access delay for a device λ_k is the time during which jobs requiring λ_k cannot execute because of the execution of higher priority jobs that do not need λ_k. The device access delay for a device λ_k at time t is denoted DevAccessDelay(λ_k, t).

Definition 4.5. Device dependent system slack. The device dependent system slack is the maximum amount of time that the CPU can be idle before the execution of any job requiring device λ_k without causing any job to miss its deadline. The device dependent system slack for a device λ_k at time t is denoted DevDepSysSlack(λ_k, t).

As shown in Figure 5, the device access delay and the device dependent system slack for device λ_k can be combined to form the device slack for λ_k because they represent non-overlapping slack. In the example shown in Figure 5, the device dependent system slack for device λ_k at time 0 is 14. That is, idle intervals with a total length of 14 time units can be inserted before the execution of J_{2,1} without causing any job to miss its deadline.
Note that the 14 units of idle time are separated into two intervals, namely [4, 10] and [14, 22], as shown in Figure 5(c). If a single idle interval of length 14 were inserted at time 4, job J_{1,2} would miss its deadline. An additional 8 time units of device slack come from the execution of jobs J_{1,1} and J_{1,2}, as shown in Figure 5(b). These two kinds of device slack do not decrease at the same time. In contrast, the time to next device request cannot be combined with either the device access delay or the device dependent system slack, because they might overlap. Therefore, the device slack of a device λ_k is given by

    DevSlack(λ_k, t) = max(TimeToNextDevReq(λ_k, t), DevAccessDelay(λ_k, t) + DevDepSysSlack(λ_k, t))    (4)

Given the device slack for each device at time t, it is straightforward to implement the EASD algorithm, which is presented in Figure 6. The algorithm contains three parts: (1) update the device slack for all devices; (2) perform device state transitions; and (3) schedule jobs with EDF and SRP.
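Equation (4) is a single max over the combined CPU-contention slack and the time to the next device request; a small sketch, where the numbers 8 and 14 echo the device access delay and device dependent system slack of the Figure 5 example and the rest are illustrative:

```python
# Device slack per Equation (4): the device access delay and the device
# dependent system slack add (they never overlap), and the sum competes
# with the time to the next device request. Names are illustrative.

def dev_slack(time_to_next_req, access_delay, dep_sys_slack):
    return max(time_to_next_req, access_delay + dep_sys_slack)

# Contention-driven slack dominates (8 + 14 = 22 > 5):
print(dev_slack(time_to_next_req=5, access_delay=8, dep_sys_slack=14))   # 22
# A distant next request dominates instead:
print(dev_slack(time_to_next_req=30, access_delay=8, dep_sys_slack=14))  # 30
```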

The EASD algorithm
// J_exec is the job that is selected to occupy the CPU at time t.
Update the device slack for all devices at time t:
  If (t is a scheduling point)
    ∀λ_k, DevSlack(λ_k, t) ← max(TimeToNextDevReq(λ_k, t), DevAccessDelay(λ_k, t) + DevDepSysSlack(λ_k, t));
  Else   // t is not a scheduling point
    ∀λ_k, DevSlack(λ_k, t) ← DevSlack(λ_k, t − 1) − 1;
  End
Perform device state transitions at time t:
  If (∃λ_k, λ_k ∉ Dev(J_exec) and λ_k = active and DevSlack(λ_k, t) > BE(λ_k))
    λ_k ← idle;
    t_enterIdle(λ_k) ← t;   // the time that λ_k starts the state transition to idle
  End
  // The next condition ensures that λ_k has been idle long enough to compensate for the energy consumed in the state transition.
  If (∃λ_k, λ_k = idle and t − t_enterIdle(λ_k) ≥ BE(λ_k) − t_wu(λ_k))
    If (λ_k ∈ Dev(J_exec) or DevSlack(λ_k, t) ≤ t_wu(λ_k))
      λ_k ← active;
    End
  End
Schedule jobs with EDF(SRP).

Figure 6. The pseudocode for the EASD algorithm.

As shown in Equation (4), updating the device slack requires updating the next device request time, the device access delay, and the device dependent system slack. Note that these three quantities are computed only at scheduling points, i.e., the time instants of job completion, job release, and exits from critical sections. At time instants other than scheduling points, the device slack of every device is simply decreased by 1 per time unit. The algorithms to compute the device access delay and the device dependent system slack are provided in Appendix A. In those computations, we assume that there are n tasks in the system and that the current job of each task T_i at time t is J_ci, whose absolute deadline is denoted D(J_ci). Without loss of generality, suppose the current jobs J_c1 and J_cn have the earliest deadline D(J_c1) and the latest deadline D(J_cn) among all current jobs, respectively.
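The per-instant logic of Figure 6 splits into a slack update and a pair of transition rules. A hedged Python sketch of that structure, in which the function names and the exact form of the wake-up conditions are reconstructions of the figure rather than the authors' code:

```python
# Sketch of EASD's per-instant logic. `recompute(dev, t)` stands for a full
# Equation (4) evaluation; all names are illustrative.

def update_slack(slacks, t, is_sched_point, recompute):
    """slacks: dict mapping device -> slack value."""
    for dev in slacks:
        if is_sched_point:
            slacks[dev] = recompute(dev, t)   # full update at scheduling points
        else:
            slacks[dev] -= 1                  # otherwise just decay by one unit

def transition(dev, t, slack, exec_devs, be, t_wu, state, t_enter_idle):
    """Return the new (state, t_enter_idle) for one device."""
    if state == "active" and dev not in exec_devs and slack > be:
        return "idle", t                      # enough slack to amortize the switch
    if state == "idle" and t - t_enter_idle >= be - t_wu:
        # idle long enough to pay back the transition energy; wake up if the
        # running job needs the device or the remaining slack is nearly gone
        if dev in exec_devs or slack <= t_wu:
            return "active", t_enter_idle
    return state, t_enter_idle

slacks = {"d1": 25, "d2": 9}
update_slack(slacks, t=1, is_sched_point=False, recompute=None)
print(slacks)                                             # {'d1': 24, 'd2': 8}
print(transition("d1", t=0, slack=25, exec_devs=set(),
                 be=18, t_wu=8, state="active", t_enter_idle=None))  # ('idle', 0)
```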
In the example illustrated in Figure 5(a), we have n = 2, J_c1 = J_{1,2}, and J_cn = J_{2,1} at time 5. As shown in Appendix A, the device slack for all devices can be updated by examining each job J_i with Org_Prio(J_cn) ≤ Org_Prio(J_i) ≤ Org_Prio(J_c1). The worst case computational complexity of this algorithm at scheduling points is O(m + n + K), where m is the number of uncompleted jobs with priorities within [Org_Prio(J_cn), Org_Prio(J_c1)], n is the total number of tasks in the system, and K is the total number of devices. Let T_l be the task with the longest period and T_s be the task with the shortest period. In the worst case, m is a function of 2⌈P(T_l)/P(T_s)⌉(n − 1).

4.4. Schedulability

This section presents a sufficient schedulability condition for the CEA-EDF and EASD scheduling algorithms. The condition is the same as that used for the EDF algorithm with SRP [2]:

Theorem 4.1. Suppose n periodic tasks are sorted by their periods. They are schedulable by CEA-EDF and EASD if, for all k, 1 ≤ k ≤ n,

    Σ_{i=1}^{k} wcet(T_i)/P(T_i) + B(T_k)/P(T_k) ≤ 1,    (5)

where B(T_k) is the maximal length of time that a job of T_k can be blocked by accesses to non-preemptive resources, including both I/O device resources and non-I/O-device resources. Note that the device state transition delay is not included.

With the CEA-EDF algorithm, a device λ_k is guaranteed to be in the active state when any job requiring λ_k is released. Therefore, CEA-EDF does not affect the schedulability of any system; in other words, Theorem 4.1 holds for CEA-EDF. The problem for EASD is much more complex. With the EASD algorithm, a device is switched to the idle state when its device slack is larger than its break-even time. Therefore, there may be intervals in which the CPU is idle while pending jobs wait for required devices to be switched to the active state. Since the proof for the EASD algorithm requires knowledge of job slack and device dependent system slack, it is presented in Appendix B.

5 Evaluation

In this section, we present evaluation results for the CEA-EDF and EASD algorithms. Section 5.1 describes the evaluation methodology used in this study. Section 5.2 evaluates the algorithms under various system utilizations. Section 5.3 evaluates the ability of CEA-EDF and EASD to reclaim unused WCETs to save energy, and compares the performance of MDO with CEA-EDF and EASD. Section 5.4 compares the CEA-EDF and EASD algorithms.

5.1. Methodology

We evaluated the CEA-EDF and EASD algorithms using an event-driven simulator.
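The condition of Theorem 4.1 can be checked in one pass over the tasks in period order; a sketch, where the task tuples are illustrative:

```python
# Sufficient schedulability test of Theorem 4.1 (EDF with SRP): for every k,
# the cumulative utilization of the k shortest-period tasks plus the blocking
# term B(T_k)/P(T_k) must not exceed 1. Task tuples are illustrative.

def is_schedulable(tasks):
    """tasks: list of (period, wcet, blocking) tuples."""
    tasks = sorted(tasks, key=lambda t: t[0])  # sort by period, as the theorem assumes
    util = 0.0
    for period, wcet, blocking in tasks:
        util += wcet / period
        if util + blocking / period > 1.0:
            return False
    return True

print(is_schedulable([(10, 4, 0), (30, 4, 3)]))  # True  (0.4; then 0.53 + 0.1)
print(is_schedulable([(10, 6, 0), (20, 8, 2)]))  # False (0.6 + 0.4 + 0.1 > 1)
```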
This approach is consistent with the evaluation approaches adopted by other researchers for energy-aware I/O scheduling [18, 19, 20]. To better evaluate the two algorithms, we compute the minimal energy requirement, LOW-BOUND, for each simulation. The

LOW-BOUND is acquired by assuming that the time and energy overhead of device state transitions is 0: a device is shut off whenever it is not required by the currently executing job, and is powered up as soon as a job requiring it is executing. Therefore, the LOW-BOUND represents an energy consumption level that is not achievable by any scheduling algorithm. The power requirements and state switching times for devices were obtained from data sheets provided by the manufacturers. The devices used in the experiments are listed in Table 2.

Table 2. Device specifications, giving P_active (W), P_idle (W), P_wu/P_sd (W), and t_wu/t_sd (ms)¹ for each device: (1) Realtek Ethernet Chip [13]; (2) MaxStream Wireless module [11]; (3) IBM Microdrive [17]; (4) SST Flash SST39LF020 [16]; (5) SimpleTech Flash Card [15].

The normalized energy savings is used to evaluate the energy savings of the algorithms. It is the amount of energy saved under a DPM algorithm relative to the case in which no DPM technique is used, wherein all devices remain in the active state for the entire simulation. The normalized energy savings is computed using Equation (6):

    Normalized Energy Savings = 1 − (Energy with CEA-EDF or EASD) / (Energy with No DPM)    (6)

In all experiments, we used randomly generated task sets to evaluate the performance of the CEA-EDF and EASD algorithms. All task sets were pretested to satisfy the feasibility condition shown in Equation (5). Each generated task set contained 1 to 10 tasks, and each task required a random number (0 to 3) of devices from Table 2. Critical sections of all jobs were randomly generated. Other characteristics, including task periods and the best/worst case execution time ratio, are specified in each experiment. We repeated each experiment 500 times and report the mean value. Throughout the experiments, all jobs met their deadlines under both the CEA-EDF and EASD algorithms.
Although the worst case computational complexity of EASD is briefly discussed in Section 4.3, it may still be a concern that EASD has too much scheduling overhead in practice. We did not measure scheduling overhead on real systems, since all algorithms were evaluated in simulation. Instead, we compared the scheduling overhead of CEA-EDF and EASD with that of EDF(SRP) in our simulations, using the relative scheduling overhead:

    relative scheduling overhead = (scheduling overhead with CEA-EDF or EASD) / (scheduling overhead with EDF(SRP)) − 1

¹ Most vendors report only a single switching time; we therefore used this time for both t_wu and t_sd.
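Both evaluation metrics are simple ratios; a sketch, where the 59.3% figure is the mean EASD overhead reported later in this section and the other numbers are illustrative:

```python
# Normalized energy savings (Equation (6)) and relative scheduling overhead
# as straightforward ratios. Input values here are illustrative.

def normalized_energy_savings(e_dpm, e_no_dpm):
    """1 minus the energy under DPM relative to no-DPM energy."""
    return 1.0 - e_dpm / e_no_dpm

def relative_overhead(o_alg, o_edf_srp):
    """Extra scheduling cost relative to plain EDF(SRP)."""
    return o_alg / o_edf_srp - 1.0

print(normalized_energy_savings(40.0, 100.0))  # 0.6
# An EDF(SRP) scheduler spending 1 of every 1000 time units on scheduling,
# combined with EASD's 59.3% relative overhead, spends 1.593 of every 1000:
print(1.0 * (1.0 + 0.593))                     # 1.593
```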

Figure 7. Normalized energy savings with multiple devices: (a) mean normalized energy savings for different system utilization settings for task sets with periods in [50, 200]; (b) for task sets with periods in [200, 2000]; (c) for task sets with periods in [2000, 8000].

The mean value of the relative scheduling overhead of CEA-EDF is 3.1%, verifying that CEA-EDF is a low-overhead algorithm. The mean value of the relative scheduling overhead of EASD is 59.3%. Considering that the scheduling overhead of EDF(SRP) is very low, a relative overhead of 59.3% is affordable. For example, if a system spends 1 time unit out of every 1000 performing scheduling with EDF(SRP), then it will spend only about 1.6 time units out of every 1000 performing scheduling with EASD. Therefore, the scheduling overhead of EASD is very low with respect to the whole system.

5.2. Average energy savings

In this experiment, we measured the overall performance of CEA-EDF and EASD. Task periods were chosen from three groups: [50, 200], [200, 2000], and [2000, 8000], which represent the short-period, mid-period, and long-period groups, respectively. The intention of experimenting with different ranges of task periods is to evaluate the relation between energy savings and the ratio of task periods to device state transition times. Within each group, task periods and WCETs were randomly selected such that the resulting task sets are schedulable according to Theorem 4.1.
We first focus on the relationship between normalized energy savings and the system utilization, which is the sum of the worst case utilizations of all tasks. In this experiment, we set the best/worst case execution time ratio to 1. Figure 7 shows the mean normalized energy savings of CEA-EDF and EASD under different system utilizations. On average, EASD saves more energy than CEA-EDF. In most cases, as the system utilization increases, the normalized energy savings decrease; the rationale is that as tasks execute more, the amount of time devices can be kept in the idle state decreases. It can also be seen from the figure that the performance of EASD is comparable to the LOW-BOUND.

Figure 8. Reclaiming unused WCETs to save energy: (a) normalized energy savings for various ratios of the best case execution time to the worst case execution time; (b) the same comparison including MDO; note that there are no shared resources in this experiment, since MDO does not address the issue of resource blocking.

An important, albeit intuitive, finding is that the ratio of device state transition times to task periods greatly affects the energy savings. Both algorithms perform worst in the experiment with short periods, as shown in Figure 7(a). This is consistent with our expectations: if a device λ_k takes a long time to perform a state transition and is used in a system in which tasks have very short periods, then λ_k has little chance to be switched to the idle state. Furthermore, the performance of EASD is close to optimal when tasks are in the long-period group, as shown in Figure 7(c). It can be seen that EASD is more sensitive to task periods than CEA-EDF. This is also consistent with our discussion of device slack in Appendix A, since device slack is closely related to task periods: at the same system utilization, the mean device slack in a system with longer periods should be larger than in a system with shorter periods.

5.3. Reclaiming unused WCETs to save energy

In practice, actual job execution times can be less than their WCETs, and the unused portion can be reclaimed to save energy. First, we evaluate the ability of CEA-EDF and EASD to save energy by dynamically reclaiming the slack coming from unused WCETs.
Figure 8(a) shows the normalized energy savings of CEA-EDF and EASD for increasing best/worst case execution time ratios. In this experiment, the system utilization is set between 90% and 100%, and task periods are chosen from the mid-period group, i.e., [200, 2000]. As in the first experiment, critical sections of all jobs were randomly generated. As shown in Figure 8(a), both CEA-EDF and EASD save more energy when the best/worst case execution time ratio is smaller, showing that both algorithms can

dynamically reclaim unused WCETs to save energy. Moreover, the difference between EASD and LOW-BOUND remains almost unchanged across different best/worst case execution time ratios, which means that EASD is able to fully reclaim the slack created by unused WCETs. Similar results were obtained for the short-period and long-period groups, and are therefore omitted here. In addition, we compare the energy savings of CEA-EDF and EASD with MDO, which is the only published energy-aware device scheduling algorithm for preemptive schedules. This comparison is intended to evaluate the advantage of online algorithms (CEA-EDF and EASD) over an offline-only algorithm (MDO) in utilizing unused WCETs to save energy. As discussed in Section 2, MDO cannot support shared non-preemptive resources; therefore, no critical sections were generated for any job in this experiment. That is, the task sets used in this experiment are fully preemptive. As shown in Figure 8(b), MDO achieves only an additional 0.74% energy savings over EASD when job execution times are equal to their WCETs, i.e., when the runtime job execution is exactly as computed by MDO in the offline phase. However, the energy savings of MDO do not increase when the best/worst case execution time ratio decreases, because MDO does not utilize unused WCETs to save energy [4]. It can be seen from Figure 8(b) that MDO even saves less energy than CEA-EDF when the best/worst case execution time ratio is small.

5.4. Comparison of CEA-EDF and EASD

The last experiment compares the energy savings of EASD to those of CEA-EDF. We use the normalized additional energy savings to evaluate the additional energy savings of the EASD algorithm. The normalized additional energy savings is the amount of energy saved under the EASD algorithm relative to the CEA-EDF algorithm, computed using Equation (7).
    Normalized Additional Energy Savings = 1 − (Energy with EASD) / (Energy with CEA-EDF)    (7)

In this experiment, task periods are chosen from the mid-period group, i.e., [200, 2000], and the best/worst case execution time ratio is set to 1. The distribution of normalized additional energy savings over three ranges of system utilization is presented in Figure 9. The results are consistent with the previous experiments. CEA-EDF performs well when the system workload is low: when the system utilization is less than 10%, CEA-EDF performs almost the same as EASD, and when the system utilization is less than 60%, CEA-EDF performs close to EASD. There are a few instances in which the EASD algorithm actually results in more energy being consumed than the CEA-EDF algorithm. This is because EASD tries to reduce the time that devices spend in the active state, but in doing so it causes more device state switches. A remarkable result from these experiments is that CEA-EDF performs well, on average, compared to EASD

Figure 9. Comparison of CEA-EDF and EASD: (a) the distribution of normalized additional energy savings with system utilization of 0-10%; (b) with system utilization of 40-50%; (c) in the highest utilization range. The X axis represents the normalized additional energy savings of EASD over CEA-EDF; the Y axis represents the percentage of simulations attaining each level of normalized additional energy savings.

when the system workload is low. Even in cases where the system utilization is near 100%, the CEA-EDF algorithm can still achieve nearly 40% energy savings for I/O devices. Moreover, CEA-EDF can be used together with an energy-aware processor scheduler without any modification, because CEA-EDF has no influence on processor scheduling while offering an excellent performance/cost ratio. We will investigate the integration of CEA-EDF with energy-aware processor scheduling in future research.

6 Conclusion

Two hard real-time scheduling algorithms were presented for conserving energy in device subsystems. Both algorithms support the preemptive scheduling of periodic tasks with non-preemptive shared resources. The CEA-EDF algorithm, though a relatively simple extension to EDF scheduling, provides remarkable power savings when the system workload is low. On the other hand, EASD can produce more energy savings than CEA-EDF. Ultimately, the choice of which energy saving algorithm to use, if any, depends on the temporal parameters of the task set and the devices utilized.
Although power management of the processor is not addressed in this paper, our work can be applied to reduce the leakage power consumption of the processor, which is expected to become an increasingly large fraction of processor energy consumption. Leakage power is reduced by disabling all or part of the processor whenever possible. Therefore, by treating the CPU as a shared device required by all tasks, our algorithms can be applied without any modification. In general, CEA-EDF and EASD do not produce the minimum-energy schedule when multiple devices are

shared. The problem of finding a feasible schedule that consumes minimum I/O device energy is NP-hard. Hence, our focus was not to find the optimal solution but to create algorithms that reduce the energy consumption of multiple shared devices and that can be executed online to adapt to the workload. This work provides the foundation for a family of general, online energy saving algorithms that can be applied to systems with hard temporal constraints.

References

[1] Advanced Configuration & Power Interface Specification, August 2003.
[2] Baker, T.P., Stack-Based Scheduling of Real-Time Processes, Real-Time Systems, 3(1):67-99, March 1991.
[3] Benini, L., Bogliolo, A., and De Micheli, G., A Survey of Design Techniques for System-Level Dynamic Power Management, IEEE Transactions on VLSI Systems, vol. 8, June 2000.
[4] Chakrabarty, K., correspondence with the author of the MDO algorithm, May.
[5] Golding, R.A., Bosch, P., Staelin, C., Sullivan, T., and Wilkes, J., Idleness Is Not Sloth, Proceedings of the Winter USENIX Conference, 1995.
[6] Liu, C.L., and Layland, J.W., Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment, Journal of the ACM, 20(1), January 1973.
[7] Liu, J., Real-Time Systems, Prentice Hall, 2000.
[8] Lu, Y.H., and Benini, L., Power-Aware Operating Systems for Interactive Systems, IEEE Transactions on Very Large Scale Integration Systems, 10(2), April 2002.
[9] Lu, Y.H., Benini, L., and De Micheli, G., Operating-System Directed Power Reduction, International Symposium on Low Power Electronics and Design, 2000.
[10] Lu, Y.H., Benini, L., and De Micheli, G., Requester-Aware Power Reduction, International Symposium on System Synthesis, Stanford University, pages 18-23, September 2000.
[11] MaxStream 9XStream 900 MHz wireless OEM module, xstreammanual.pdf.
[12] Microsoft OnNow power management architecture, whdc/hwdev/tech/onnow/onnowapp Print.mspx.
[13] Realtek ISA full-duplex Ethernet controller RTL8019AS, ftp:// /cn/nic/rtl8019as/spec-8019as.zip.
[14] Sha, L., Rajkumar, R., and Lehoczky, J.P., Priority Inheritance Protocols: An Approach to Real-Time Synchronization, IEEE Transactions on Computers, 1990.
[15] SimpleTech compact flash card, prox.php.
[16] SST multi-purpose flash SST39LF020, datasheet/s71150.pdf.
[17] IBM Microdrive, DSCM F532791CA062C38F87256AC00060DD49 /file/ibm md datasheet.pdf.
[18] Swaminathan, V., Chakrabarty, K., and Iyengar, S.S., Dynamic I/O Power Management for Hard Real-Time Systems, Proceedings of the Ninth International Symposium on Hardware/Software Codesign, April 2001, Copenhagen, Denmark.
[19] Swaminathan, V., and Chakrabarty, K., Pruning-Based Energy-Optimal Device Scheduling for Hard Real-Time Systems, Proceedings of the Tenth International Symposium on Hardware/Software Codesign, 2002.
[20] Swaminathan, V., and Chakrabarty, K., Energy-Conscious, Deterministic I/O Device Scheduling in Hard Real-Time Systems, IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems, vol. 22, July 2003.

[21] Swaminathan, V., and Chakrabarty, K., Pruning-Based, Energy-Optimal, Deterministic I/O Device Scheduling for Hard Real-Time Systems, ACM Transactions on Embedded Computing Systems, 4(1), February 2005.
[22] Tia, T.S., Utilizing Slack Time for Aperiodic and Sporadic Requests Scheduling in Real-Time Systems, Ph.D. thesis, University of Illinois at Urbana-Champaign, Department of Computer Science, 1995.
[23] Weiser, M., Welch, B., Demers, A.J., and Shenker, S., Scheduling for Reduced CPU Energy, Operating Systems Design and Implementation, 1994.

A Appendix

In this appendix, we provide the algorithms used to compute the device access delay and the device dependent system slack. To reduce computational overhead, the original priorities of all jobs are computed offline. We assume that there are N jobs in the hyperperiod and that all jobs in a hyperperiod are ordered by their priorities: Org_Prio(J_1) > Org_Prio(J_2) > Org_Prio(J_3) > ... > Org_Prio(J_N). Without loss of generality, suppose that the current job of each task T_i at time t is J_ci, whose absolute deadline is denoted D(J_ci). Jobs J_c1 and J_cn have the earliest deadline D(J_c1) and the latest deadline D(J_cn) among all current jobs, respectively. In Appendix A.1 and Appendix A.2, we show that updating the device access delay and the device dependent system slack can be done by examining all jobs with priorities between Org_Prio(J_c1) and Org_Prio(J_cn). J_c1 and J_cn change at runtime, so the set of jobs included in the computations behaves like a sliding window, called the computation window hereafter. For example, suppose a system consists of two tasks, T_1 = {10, 2, ∅, ∅} and T_2 = {24, 2, ∅, ∅}; then the computation window at time 0 is {J_{1,1}, J_{1,2}, J_{2,1}}. When J_{1,1} completes at time 2, the computation window becomes {J_{1,2}, J_{2,1}}.

A.1. Device access delay

As defined in Definition 4.4, the device access delay for a device λ_k is the time during which jobs requiring λ_k cannot execute because of the execution of higher priority jobs that do not need λ_k. The computation requires knowledge of actual job execution times, which are unknown a priori; therefore, WCETs are used in the computation as an approximation. The device access delay for all devices at time t can be computed by constructing the schedule of all current jobs from time t using their WCETs. The device access delay for a device λ_k at time t, i.e., DevAccessDelay(λ_k, t), can then be read from that schedule.
For example, if there is at least one uncompleted job requiring λ_k that was released at or before time t, then DevAccessDelay(λ_k, t) = t′ − t, where t′ is the first time instant after t at which any job requiring λ_k occupies the CPU in the schedule. However, computing the device access delay in this way incurs significant overhead. To reduce the computational overhead, we adopt a simplified algorithm in which we only consider higher priority jobs that have been released and cannot be blocked by any current job requiring λ_k. The pseudocode for the simplified algorithm to compute the device access delay for a device λ_k is shown in Figure 10. An optimized algorithm with lower computational complexity is shown in Figure 11.
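The simplified computation reduces to summing the remaining WCETs of released jobs whose priority exceeds that of the first device request job; a sketch, in which the job records and the priority encoding (larger number means higher priority) are illustrative:

```python
# Sketch of the simplified device-access-delay idea: every released job with
# priority above the first device request job's priority contributes its
# remaining WCET to the delay. Job records are illustrative.

def dev_access_delay(jobs, first_req_prio, t):
    """jobs: list of dicts with 'prio', 'release', 'wcet', 'executed'."""
    delay = 0
    for j in jobs:
        if j["prio"] > first_req_prio and j["release"] <= t:
            delay += j["wcet"] - j["executed"]  # remaining WCET delays λ_k's jobs
    return delay

jobs = [
    {"prio": 3, "release": 0, "wcet": 4, "executed": 1},  # higher priority, released
    {"prio": 2, "release": 6, "wcet": 4, "executed": 0},  # not yet released at t=5
    {"prio": 1, "release": 0, "wcet": 4, "executed": 0},  # the job requiring λ_k
]
print(dev_access_delay(jobs, first_req_prio=1, t=5))  # 3
```

Only the first job counts: the second has not been released by t = 5, and the third is the device-requesting job itself.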

1  Function DevAccessDelay(λ_k, t)
2    Output: (1) the device access delay for λ_k, i.e., DevAccessDelay(λ_k, t); (2) the first device request job, i.e., J_{λ_k,t}.
3    sum ← 0;   // records the device access delay for device λ_k
4    // α denotes the set of jobs that require device λ_k
5    J_x ← null;   // the job with the highest priority among all jobs in α
6    J_y ← null;   // the job with the highest priority among all jobs not in α that can be blocked by some job in α
7    ∀J_i, Org_Prio(J_cn) ≤ Org_Prio(J_i) ≤ Org_Prio(J_c1)   // examine each job within the computation window
8      If (λ_k ∈ Dev(J_i) and Org_Prio(J_i) > Org_Prio(J_x))
9        J_x ← J_i;
10       // σ(λ_k, t) denotes the highest preemption ceiling of resources being held by any job requiring λ_k
11     Else If (λ_k ∉ Dev(J_i) and Org_Prio(J_i) > Org_Prio(J_y) and Res(J_i) ≠ ∅ and PL(J_i) ≤ σ(λ_k, t) and et(J_i, [0, t]) = 0)
12       J_y ← J_i;
13     End
14   End
15   J_{λ_k,t} ← J_i, J_i ∈ {J_x, J_y} and Org_Prio(J_i) = max(Org_Prio(J_x), Org_Prio(J_y));
16   ∀J_i, Org_Prio(J_cn) ≤ Org_Prio(J_i) ≤ Org_Prio(J_c1)
17     If (Org_Prio(J_i) > Org_Prio(J_{λ_k,t}) and R(J_i) ≤ t)
18       sum ← sum + wcet(J_i) − et(J_i, [0, t]);
19     End
20   End
21   DevAccessDelay(λ_k, t) ← sum;
22 End

Figure 10. The pseudocode for the simplified DevAccessDelay algorithm. The algorithm is reformatted for readability. The optimized algorithm that reduces computational overhead is shown in Figure 11.

The DevAccessDelay algorithm works as follows. Suppose α is the set of jobs that require λ_k, J_x is the job with the highest priority in α, and J_y is the highest priority job among all jobs that can possibly be blocked by some job(s) in α.
Then the device access delay for device λ_k consists of the remaining WCETs of all jobs J_i, Org_Prio(J_cn) ≤ Org_Prio(J_i) ≤ Org_Prio(J_c1), satisfying the following two conditions: (1) Org_Prio(J_i) > Org_Prio(J_x) and Org_Prio(J_i) > Org_Prio(J_y); and (2) the release time of J_i is less than or equal to the current time t, which ensures that J_i can occupy the CPU before any job in α. In the DevAccessDelay algorithm, the computation is done by examining all jobs that have higher priorities than J_x and J_y. Intuitively, any job that has a higher priority than both J_x and J_y can delay the execution of any job requiring device λ_k once it is released. In the remainder of this paper, the term first device request job refers to the higher-priority job of J_x and J_y, defined as follows.

Definition A.1. First device request job. The first device request job of a device λ_k at time t, denoted J_{λ_k,t}, is computed by

    J_{λ_k,t} = J_x if Org_Prio(J_x) ≥ Org_Prio(J_y); J_y if Org_Prio(J_x) < Org_Prio(J_y),    (8)

where J_x is the job with the highest priority of all current jobs requiring device λ_k, and J_y is the job with the

highest priority of all jobs that (1) have not started execution and do not require device λ_k; (2) require shared resources; and (3) have preemption levels lower than or equal to the preemption ceiling of some resource being held by a job requiring device λ_k.

1  Function DevAccessDelay()
2    // Update the device access delay and the first device request job for every device λ_k, 1 ≤ k ≤ K
3    sum ← 0;   // records the device access delay
4    ToCompute ← ~(0);   // ToCompute is a bit-array indicating the devices whose computation is not yet completed
5    HoldingResJob ← Head(HoldingResJobList);   // HoldingResJobList is a list of jobs that are holding resources
6    ∀J_i, J_i ∈ J_c1 : J_cn   // browse the computation window from the highest priority job to the lowest priority job
7      D ← ToCompute & DevBits(J_i);   // D is the set of devices that still need to be computed and are required by J_i
8      ∀λ_k, λ_k ∈ D
9        J_{λ_k,t} ← J_i;   // J_i is the first device request job for λ_k
10       DevAccessDelay(λ_k, t) ← sum;
11       ToCompute ← ToCompute & (~(1 << (k − 1)));
12     End
13     If (Res(J_i) ≠ ∅ and et(J_i, [0, t]) = 0)
14       While (PL(J_i) ≤ σ(HoldingResJob))   // σ(J_x) is the highest preemption ceiling of resources being held by J_x
15         D ← ToCompute & DevBits(HoldingResJob);
16         ∀λ_k, λ_k ∈ D
17           J_{λ_k,t} ← J_i;
18           DevAccessDelay(λ_k, t) ← sum;
19           ToCompute ← ToCompute & (~(1 << (k − 1)));
20         End
21         HoldingResJob ← Next(HoldingResJob);
22       End
23     End
24     If (ToCompute > 0 and R(J_i) ≤ t)
25       sum ← sum + wcet(J_i) − et(J_i, [0, t]);
26     Else If (ToCompute = 0)
27       break;   // break the loop
28     End
29   End

Figure 11. The pseudocode for the optimized DevAccessDelay algorithm.

The DevAccessDelay algorithm

Figure 10 shows the simplified algorithm for computing the device access delay of one device at time t. In practice, the device access delay for all devices is computed over the same computation window, so the computational complexity can be reduced by combining common computations.
Figure 11 shows the algorithm that updates the device access delay for all devices at time t. In this algorithm, three data structures are used to facilitate the computation:

1. ToCompute. ToCompute is a bit-array representing the devices whose device access delay still needs to be computed. For example, suppose the total number of devices is 8. The initial value of ToCompute is set to 11111111 (line 4), indicating that the device access delay for all devices needs to be computed.

2. DevBits. DevBits is a bit-array representing the devices required by each job. For example, suppose job J_1 requires devices λ_1, λ_3 and λ_4; then DevBits(J_1) is 00001101.

3. HoldingResJobList. HoldingResJobList is a list of jobs that are holding resources at time t. HoldingResJob is initialized to the first job in HoldingResJobList (line 5), and σ(HoldingResJob) denotes the highest preemption level ceiling of the resources being held by HoldingResJob (line 14). Suppose the jobs in HoldingResJobList are J_b1, J_b2, ..., J_bm, with σ(J_b1) > σ(J_b2) > ... > σ(J_bm). Then a job J_i that requires some resources can start its execution only when its preemption level is higher than σ(J_b1). If a resource r_i is allocated to J_i at time t, then σ(J_i) > σ(J_b1) if J_i ≠ J_b1, so J_i is placed at the head of HoldingResJobList at time t. With SRP, HoldingResJobList works like a stack: a new job always joins HoldingResJobList at the head of the list, and the job that releases a resource must also be the first job of the list. Therefore, jobs join and leave the list in FILO (First In Last Out) order. For a system of n tasks, the length of the list is at most n, and the runtime maintenance of HoldingResJobList takes O(1) time.

The computation is done by examining each job J_i in the computation window (line 6) from the highest priority job to the lowest priority job. For any device λ_k that is required by J_i and whose computation is not yet completed (line 7), J_i is the first device request job of λ_k (lines 8-12), according to Definition A.1. If J_i can be blocked by some jobs requiring devices (lines 13-14) and the computations for these devices are not completed, then J_i is the first device request job of these devices (lines 16-20), according to Definition A.1. HoldingResJob is then advanced to the next job in HoldingResJobList (line 21).
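The ToCompute and DevBits bookkeeping above (lines 4, 7, and 11 of Figure 11) is ordinary bitmask manipulation; the following Python fragment is purely illustrative, with hypothetical device indices, and is not taken from the authors' implementation.

```python
K = 8  # total number of devices

# Line 4 of Figure 11: initially every device still needs its access
# delay computed; for K = 8 this is 0b11111111.
to_compute = (1 << K) - 1

# DevBits(J_i): one bit per required device, bit (k - 1) for device k.
# Suppose a (hypothetical) job requires devices 1, 3, and 4.
dev_bits = (1 << 0) | (1 << 2) | (1 << 3)  # 0b00001101

# Line 7: devices required by the job AND still uncomputed.
d = to_compute & dev_bits

# Line 11: mark each device in d as computed by clearing its bit.
for k in range(1, K + 1):
    if d & (1 << (k - 1)):
        to_compute &= ~(1 << (k - 1))

print(bin(to_compute))  # devices 1, 3 and 4 cleared
```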
sum records the cumulative unused worst case execution time of all released jobs (line 25). Once the computations for all devices are done (line 26), the DevAccessDelay algorithm is completed (line 27). Note that only a lower priority job can block a higher priority job. However, the DevAccessDelay algorithm does not compare job priorities (line 14), because if HoldingResJob has a higher priority than J_i, then the device set D (line 15) must be empty at this point. The worst case computational complexity of this algorithm is O(m + n + K), where m is the number of jobs in the computation window, n is the total number of tasks in the system, and K is the total number of devices: lines 7, 13 and 24-28 execute at most m times; lines 8-12 and lines 16-20 execute at most K times, since each execution of these lines completes the computation for at least one device; and lines 14, 15, 21 and 22 execute at most n times, since the maximum length of HoldingResJobList is n.
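The main delay-accumulation loop of Figure 11 can be sketched as follows. This is a minimal sketch that deliberately omits the resource-blocking branch (lines 13-22); the job fields and helper names are ours, not the authors'.

```python
from dataclasses import dataclass

@dataclass
class Job:
    wcet: int        # worst case execution time
    executed: int    # et(J_i, [0, t])
    released: bool   # R(J_i) <= t
    dev_bits: int    # DevBits(J_i)

def dev_access_delay(window, K):
    """window: jobs ordered from highest to lowest priority.
    Returns (delay, first_job) per device index, following lines 6-12
    and 24-28 of Figure 11 (resource blocking omitted)."""
    delay, first_job = {}, {}
    to_compute = (1 << K) - 1
    total = 0  # 'sum' in the pseudocode
    for i, job in enumerate(window):
        d = to_compute & job.dev_bits
        for k in range(1, K + 1):
            if d & (1 << (k - 1)):
                first_job[k] = i          # first device request job
                delay[k] = total          # DevAccessDelay(λ_k, t)
                to_compute &= ~(1 << (k - 1))
        if to_compute == 0:
            break                         # all devices computed
        if job.released:
            total += job.wcet - job.executed  # unused WCET of J_i

    return delay, first_job

# Hypothetical two-job window: the higher priority job requires device 1
# (wcet 3, 1 unit already executed), the lower requires device 2.
window = [Job(wcet=3, executed=1, released=True, dev_bits=0b01),
          Job(wcet=4, executed=0, released=True, dev_bits=0b10)]
delays, first_jobs = dev_access_delay(window, K=2)
```

Device 2 inherits a delay of 2 time units, the remaining WCET of the higher priority job, matching the role of sum at line 25.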

A.2. Device dependent system slack

Before presenting the algorithm to compute the device dependent system slack, we first introduce several concepts used in the computation.

Definition A.2. Initial Job Slack. The initial slack of a job J_k, at time t = 0, is denoted JobSlack(J_k, 0) and is computed by subtracting from the total time available to execute J_k the total time required to execute J_k and all periodic requests with higher priority, together with the maximum blocking duration for J_k. That is, the slack of a job J_k at t = 0 is given by

JobSlack(J_k, 0) = D(J_k) − Σ_{D(J_i) ≤ D(J_k)} wcet(J_i) − B(J_k)    (9)

where D(J_k) is the absolute deadline of J_k, and B(J_k) is the maximal blocking time for J_k, as discussed in Section 4.1.

Definition A.3. Job Slack. The slack of a job J_k at time t, t > 0, is denoted JobSlack(J_k, t). JobSlack(J_k, t) decreases as it is consumed by CPU idling and by the execution of lower priority jobs, and increases as jobs complete sooner than their WCETs. That is, the job slack of a job J_k at time t is given by

JobSlack(J_k, t) = JobSlack(J_k, 0) − Idle(0, t) − Σ_{D(J_i) > D(J_k), R(J_i) < t} et(J_i, [0, t]) + Σ_{D(J_i) ≤ D(J_k)} U_rem(J_i)    (10)

where R(J_i) is the release time of job J_i; Idle(0, t) is the amount of time the CPU has been idle up to t; and Σ_{D(J_i) > D(J_k), R(J_i) < t} et(J_i, [0, t]) is the amount of time that jobs with deadlines greater than D(J_k) have executed up to t (such jobs must have been released before t). Thus Idle(0, t) + Σ_{D(J_i) > D(J_k), R(J_i) < t} et(J_i, [0, t]) is the total amount of slack consumed up to t. Σ_{D(J_i) ≤ D(J_k)} U_rem(J_i) is the amount of unused WCET of completed jobs with deadlines equal to or less than D(J_k), which is reclaimed as job slack for J_k. Intuitively, the job slack of a job J_i at time t is the maximum amount of time that the CPU can be idle at time t without causing J_i itself to miss its deadline.
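Equations (9) and (10) can be exercised on a toy job; the numbers below are illustrative only and are not taken from the paper's examples.

```python
def initial_job_slack(deadline, wcets_at_or_before, blocking):
    """Equation (9): JobSlack(J_k, 0) = D(J_k) minus the sum of wcet(J_i)
    over all jobs with D(J_i) <= D(J_k), minus the blocking B(J_k)."""
    return deadline - sum(wcets_at_or_before) - blocking

def job_slack(initial, idle, later_deadline_exec, reclaimed):
    """Equation (10): slack is consumed by CPU idling and by execution of
    jobs with later deadlines, and replenished by unused WCETs of
    completed jobs with deadlines <= D(J_k)."""
    return initial - idle - later_deadline_exec + reclaimed

# A hypothetical job with deadline 20, WCET demand {3, 4} at or before
# its deadline, and maximal blocking 2:
s0 = initial_job_slack(20, [3, 4], 2)                 # 20 - 7 - 2
# After 5 units of idling, 1 unit of later-deadline execution, and 2
# units of reclaimed unused WCET:
st = job_slack(s0, idle=5, later_deadline_exec=1, reclaimed=2)
```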
Suppose β = {J_j | Org_Prio(J_j) ≤ Org_Prio(J_i)}. Then minslack = min(JobSlack(J_j, t)), J_j ∈ β, is the maximum amount of time that the CPU can be idle at time t without causing any job with priority equivalent to or lower than Org_Prio(J_i) to miss its deadline. If the idle time is inserted only before the execution of the jobs in β, without delaying the execution of higher priority jobs, then minslack becomes the maximum amount of time that the CPU can be idle before the execution of any job in β without causing any job to miss its deadline. As shown in Figure 5(a), JobSlack(J_2,1, 0) = JobSlack(J_1,3, 0) = 14. If an idle interval of 14 time units is inserted at time 0, i.e., the CPU is idle during [0, 14], then J_1,1 will miss its deadline. However, if J_1,1 preempts the idle interval as shown in Figure 5(a), then every job meets its deadline, and the total idle time before the execution of J_2,1 and J_1,3 is still 14 time units.

However, the above discussion holds only in the absence of resource blocking. If a job J_j ∈ β is holding a shared resource and thus may block a higher priority job J_y ∉ β, then an idle interval inserted before the execution of J_j can delay the execution of J_y and of all jobs with priority lower than J_y. In this case, min(JobSlack(J_j, t)), ∀J_j, Org_Prio(J_j) ≤ Org_Prio(J_y), is the maximum amount of time that the CPU can be idle before the execution of any job in β without causing any job to miss its deadline.

Figure 12. λ ∈ Dev(J_3); t_sd(λ) = t_wu(λ) = 4; D(J_1) = 10; D(J_2) = 16; D(J_3) = 24. At time 4, JobSlack(J_2, 4) = 4 and JobSlack(J_3, 4) = 8. The shaded regions represent shared resource access. In this example, if the CPU is idle for 8 time units before the execution of J_3, then J_2 misses its deadline because it is blocked by J_3. Therefore, J_{λ,4} = J_2 and DevDepSysSlack(λ, 4) = JobSlack(J_2, 4) = 4.

As shown in Figure 12, J_3 can block J_2, and thus the execution of J_2 depends on the execution of J_3. As a result, the maximum amount of time that the CPU can be idle before the execution of J_3 is the job slack of J_2. Note that EASD schedules jobs according to EDF and SRP. If there is a released higher priority job in the system, a lower priority job can execute only if it is the current ceiling task and blocks all released higher priority jobs. In other words, the EASD algorithm does not allow low priority jobs to utilize the idle intervals caused by device transitions for high priority jobs. The reason is that allowing job executions to be reordered may cause unexpected blocking and thus jeopardize temporal correctness; for example, a job could be blocked more than once with job reordering.
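The Figure 12 example reduces to taking a minimum over the job slacks of the first device request job and all lower priority jobs (Equation (11)); a minimal sketch using the figure's numbers, where the helper name is ours:

```python
def dev_dep_sys_slack(job_slacks, first_req_index):
    """Equation (11): the minimum JobSlack(J_x, t) over all jobs with
    priority equivalent to or lower than that of the first device
    request job. job_slacks is ordered from highest to lowest priority,
    and first_req_index locates the first device request job in it."""
    return min(job_slacks[first_req_index:])

# Figure 12, at time t = 4: JobSlack(J_2, 4) = 4, JobSlack(J_3, 4) = 8,
# and J_2 is the first device request job for λ since J_3 can block it.
slack = dev_dep_sys_slack([4, 8], first_req_index=0)
```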
Recall that the device dependent system slack is defined to be the maximum amount of time that the CPU can be idle before the execution of any job requiring device λ_k without causing any job to miss its deadline. It is now clear that the device dependent system slack is the maximum amount of idle time that can be inserted before the execution of J_{λ_k,t} and all lower priority jobs, which is given by

DevDepSysSlack(λ_k, t) = min(JobSlack(J_x, t)), ∀J_x, Org_Prio(J_x) ≤ Org_Prio(J_{λ_k,t})    (11)

To compute DevDepSysSlack(λ_k, t) for each device, we need to compute the job slack of all uncompleted jobs. The complexity of this computation would be O(N), where N is the number of jobs in a hyperperiod. To reduce the computational overhead, the initial slack of all jobs in a hyperperiod is computed offline and kept in a job slack list ordered by deadlines. Let MinInitSlack(J_k) denote the minimal initial job slack over all jobs with original priorities equivalent to or lower than Org_Prio(J_k), which is given by

MinInitSlack(J_k) = min(JobSlack(J_x, 0)), ∀J_x, Org_Prio(J_x) ≤ Org_Prio(J_k)

Then the minimum job slack of J_cn and all jobs with priorities lower than Org_Prio(J_cn) is MinInitSlack(J_cn) + Σ U_rem(J_i) − Idle(0, t), where J_i ranges over the jobs completed at or before time t. Note that J_cn has a deadline no less than that of any completed job. In this way, the computation of the device dependent system slack can be done by examining each job in the computation window. The simplified algorithm for the computation of the device dependent system slack is presented in Figure 13.

 1  Function DevDepSysSlack(λ_k, t)
 2  Output: the device dependent system slack for λ_k at time t.
 3  MinSlack ← +∞;  // records the minimal dynamic job slack
 4  ∀ J_x, Org_Prio(J_c1) ≥ Org_Prio(J_x) ≥ Org_Prio(J_cn)  // each job in the computation window
 5      JobSlack(J_x, t) ← JobSlack(J_x, 0) − Idle(0, t) − Σ_{D(J_i) > D(J_x), R(J_i) < t} et(J_i, [0, t]) + Σ_{D(J_i) ≤ D(J_x)} U_rem(J_i);
 6      If (Org_Prio(J_x) ≤ Org_Prio(J_{λ_k,t}) and JobSlack(J_x, t) < MinSlack)
 7          MinSlack ← JobSlack(J_x, t);
 8      End
 9  End
10  DevDepSysSlack(λ_k, t) ← Min(MinSlack, MinInitSlack(J_cn) + Σ_{completed J_i} U_rem(J_i) − Idle(0, t));

Figure 13. The pseudocode for the simplified DevDepSysSlack algorithm. An algorithm with lower computational complexity is presented in Figure 15.
A detailed algorithm is presented next.

The DevDepSysSlack algorithm. We are now ready to describe the DevDepSysSlack algorithm. As discussed before, our method involves an off-line phase, in which the initial slack of all jobs in a hyperperiod is computed and kept in a job slack list ordered by priorities. Each entry of the job slack list contains a job's ID, say J_k; the corresponding initial job slack, JobSlack(J_k, 0); and MinInitSlack(J_k). An example of the job slack list of a task set is shown in Figure 14.
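The offline precomputation of MinInitSlack is a running minimum taken over the priority-ordered job slack list, i.e., a suffix minimum; a minimal sketch, with hypothetical slack values:

```python
def min_init_slacks(initial_slacks):
    """Given JobSlack(J_x, 0) for the jobs of a hyperperiod, ordered from
    highest to lowest priority, return MinInitSlack for each entry: the
    minimum initial slack over that job and all lower priority jobs."""
    out = [0] * len(initial_slacks)
    running = float("inf")
    # Walk from the lowest priority job upward, keeping a running minimum.
    for i in range(len(initial_slacks) - 1, -1, -1):
        running = min(running, initial_slacks[i])
        out[i] = running
    return out

# Hypothetical initial slacks, highest priority first:
mins = min_init_slacks([7, 5, 9, 6])
```

Storing this suffix minimum with each list entry is what lets the online algorithm summarize all jobs below the computation window in O(1).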

Figure 14. An example of a job slack list: J_1,1, J_2,1, J_3,1, J_1,2, J_2,2, J_1,3, J_3,2, J_1,4, J_2,3, J_1,5, J_3,3, J_2,4, J_1,6. Jobs are ordered by priorities. T_1 = {10, 3, Dev(T_1), Res(T_1)}; T_2 = {15, 4, Dev(T_2), Res(T_2)}; T_3 = {20, 5, Dev(T_3), Res(T_3)}.

The DevDepSysSlack algorithm contains two parts: (1) update the job slack of all jobs in the computation window; and (2) update the device dependent system slack of all devices. The algorithm is invoked at the instants when the currently executing job is completed (line 7) or is preempted (line 29). After the job slack of all jobs in the computation window has been updated, the device dependent system slack of all devices can be acquired (line 39), according to Equation (11). The computation of job slack is done by examining each job in the computation window from the lowest priority job to the highest priority job (lines 16, 30). Thus the computational complexity of updating the job slack of all jobs in the computation window is O(m), where m is the number of jobs in the computation window. The computational complexity of updating the device dependent system slack of all devices is O(K), where K is the total number of devices. Therefore, the computational complexity of the DevDepSysSlack algorithm is O(m + K).

 1  Function DevDepSysSlack()
 2  Initialize at time t = 0:
 3      c ← 0;          // c records the cumulative unused WCET of all completed jobs
 4      t′ ← 0;         // t′ is the last instance at which the algorithm was invoked
 5      MinSlack ← +∞;  // minimum job slack of all jobs examined so far
 6  Update dynamic job slack at time t:
 7  If (t is an instance at which job J_i,j is completed)
 8      c ← c + wcet(J_i,j) − et(J_i,j, [0, t]);
 9      If (J_i,j = J_c1)  // need to update J_c1
10          J_c1 ← the second job in the computation window;
11      End
12      If (Org_Prio(J_i,j+1) < Org_Prio(J_cn))  // need to update J_cn
13          J′_cn ← J_cn;  // J′_cn records the old J_cn
14          J_cn ← J_i,j+1;
15      End
16      ∀ J_x, J_x ∈ J_cn : J_c1  // browse the computation window from lowest to highest priority
17          If (D(J_x) < D(J_i,j))  // the execution time of J_i,j is not included in the initial job slack of J_x
18              JobSlack(J_x, t) ← JobSlack(J_x, t′) − et(J_i,j, [t′, t]);
19          Else If (D(J_i,j) ≤ D(J_x) ≤ D(J′_cn))
20              JobSlack(J_x, t) ← JobSlack(J_x, t′) + wcet(J_i,j) − et(J_i,j, [0, t]);  // reclaim the unused WCET of J_i,j
21          Else  // D(J′_cn) < D(J_x)
22              JobSlack(J_x, t) ← JobSlack(J_x, 0) + c − Idle(0, t);  // the dynamic job slack of J_x at time t
23          End
24          MinSlack ← Min(MinSlack, JobSlack(J_x, t), MinInitSlack(J_cn) + c − Idle(0, t));
25          J_x.MinSlack ← MinSlack;  // minimum job slack of all jobs with equivalent or lower priorities
26      End
27      Remove J_i,j from the job slack list; t′ ← t;
28  End
29  If (t is an instance at which job J_i,j is preempted)  // J_i,j can be the idle job: D(idle job) = +∞, et(idle job, [t′, t]) = Idle(t′, t)
30      ∀ J_x, J_x ∈ J_cn : J_c1  // browse the computation window from lowest to highest priority
31          If (D(J_x) < D(J_i,j))
32              JobSlack(J_x, t) ← JobSlack(J_x, t′) − et(J_i,j, [t′, t]);
33              MinSlack ← Min(MinSlack, JobSlack(J_x, t), MinInitSlack(J_cn) + c − Idle(0, t));
34              J_x.MinSlack ← MinSlack;
35          End
36      End
37      t′ ← t;
38  End
39  Update device dependent system slack:
40  ∀ λ_k, DevDepSysSlack(λ_k, t) ← J_{λ_k,t}.MinSlack;

Figure 15. The pseudocode for the DevDepSysSlack algorithm.

B Appendix

This section shows that Theorem 4.1 holds for EASD. With the EASD algorithm, a device is switched to the idle state when its device slack is larger than its break-even time. Therefore, there might be intervals in which the CPU is idle while pending jobs wait for required devices to be switched back to the active state. As discussed before, these idle intervals are device dependent system slack; an example is shown in Figure 5(a). We first consider the relationship between the device slack at time t and the device slack at time t + 1. The device slack at time t means the device slack at the time instant t, while the time unit t means the duration [t, t + 1). Suppose device λ_inact is not active during time unit t (including the case that λ_inact begins its state transition from the active state to the idle state at time t). With the following lemmas, we show that the device slack of this device at time t + 1 is at most 1 time unit less than the device slack at time t. Let α denote the set of all uncompleted jobs that require device λ_inact, and let J_exec be the job that occupies the CPU during time unit t. Therefore, J_exec ∉ α.

Lemma B.1. The first device request job for device λ_inact is the same at time t and at time t + 1. That is, J_{λ_inact,t} = J_{λ_inact,t+1}.

Proof: Suppose that J_x is the highest priority job in α, and J_y is the highest priority job among all jobs that do not require λ_inact and can possibly be blocked by some job(s) in α at time t. According to Definition A.1, J_{λ_inact,t} is either J_x or J_y. Firstly, J_exec ≠ J_x because J_x ∈ α and J_exec ∉ α; thus J_x is still the highest priority job in α at time t + 1. Secondly, J_exec ≠ J_y because J_exec is not blocked by any job. Since J_exec ∉ α, no new resources are acquired by jobs in α, nor are resources held by jobs in α released.
Therefore, J_y is still the highest priority job among all jobs that do not require λ_inact and can possibly be blocked by some job(s) in α at time t + 1. Therefore, J_{λ_inact,t} = J_{λ_inact,t+1}.

Lemma B.2. The device access delay and the device dependent system slack for device λ_inact cannot decrease at the same time. That is, DevAccessDelay(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) or DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t).

Proof: Suppose that DevAccessDelay(λ_inact, t+1) < DevAccessDelay(λ_inact, t). This means that J_exec has a higher priority than all jobs in α, and that the WCET of J_exec is included in DevAccessDelay(λ_inact, t). It follows

that D(J_exec) ≤ D(J_{λ_inact,t}). Thus the execution time of J_exec has already been subtracted from the dynamic job slack of J_{λ_inact,t} and of all lower priority jobs, and these slacks cannot decrease in this case. With Lemma B.1, we have J_{λ_inact,t+1} = J_{λ_inact,t}. Therefore, the device dependent system slack for device λ_inact does not decrease when the device access delay decreases.

Lemma B.3. The device dependent system slack decreases by at most 1 per time unit. That is, DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t) − 1.

Proof: From Lemma B.1, we know that J_{λ_inact,t+1} = J_{λ_inact,t}. The device dependent system slack at time t + 1 is the minimum dynamic job slack of all jobs with priorities equivalent to or lower than Org_Prio(J_{λ_inact,t+1}). The dynamic job slack of any job can decrease by at most one during a time unit. It follows that DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t) − 1.

Lemma B.4. DevAccessDelay(λ_inact, t+1) + DevDepSysSlack(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) + DevDepSysSlack(λ_inact, t) − 1.

Proof: We show this in the following two cases: (1) the WCET of J_exec is not included in DevAccessDelay(λ_inact, t); and (2) the WCET of J_exec is included in DevAccessDelay(λ_inact, t).

Case 1: The WCET of J_exec is not included in DevAccessDelay(λ_inact, t). In this case, DevAccessDelay(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t). Also DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t) − 1, according to Lemma B.3. Therefore, DevAccessDelay(λ_inact, t+1) + DevDepSysSlack(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) + DevDepSysSlack(λ_inact, t) − 1.

Case 2: The WCET of J_exec is included in DevAccessDelay(λ_inact, t). In this case, if J_exec is not completed at time t + 1, then DevAccessDelay(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) − 1. According to Lemma B.2, the device dependent system slack cannot decrease at the same time as the device access delay. Therefore, Lemma B.4 is true in this case.
On the other hand, if J_exec is completed at time t + 1, then DevAccessDelay(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) − (wcet(J_exec) − et(J_exec, [0, t+1])) − 1. The unused WCET of J_exec, i.e., wcet(J_exec) − et(J_exec, [0, t+1]), becomes additional job slack for J_{λ_inact,t} and all lower priority jobs. Therefore, DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t) + (wcet(J_exec) − et(J_exec, [0, t+1])). It follows that Lemma B.4 is true in this case, too.
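The bounds of Lemmas B.4 and B.5 combine into the device slack bound of Lemma B.6 below, since the device slack is the larger of TimeToNextDevReq and DevAccessDelay + DevDepSysSlack. A small numeric illustration, with hypothetical values:

```python
def dev_slack(time_to_next_req, access_delay, dep_sys_slack):
    """Device slack: the larger of TimeToNextDevReq and
    DevAccessDelay + DevDepSysSlack (recalled in Lemma B.6)."""
    return max(time_to_next_req, access_delay + dep_sys_slack)

# At time t: next request in 6 units, access delay 2, dependent slack 5.
s_t = dev_slack(6, 2, 5)
# One time unit later each quantity in the max may drop, but the request
# distance drops by at most 1 (Lemma B.5) and the delay+slack sum drops
# by at most 1 (Lemma B.4), so the device slack drops by at most 1:
s_t1 = dev_slack(5, 2, 4)
```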

Lemma B.5. The time to the next device request at time t + 1 is not less than the time to the next device request at time t minus 1. That is, TimeToNextDevReq(λ_inact, t+1) ≥ TimeToNextDevReq(λ_inact, t) − 1.

Proof: By Definition 4.2, TimeToNextDevReq(λ_inact, t) = NextDevReqTime(λ_inact, t) − t. Since the release time of any job is fixed, we have NextDevReqTime(λ_inact, t+1) ≥ NextDevReqTime(λ_inact, t). Therefore, TimeToNextDevReq(λ_inact, t+1) ≥ TimeToNextDevReq(λ_inact, t) − 1.

Lemma B.6. The device slack of device λ_inact at time t + 1 is not less than the device slack of the device at time t minus 1. That is, DevSlack(λ_inact, t+1) ≥ DevSlack(λ_inact, t) − 1.

Proof: Recall that the device slack of device λ_inact at time t is the larger of TimeToNextDevReq(λ_inact, t) and DevAccessDelay(λ_inact, t) + DevDepSysSlack(λ_inact, t). The correctness of Lemma B.6 then follows directly from Lemma B.4 and Lemma B.5.

We have now shown that the device slack of an inactive device decreases by at most 1 per time unit. We provide one more lemma before the proof of Theorem 4.1.

Lemma B.7. At any time t, jobs with equal deadlines have the same dynamic job slack. That is, JobSlack(J_i, t) = JobSlack(J_j, t), ∀J_i, J_j, D(J_i) = D(J_j).

Proof: According to Definition A.2, JobSlack(J_i, 0) = JobSlack(J_j, 0), ∀J_i, J_j, D(J_i) = D(J_j). Moreover, the update of dynamic job slack depends only on a job's deadline, as shown in Equation (10); that is, the same amount of slack is added to or subtracted from all jobs with equal deadlines at any time t. Therefore, JobSlack(J_i, t) = JobSlack(J_j, t), ∀J_i, J_j, D(J_i) = D(J_j).

Proof of Theorem 4.1: Assume Equation (5) holds but a job misses its deadline when scheduled with EASD. Let J_k be the first job that misses its deadline D(J_k), and let t_0 be the last time before D(J_k) such that there are no pending jobs with release times before t_0 and deadlines before or at D(J_k).
Since no job can be released before the system start time, t_0 is well defined. Let ρ be the set of jobs that are released in [t_0, D(J_k)] and have deadlines in [t_0, D(J_k)]. By the choice of t_0 and D(J_k), the jobs that execute in [t_0, D(J_k)] are the jobs in ρ and possibly one job that blocks a job in ρ. Since there are transition delays for devices, there might be some idle periods in [t_0, D(J_k)].

First of all, there can be at most one job J_b ∉ ρ that blocks any job in ρ, and the blocking job J_b must be released before t_0 and have a deadline larger than D(J_k). This conclusion follows directly from SRP, and the proof can be found in [2]. Next, we proceed with our proof in two cases: (1) there are idle intervals during [t_0, D(J_k)]; and (2) there is no idle interval during [t_0, D(J_k)].

Case 1: There are some idle intervals during [t_0, D(J_k)]. By the choice of t_0 and D(J_k), these idle intervals arise only from jobs waiting for required devices to become active. An example is shown in Figure 5. If several devices are required at the same time, we consider the device that takes the longest time to perform the state transition. Suppose an idle interval [t, t′] is caused by the state transition delay of a device λ_k. It is obvious that TimeToNextDevReq(λ_k, t) ≤ 0 and DevAccessDelay(λ_k, t) = 0. Thus DevSlack(λ_k, t) is equal to DevDepSysSlack(λ_k, t). With the EASD algorithm, t′ − t ≤ DevSlack(λ_k, t). Moreover, DevSlack(λ_k, t′) is no less than DevSlack(λ_k, t) − (t′ − t), according to Lemma B.6. Therefore DevSlack(λ_k, t′) = DevDepSysSlack(λ_k, t′) ≥ 0.

Assume that [t, t′] is the last idle interval during [t_0, D(J_k)]. It follows that DevDepSysSlack(λ_k, t′) ≥ 0 for at least one device λ_k that is required by some job J_x at some time in [t, t′]. Now we show that J_{λ_k,t′} ∈ ρ. We discuss two cases: (i) J_x ∈ ρ; and (ii) J_x ∉ ρ.

Case i: J_x ∈ ρ. According to Definition A.1, we have Org_Prio(J_{λ_k,t′}) ≥ Org_Prio(J_x). It follows that D(J_{λ_k,t′}) ≤ D(J_x) ≤ D(J_k). Therefore, J_{λ_k,t′} ∈ ρ.

Case ii: J_x ∉ ρ. As discussed before, the only job that is not in ρ but can execute during [t_0, D(J_k)] is the blocking job J_b. Thus J_x = J_b, and at least one job in ρ, say J_y, must be blocked by J_x. According to Definition A.1, Org_Prio(J_{λ_k,t′}) ≥ Org_Prio(J_y). It follows that J_{λ_k,t′} ∈ ρ, since D(J_{λ_k,t′}) ≤ D(J_y) ≤ D(J_k).

Recall that DevDepSysSlack(λ_k, t′) = min(JobSlack(J_i, t′)), where J_i ranges over all jobs with priority equivalent to or lower than Org_Prio(J_{λ_k,t′}). Since DevDepSysSlack(λ_k, t′) ≥ 0, the job slack of J_{λ_k,t′} and of all lower priority jobs is at least 0. Next we show that JobSlack(J_k, t′) ≥ 0, where J_k is the first job that misses its deadline. We discuss two cases: (i) D(J_{λ_k,t′}) < D(J_k); and (ii) D(J_{λ_k,t′}) = D(J_k).

Case i: D(J_{λ_k,t′}) < D(J_k). In this case, JobSlack(J_k, t′) ≥ 0 because Org_Prio(J_{λ_k,t′}) > Org_Prio(J_k).

Case ii: D(J_{λ_k,t′}) = D(J_k). According to Lemma B.7, JobSlack(J_k, t′) = JobSlack(J_{λ_k,t′}, t′) ≥ 0.

According to Definition A.3, JobSlack(J_k, D(J_k)) is no less than JobSlack(J_k, t′) − B(J_k), since there are no idle intervals after time t′. Therefore, at time D(J_k),

JobSlack(J_k, D(J_k)) ≥ JobSlack(J_k, t′) − B(J_k) ≥ −B(J_k),

that is, JobSlack(J_k, D(J_k)) + B(J_k) ≥ 0. Expanding with Definitions A.2 and A.3,

JobSlack(J_k, D(J_k)) + B(J_k) = D(J_k) − Σ_{D(J_i) ≤ D(J_k)} wcet(J_i) − Idle(0, D(J_k)) − Σ_{D(J_i) > D(J_k), R(J_i) < t} et(J_i, [0, D(J_k)]) + Σ_{D(J_i) ≤ D(J_k)} U_rem(J_i) ≥ 0.

This contradicts the assumption that J_k misses its deadline at D(J_k).

Case 2: There is no idle period during [t_0, D(J_k)]. In this case, the proof is the same as the one presented in [2], and a contradiction is obtained in the same way.

Thus, in conclusion, each case leads to a contradiction of the assumption that Equation (5) holds but a job misses a deadline. Therefore, Theorem 4.1 holds for EASD.

C Appendix

This section presents a sufficient schedulability condition for the ASD algorithm. Suppose n periodic tasks are sorted by their periods, P(T_1) ≤ P(T_2) ≤ ... ≤ P(T_n), and let λ̄_i be the device with the longest combined state transition delay, i.e., t_sw(λ̄_i) = t_sd(λ̄_i) + t_wu(λ̄_i), among all devices required by T_i. With EDF and SRP, each job suffers two context switches: one when the job starts its execution and another when it finishes its execution. Note that the context switch when a job is preempted is attributed to the preempting job; otherwise more than two context switches could be attributed to a job. The context switch cost cannot be ignored with ASD because of the state transition delays of devices, so context switch costs need to be included in each task's WCET.

We first consider the worst case context switch cost for a job starting its execution. For any job J_i,j that is selected to execute at time t, the worst case is that λ̄_i just started switching to the idle state at time t − 1 and thus needs to be switched back to the active state. This procedure takes at most t_sw(λ̄_i) − 1 time units, which is the worst case context switch overhead for a job starting its execution, as shown in Figure 16(a).

We next consider the worst case context switch cost for a job finishing its execution. Suppose that J_i,j starts its execution at time t and finishes at time t′. If J_i,j did not preempt any job using devices at time t, then the context switch cost for finishing execution is 0; otherwise, the devices required by the preempted job, say J_m,n, must be switched back to the active state, which takes at most t_sw(λ̄_m) − 1 time units. This time is included in the context switch cost for J_i,j finishing its execution, as shown in Figure 16(b). With EDF, a task can only preempt tasks with longer periods. Let λ̂_i be the device with the longest transition time among all devices required by tasks that have longer periods than T_i.
A sufficient schedulability condition for the ASD scheduling algorithm is given in Theorem C.1.

Theorem C.1. Suppose n periodic tasks are sorted by their periods. They are schedulable by the ASD algorithm if, for all k, 1 ≤ k ≤ n,

Σ_{i=1}^{k} (wcet(T_i) + t_sw(λ̄_i) + t_sw(λ̂_i) − 2) / P(T_i) + B(T_k) / P(T_k) ≤ 1,

where B(T_k) is the maximal length of time that a job of T_k can be blocked.

In fact, this is the same condition used for the EDF algorithm with SRP proposed in [2]. Since the context switch costs are included in the WCETs, the proof of Theorem C.1 follows directly from the proof presented in [2] and is omitted here. Note that tighter sufficient schedulability conditions may exist; however, addressing other scheduling conditions is beyond the scope of this work.

Figure 16. Context switch costs in ASD: (a) the context switch cost of a job starting its execution; (b) the context switch cost of a job finishing its execution. In (a), λ ∈ Dev(J_2,1); t_wu(λ) = t_sd(λ) = 4. At time 0, J_1,1 starts its execution and λ begins its state transition to the idle state. J_1,1 finishes its execution at time 1. The context switch cost for J_2,1 starting its execution is 7. In (b), λ ∈ Dev(J_1,1) and λ ∈ Dev(J_3,1); t_wu(λ) = t_sd(λ) = 2. At time 2, J_3,1 is preempted by J_2,1, which is in turn preempted by J_1,1. At time 15, when J_2,1 finishes its execution, λ is performing its state transition to the idle state; it is switched back to the active state at time 16 because J_3,1 requires λ. The context switch cost for J_2,1 finishing its execution is 3.
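The condition of Theorem C.1 is a cumulative utilization check over the period-sorted tasks; a minimal sketch, with hypothetical task parameters:

```python
def asd_schedulable(tasks):
    """tasks: list of (wcet, period, t_sw_start, t_sw_finish, blocking)
    tuples sorted by period, where t_sw_start stands for t_sw(λ̄_i) and
    t_sw_finish for t_sw(λ̂_i). Implements the test of Theorem C.1."""
    for k in range(len(tasks)):
        # Utilization of the first k+1 tasks, with context switch costs
        # folded into each WCET as in the theorem.
        total = sum((w + s1 + s2 - 2) / p
                    for (w, p, s1, s2, _b) in tasks[:k + 1])
        # Add the blocking term B(T_k) / P(T_k) for the k-th task.
        total += tasks[k][4] / tasks[k][1]
        if total > 1:
            return False
    return True

# Three hypothetical tasks (wcet, period, t_sw(λ̄), t_sw(λ̂), B):
ok = asd_schedulable([(3, 10, 1, 1, 0), (4, 15, 1, 1, 1), (5, 20, 1, 1, 0)])
# A set whose second prefix exceeds a utilization of 1:
too_loaded = asd_schedulable([(6, 10, 3, 3, 0), (5, 12, 3, 3, 0)])
```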


More information

Lecture 6. Real-Time Systems. Dynamic Priority Scheduling

Lecture 6. Real-Time Systems. Dynamic Priority Scheduling Real-Time Systems Lecture 6 Dynamic Priority Scheduling Online scheduling with dynamic priorities: Earliest Deadline First scheduling CPU utilization bound Optimality and comparison with RM: Schedulability

More information

Process Scheduling for RTS. RTS Scheduling Approach. Cyclic Executive Approach

Process Scheduling for RTS. RTS Scheduling Approach. Cyclic Executive Approach Process Scheduling for RTS Dr. Hugh Melvin, Dept. of IT, NUI,G RTS Scheduling Approach RTS typically control multiple parameters concurrently Eg. Flight Control System Speed, altitude, inclination etc..

More information

CSE 380 Computer Operating Systems

CSE 380 Computer Operating Systems CSE 380 Computer Operating Systems Instructor: Insup Lee & Dianna Xu University of Pennsylvania, Fall 2003 Lecture Note 3: CPU Scheduling 1 CPU SCHEDULING q How can OS schedule the allocation of CPU cycles

More information

Real-Time and Embedded Systems (M) Lecture 5

Real-Time and Embedded Systems (M) Lecture 5 Priority-driven Scheduling of Periodic Tasks (1) Real-Time and Embedded Systems (M) Lecture 5 Lecture Outline Assumptions Fixed-priority algorithms Rate monotonic Deadline monotonic Dynamic-priority algorithms

More information

A 2-Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value

A 2-Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value A -Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value Shuhui Li, Miao Song, Peng-Jun Wan, Shangping Ren Department of Engineering Mechanics,

More information

Resource Sharing Protocols for Real-Time Task Graph Systems

Resource Sharing Protocols for Real-Time Task Graph Systems Resource Sharing Protocols for Real-Time Task Graph Systems Nan Guan, Pontus Ekberg, Martin Stigge, Wang Yi Uppsala University, Sweden Northeastern University, China Abstract Previous works on real-time

More information

EDF Feasibility and Hardware Accelerators

EDF Feasibility and Hardware Accelerators EDF Feasibility and Hardware Accelerators Andrew Morton University of Waterloo, Waterloo, Canada, arrmorton@uwaterloo.ca Wayne M. Loucks University of Waterloo, Waterloo, Canada, wmloucks@pads.uwaterloo.ca

More information

Simulation of Process Scheduling Algorithms

Simulation of Process Scheduling Algorithms Simulation of Process Scheduling Algorithms Project Report Instructor: Dr. Raimund Ege Submitted by: Sonal Sood Pramod Barthwal Index 1. Introduction 2. Proposal 3. Background 3.1 What is a Process 4.

More information

Energy-Constrained Scheduling for Weakly-Hard Real-Time Systems

Energy-Constrained Scheduling for Weakly-Hard Real-Time Systems Energy-Constrained Scheduling for Weakly-Hard Real-Time Systems Tarek A. AlEnawy and Hakan Aydin Computer Science Department George Mason University Fairfax, VA 23 {thassan1,aydin}@cs.gmu.edu Abstract

More information

Task Models and Scheduling

Task Models and Scheduling Task Models and Scheduling Jan Reineke Saarland University June 27 th, 2013 With thanks to Jian-Jia Chen at KIT! Jan Reineke Task Models and Scheduling June 27 th, 2013 1 / 36 Task Models and Scheduling

More information

Aperiodic Task Scheduling

Aperiodic Task Scheduling Aperiodic Task Scheduling Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 12 Germany Springer, 2010 2017 年 11 月 29 日 These slides use Microsoft clip arts. Microsoft copyright

More information

Real-Time Systems. Lecture #14. Risat Pathan. Department of Computer Science and Engineering Chalmers University of Technology

Real-Time Systems. Lecture #14. Risat Pathan. Department of Computer Science and Engineering Chalmers University of Technology Real-Time Systems Lecture #14 Risat Pathan Department of Computer Science and Engineering Chalmers University of Technology Real-Time Systems Specification Implementation Multiprocessor scheduling -- Partitioned

More information

CPU SCHEDULING RONG ZHENG

CPU SCHEDULING RONG ZHENG CPU SCHEDULING RONG ZHENG OVERVIEW Why scheduling? Non-preemptive vs Preemptive policies FCFS, SJF, Round robin, multilevel queues with feedback, guaranteed scheduling 2 SHORT-TERM, MID-TERM, LONG- TERM

More information

Lecture Note #6: More on Task Scheduling EECS 571 Principles of Real-Time Embedded Systems Kang G. Shin EECS Department University of Michigan

Lecture Note #6: More on Task Scheduling EECS 571 Principles of Real-Time Embedded Systems Kang G. Shin EECS Department University of Michigan Lecture Note #6: More on Task Scheduling EECS 571 Principles of Real-Time Embedded Systems Kang G. Shin EECS Department University of Michigan Note 6-1 Mars Pathfinder Timing Hiccups? When: landed on the

More information

Non-preemptive Fixed Priority Scheduling of Hard Real-Time Periodic Tasks

Non-preemptive Fixed Priority Scheduling of Hard Real-Time Periodic Tasks Non-preemptive Fixed Priority Scheduling of Hard Real-Time Periodic Tasks Moonju Park Ubiquitous Computing Lab., IBM Korea, Seoul, Korea mjupark@kr.ibm.com Abstract. This paper addresses the problem of

More information

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2 Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2 Lecture Outline Scheduling periodic tasks The rate monotonic algorithm Definition Non-optimality Time-demand analysis...!2

More information

Andrew Morton University of Waterloo Canada

Andrew Morton University of Waterloo Canada EDF Feasibility and Hardware Accelerators Andrew Morton University of Waterloo Canada Outline 1) Introduction and motivation 2) Review of EDF and feasibility analysis 3) Hardware accelerators and scheduling

More information

Real-Time Scheduling. Real Time Operating Systems and Middleware. Luca Abeni

Real-Time Scheduling. Real Time Operating Systems and Middleware. Luca Abeni Real Time Operating Systems and Middleware Luca Abeni luca.abeni@unitn.it Definitions Algorithm logical procedure used to solve a problem Program formal description of an algorithm, using a programming

More information

Dynamic I/O Power Management for Hard Real-time Systems 1

Dynamic I/O Power Management for Hard Real-time Systems 1 Dynamic I/O Power Management for Hard Real-time Systems 1 Vishnu Swaminathan y, Krishnendu Chakrabarty y and S. S. Iyengar z y Department of Electrical & Computer Engineering z Department of Computer Science

More information

Priority-driven Scheduling of Periodic Tasks (1) Advanced Operating Systems (M) Lecture 4

Priority-driven Scheduling of Periodic Tasks (1) Advanced Operating Systems (M) Lecture 4 Priority-driven Scheduling of Periodic Tasks (1) Advanced Operating Systems (M) Lecture 4 Priority-driven Scheduling Assign priorities to jobs, based on their deadline or other timing constraint Make scheduling

More information

Real-Time Scheduling

Real-Time Scheduling 1 Real-Time Scheduling Formal Model [Some parts of this lecture are based on a real-time systems course of Colin Perkins http://csperkins.org/teaching/rtes/index.html] Real-Time Scheduling Formal Model

More information

AS computer hardware technology advances, both

AS computer hardware technology advances, both 1 Best-Harmonically-Fit Periodic Task Assignment Algorithm on Multiple Periodic Resources Chunhui Guo, Student Member, IEEE, Xiayu Hua, Student Member, IEEE, Hao Wu, Student Member, IEEE, Douglas Lautner,

More information

Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems 1

Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems 1 Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems 1 Vishnu Swaminathan and Krishnendu Chakrabarty Department of Electrical and Computer Engineering Duke University 130 Hudson

More information

Scheduling Periodic Real-Time Tasks on Uniprocessor Systems. LS 12, TU Dortmund

Scheduling Periodic Real-Time Tasks on Uniprocessor Systems. LS 12, TU Dortmund Scheduling Periodic Real-Time Tasks on Uniprocessor Systems Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 08, Dec., 2015 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 38 Periodic Control System Pseudo-code

More information

Efficient TDM-based Arbitration for Mixed-Criticality Systems on Multi-Cores

Efficient TDM-based Arbitration for Mixed-Criticality Systems on Multi-Cores Efficient TDM-based Arbitration for Mixed-Criticality Systems on Multi-Cores Florian Brandner with Farouk Hebbache, 2 Mathieu Jan, 2 Laurent Pautet LTCI, Télécom ParisTech, Université Paris-Saclay 2 CEA

More information

CMSC 451: Lecture 7 Greedy Algorithms for Scheduling Tuesday, Sep 19, 2017

CMSC 451: Lecture 7 Greedy Algorithms for Scheduling Tuesday, Sep 19, 2017 CMSC CMSC : Lecture Greedy Algorithms for Scheduling Tuesday, Sep 9, 0 Reading: Sects.. and. of KT. (Not covered in DPV.) Interval Scheduling: We continue our discussion of greedy algorithms with a number

More information

3. Scheduling issues. Common approaches 3. Common approaches 1. Preemption vs. non preemption. Common approaches 2. Further definitions

3. Scheduling issues. Common approaches 3. Common approaches 1. Preemption vs. non preemption. Common approaches 2. Further definitions Common approaches 3 3. Scheduling issues Priority-driven (event-driven) scheduling This class of algorithms is greedy They never leave available processing resources unutilized An available resource may

More information

On-line scheduling of periodic tasks in RT OS

On-line scheduling of periodic tasks in RT OS On-line scheduling of periodic tasks in RT OS Even if RT OS is used, it is needed to set up the task priority. The scheduling problem is solved on two levels: fixed priority assignment by RMS dynamic scheduling

More information

Probabilistic Preemption Control using Frequency Scaling for Sporadic Real-time Tasks

Probabilistic Preemption Control using Frequency Scaling for Sporadic Real-time Tasks Probabilistic Preemption Control using Frequency Scaling for Sporadic Real-time Tasks Abhilash Thekkilakattil, Radu Dobrin and Sasikumar Punnekkat Mälardalen Real-Time Research Center, Mälardalen University,

More information

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University Che-Wei Chang chewei@mail.cgu.edu.tw Department of Computer Science and Information Engineering, Chang Gung University } 2017/11/15 Midterm } 2017/11/22 Final Project Announcement 2 1. Introduction 2.

More information

Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems

Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems Song Han 1 Deji Chen 2 Ming Xiong 3 Aloysius K. Mok 1 1 The University of Texas at Austin 2 Emerson Process Management

More information

Exam Spring Embedded Systems. Prof. L. Thiele

Exam Spring Embedded Systems. Prof. L. Thiele Exam Spring 20 Embedded Systems Prof. L. Thiele NOTE: The given solution is only a proposal. For correctness, completeness, or understandability no responsibility is taken. Sommer 20 Eingebettete Systeme

More information

Energy-Efficient Real-Time Task Scheduling in Multiprocessor DVS Systems

Energy-Efficient Real-Time Task Scheduling in Multiprocessor DVS Systems Energy-Efficient Real-Time Task Scheduling in Multiprocessor DVS Systems Jian-Jia Chen *, Chuan Yue Yang, Tei-Wei Kuo, and Chi-Sheng Shih Embedded Systems and Wireless Networking Lab. Department of Computer

More information

Process Scheduling. Process Scheduling. CPU and I/O Bursts. CPU - I/O Burst Cycle. Variations in Bursts. Histogram of CPU Burst Times

Process Scheduling. Process Scheduling. CPU and I/O Bursts. CPU - I/O Burst Cycle. Variations in Bursts. Histogram of CPU Burst Times Scheduling The objective of multiprogramming is to have some process running all the time The objective of timesharing is to have the switch between processes so frequently that users can interact with

More information

CycleTandem: Energy-Saving Scheduling for Real-Time Systems with Hardware Accelerators

CycleTandem: Energy-Saving Scheduling for Real-Time Systems with Hardware Accelerators CycleTandem: Energy-Saving Scheduling for Real-Time Systems with Hardware Accelerators Sandeep D souza and Ragunathan (Raj) Rajkumar Carnegie Mellon University High (Energy) Cost of Accelerators Modern-day

More information

Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3

Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3 Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3 Lecture Outline The rate monotonic algorithm (cont d) Maximum utilisation test The deadline monotonic algorithm The earliest

More information

EECS 571 Principles of Real-Time Embedded Systems. Lecture Note #7: More on Uniprocessor Scheduling

EECS 571 Principles of Real-Time Embedded Systems. Lecture Note #7: More on Uniprocessor Scheduling EECS 571 Principles of Real-Time Embedded Systems Lecture Note #7: More on Uniprocessor Scheduling Kang G. Shin EECS Department University of Michigan Precedence and Exclusion Constraints Thus far, we

More information

Real-Time Dynamic Power Management through Device Forbidden Regions

Real-Time Dynamic Power Management through Device Forbidden Regions IEEE Real-Time and Embedded Technology and Applications Symposium Real-Time Dynamic Power Management through Device Forbidden Regions Vinay Devadas Hakan Aydin Department of Computer Science George Mason

More information

Resource Sharing in an Enhanced Rate-Based Execution Model

Resource Sharing in an Enhanced Rate-Based Execution Model In: Proceedings of the 15th Euromicro Conference on Real-Time Systems, Porto, Portugal, July 2003, pp. 131-140. Resource Sharing in an Enhanced Rate-Based Execution Model Xin Liu Steve Goddard Department

More information

Shedding the Shackles of Time-Division Multiplexing

Shedding the Shackles of Time-Division Multiplexing Shedding the Shackles of Time-Division Multiplexing Farouk Hebbache with Florian Brandner, 2 Mathieu Jan, Laurent Pautet 2 CEA List, LS 2 LTCI, Télécom ParisTech, Université Paris-Saclay Multi-core Architectures

More information

Scheduling Lecture 1: Scheduling on One Machine

Scheduling Lecture 1: Scheduling on One Machine Scheduling Lecture 1: Scheduling on One Machine Loris Marchal October 16, 2012 1 Generalities 1.1 Definition of scheduling allocation of limited resources to activities over time activities: tasks in computer

More information

System Model. Real-Time systems. Giuseppe Lipari. Scuola Superiore Sant Anna Pisa -Italy

System Model. Real-Time systems. Giuseppe Lipari. Scuola Superiore Sant Anna Pisa -Italy Real-Time systems System Model Giuseppe Lipari Scuola Superiore Sant Anna Pisa -Italy Corso di Sistemi in tempo reale Laurea Specialistica in Ingegneria dell Informazione Università di Pisa p. 1/?? Task

More information

TEMPORAL WORKLOAD ANALYSIS AND ITS APPLICATION TO POWER-AWARE SCHEDULING

TEMPORAL WORKLOAD ANALYSIS AND ITS APPLICATION TO POWER-AWARE SCHEDULING TEMPORAL WORKLOAD ANALYSIS AND ITS APPLICATION TO POWER-AWARE SCHEDULING Ye-In Seol 1, Jeong-Uk Kim 1 and Young-Kuk Kim 2, 1 Green Energy Institute, Sangmyung University, Seoul, South Korea 2 Dept. of

More information

Feedback EDF Scheduling of Real-Time Tasks Exploiting Dynamic Voltage Scaling

Feedback EDF Scheduling of Real-Time Tasks Exploiting Dynamic Voltage Scaling Feedback EDF Scheduling of Real-Time Tasks Exploiting Dynamic Voltage Scaling Yifan Zhu and Frank Mueller (mueller@cs.ncsu.edu) Department of Computer Science/ Center for Embedded Systems Research, North

More information

arxiv: v1 [cs.os] 6 Jun 2013

arxiv: v1 [cs.os] 6 Jun 2013 Partitioned scheduling of multimode multiprocessor real-time systems with temporal isolation Joël Goossens Pascal Richard arxiv:1306.1316v1 [cs.os] 6 Jun 2013 Abstract We consider the partitioned scheduling

More information

RUN-TIME EFFICIENT FEASIBILITY ANALYSIS OF UNI-PROCESSOR SYSTEMS WITH STATIC PRIORITIES

RUN-TIME EFFICIENT FEASIBILITY ANALYSIS OF UNI-PROCESSOR SYSTEMS WITH STATIC PRIORITIES RUN-TIME EFFICIENT FEASIBILITY ANALYSIS OF UNI-PROCESSOR SYSTEMS WITH STATIC PRIORITIES Department for Embedded Systems/Real-Time Systems, University of Ulm {name.surname}@informatik.uni-ulm.de Abstract:

More information

Lightweight Real-Time Synchronization under P-EDF on Symmetric and Asymmetric Multiprocessors

Lightweight Real-Time Synchronization under P-EDF on Symmetric and Asymmetric Multiprocessors Consistent * Complete * Well Documented * Easy to Reuse * Technical Report MPI-SWS-216-3 May 216 Lightweight Real-Time Synchronization under P-EDF on Symmetric and Asymmetric Multiprocessors (extended

More information

Generalized Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems

Generalized Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems Generalized Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems Vishnu Swaminathan and Krishnendu Chakrabarty Department of Electrical & Computer Engineering Duke University Durham,

More information

The Concurrent Consideration of Uncertainty in WCETs and Processor Speeds in Mixed Criticality Systems

The Concurrent Consideration of Uncertainty in WCETs and Processor Speeds in Mixed Criticality Systems The Concurrent Consideration of Uncertainty in WCETs and Processor Speeds in Mixed Criticality Systems Zhishan Guo and Sanjoy Baruah Department of Computer Science University of North Carolina at Chapel

More information

CPU scheduling. CPU Scheduling

CPU scheduling. CPU Scheduling EECS 3221 Operating System Fundamentals No.4 CPU scheduling Prof. Hui Jiang Dept of Electrical Engineering and Computer Science, York University CPU Scheduling CPU scheduling is the basis of multiprogramming

More information

Rate Monotonic Analysis (RMA)

Rate Monotonic Analysis (RMA) Rate Monotonic Analysis (RMA) ktw@csie.ntu.edu.tw (Real-Time and Embedded System Laboratory) Major References: An Introduction to Rate Monotonic Analysis Tutorial Notes SEI MU* Distributed Real-Time System

More information

System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation

System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation XILIANG ZHONG and CHENG-ZHONG XU Wayne State University We present a dynamic voltage scaling (DVS) technique that minimizes

More information

Scheduling. Uwe R. Zimmer & Alistair Rendell The Australian National University

Scheduling. Uwe R. Zimmer & Alistair Rendell The Australian National University 6 Scheduling Uwe R. Zimmer & Alistair Rendell The Australian National University References for this chapter [Bacon98] J. Bacon Concurrent Systems 1998 (2nd Edition) Addison Wesley Longman Ltd, ISBN 0-201-17767-6

More information

Scheduling of Frame-based Embedded Systems with Rechargeable Batteries

Scheduling of Frame-based Embedded Systems with Rechargeable Batteries Scheduling of Frame-based Embedded Systems with Rechargeable Batteries André Allavena Computer Science Department Cornell University Ithaca, NY 14853 andre@cs.cornell.edu Daniel Mossé Department of Computer

More information

Load Regulating Algorithm for Static-Priority Task Scheduling on Multiprocessors

Load Regulating Algorithm for Static-Priority Task Scheduling on Multiprocessors Technical Report No. 2009-7 Load Regulating Algorithm for Static-Priority Task Scheduling on Multiprocessors RISAT MAHMUD PATHAN JAN JONSSON Department of Computer Science and Engineering CHALMERS UNIVERSITY

More information

A Response-Time Analysis for Non-preemptive Job Sets under Global Scheduling

A Response-Time Analysis for Non-preemptive Job Sets under Global Scheduling A Response-Time Analysis for Non-preemptive Job Sets under Global Scheduling Mitra Nasri 1, Geoffrey Nelissen 2, and Björn B. Brandenburg 1 1 Max Planck Institute for Software Systems (MPI-SWS), Germany

More information

Real Time Operating Systems

Real Time Operating Systems Real Time Operating ystems Luca Abeni luca.abeni@unitn.it Interacting Tasks Until now, only independent tasks... A job never blocks or suspends A task only blocks on job termination In real world, jobs

More information

Real Time Operating Systems

Real Time Operating Systems Real Time Operating ystems hared Resources Luca Abeni Credits: Luigi Palopoli, Giuseppe Lipari, and Marco Di Natale cuola uperiore ant Anna Pisa -Italy Real Time Operating ystems p. 1 Interacting Tasks

More information

Task Reweighting under Global Scheduling on Multiprocessors

Task Reweighting under Global Scheduling on Multiprocessors ask Reweighting under Global Scheduling on Multiprocessors Aaron Block, James H. Anderson, and UmaMaheswari C. Devi Department of Computer Science, University of North Carolina at Chapel Hill March 7 Abstract

More information

Tardiness Bounds under Global EDF Scheduling on a Multiprocessor

Tardiness Bounds under Global EDF Scheduling on a Multiprocessor Tardiness ounds under Global EDF Scheduling on a Multiprocessor UmaMaheswari C. Devi and James H. Anderson Department of Computer Science The University of North Carolina at Chapel Hill Abstract This paper

More information

A Theory of Rate-Based Execution. A Theory of Rate-Based Execution

A Theory of Rate-Based Execution. A Theory of Rate-Based Execution Kevin Jeffay Department of Computer Science University of North Carolina at Chapel Hill jeffay@cs cs.unc.edu Steve Goddard Computer Science & Engineering University of Nebraska Ð Lincoln goddard@cse cse.unl.edu

More information

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Presenter: Tong Shu Authors: Tong Shu and Prof. Chase Q. Wu Big Data Center Department of Computer Science New Jersey Institute

More information

Controlling Preemption for Better Schedulability in Multi-Core Systems

Controlling Preemption for Better Schedulability in Multi-Core Systems 2012 IEEE 33rd Real-Time Systems Symposium Controlling Preemption for Better Schedulability in Multi-Core Systems Jinkyu Lee and Kang G. Shin Dept. of Electrical Engineering and Computer Science, The University

More information

Runtime feasibility check for non-preemptive real-time periodic tasks

Runtime feasibility check for non-preemptive real-time periodic tasks Information Processing Letters 97 (2006) 83 87 www.elsevier.com/locate/ipl Runtime feasibility check for non-preemptive real-time periodic tasks Sangwon Kim, Joonwon Lee, Jinsoo Kim Division of Computer

More information

TDDB68 Concurrent programming and operating systems. Lecture: CPU Scheduling II

TDDB68 Concurrent programming and operating systems. Lecture: CPU Scheduling II TDDB68 Concurrent programming and operating systems Lecture: CPU Scheduling II Mikael Asplund, Senior Lecturer Real-time Systems Laboratory Department of Computer and Information Science Copyright Notice:

More information

UC Santa Barbara. Operating Systems. Christopher Kruegel Department of Computer Science UC Santa Barbara

UC Santa Barbara. Operating Systems. Christopher Kruegel Department of Computer Science UC Santa Barbara Operating Systems Christopher Kruegel Department of Computer Science http://www.cs.ucsb.edu/~chris/ Many processes to execute, but one CPU OS time-multiplexes the CPU by operating context switching Between

More information

On the Soft Real-Time Optimality of Global EDF on Multiprocessors: From Identical to Uniform Heterogeneous

On the Soft Real-Time Optimality of Global EDF on Multiprocessors: From Identical to Uniform Heterogeneous On the Soft Real-Time Optimality of Global EDF on Multiprocessors: From Identical to Uniform Heterogeneous Kecheng Yang and James H. Anderson Department of Computer Science, University of North Carolina

More information

Schedulability analysis of global Deadline-Monotonic scheduling

Schedulability analysis of global Deadline-Monotonic scheduling Schedulability analysis of global Deadline-Monotonic scheduling Sanjoy Baruah Abstract The multiprocessor Deadline-Monotonic (DM) scheduling of sporadic task systems is studied. A new sufficient schedulability

More information

Real-time Systems: Scheduling Periodic Tasks

Real-time Systems: Scheduling Periodic Tasks Real-time Systems: Scheduling Periodic Tasks Advanced Operating Systems Lecture 15 This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of

More information

Scheduling I. Today Introduction to scheduling Classical algorithms. Next Time Advanced topics on scheduling

Scheduling I. Today Introduction to scheduling Classical algorithms. Next Time Advanced topics on scheduling Scheduling I Today Introduction to scheduling Classical algorithms Next Time Advanced topics on scheduling Scheduling out there You are the manager of a supermarket (ok, things don t always turn out the

More information

The FMLP + : An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis

The FMLP + : An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis The FMLP + : An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis Björn B. Brandenburg Max Planck Institute for Software Systems (MPI-SWS) Abstract Multiprocessor real-time

More information

Optimal Utilization Bounds for the Fixed-priority Scheduling of Periodic Task Systems on Identical Multiprocessors. Sanjoy K.

Optimal Utilization Bounds for the Fixed-priority Scheduling of Periodic Task Systems on Identical Multiprocessors. Sanjoy K. Optimal Utilization Bounds for the Fixed-priority Scheduling of Periodic Task Systems on Identical Multiprocessors Sanjoy K. Baruah Abstract In fixed-priority scheduling the priority of a job, once assigned,

More information

Schedule Table Generation for Time-Triggered Mixed Criticality Systems

Schedule Table Generation for Time-Triggered Mixed Criticality Systems Schedule Table Generation for Time-Triggered Mixed Criticality Systems Jens Theis and Gerhard Fohler Technische Universität Kaiserslautern, Germany Sanjoy Baruah The University of North Carolina, Chapel

More information

Paper Presentation. Amo Guangmo Tong. University of Taxes at Dallas January 24, 2014

Paper Presentation. Amo Guangmo Tong. University of Taxes at Dallas January 24, 2014 Paper Presentation Amo Guangmo Tong University of Taxes at Dallas gxt140030@utdallas.edu January 24, 2014 Amo Guangmo Tong (UTD) January 24, 2014 1 / 30 Overview 1 Tardiness Bounds under Global EDF Scheduling

More information

Module 5: CPU Scheduling

Module 5: CPU Scheduling Module 5: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation 5.1 Basic Concepts Maximum CPU utilization obtained

More information

Maximizing Rewards for Real-Time Applications with Energy Constraints

Maximizing Rewards for Real-Time Applications with Energy Constraints Maximizing Rewards for Real-Time Applications with Energy Constraints COSMIN RUSU, RAMI MELHEM, and DANIEL MOSSÉ University of Pittsburgh New technologies have brought about a proliferation of embedded

More information

Non-Work-Conserving Scheduling of Non-Preemptive Hard Real-Time Tasks Based on Fixed Priorities

Non-Work-Conserving Scheduling of Non-Preemptive Hard Real-Time Tasks Based on Fixed Priorities Non-Work-Conserving Scheduling of Non-Preemptive Hard Real-Time Tasks Based on Fixed Priorities Mitra Nasri, Gerhard Fohler Chair of Real-time Systems, Technische Universität Kaiserslautern, Germany {nasri,

More information

Supplement of Improvement of Real-Time Multi-Core Schedulability with Forced Non- Preemption

Supplement of Improvement of Real-Time Multi-Core Schedulability with Forced Non- Preemption 12 Supplement of Improvement of Real-Time Multi-Core Schedulability with Forced Non- Preemption Jinkyu Lee, Department of Computer Science and Engineering, Sungkyunkwan University, South Korea. Kang G.

More information

Polynomial Time Algorithms for Minimum Energy Scheduling

Polynomial Time Algorithms for Minimum Energy Scheduling Polynomial Time Algorithms for Minimum Energy Scheduling Philippe Baptiste 1, Marek Chrobak 2, and Christoph Dürr 1 1 CNRS, LIX UMR 7161, Ecole Polytechnique 91128 Palaiseau, France. Supported by CNRS/NSF

More information

Periodic scheduling 05/06/

Periodic scheduling 05/06/ Periodic scheduling T T or eriodic scheduling, the best that we can do is to design an algorithm which will always find a schedule if one exists. A scheduler is defined to be otimal iff it will find a

More information

Energy-aware Scheduling on Multiprocessor Platforms with Devices

Energy-aware Scheduling on Multiprocessor Platforms with Devices Energy-aware Scheduling on Multiprocessor Platforms with Devices Dawei Li, Jie Wu Keqin Li Dept. of Computer and Information Sciences Dept. of Computer Science Temple Univ., PA State Univ. of NY at New

More information

Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm

Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm Yves MOUAFO Supervisors A. CHOQUET-GENIET, G. LARGETEAU-SKAPIN OUTLINES 2 1. Context and Problematic 2. State of the art

More information

Chapter 6: CPU Scheduling

Chapter 6: CPU Scheduling Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation 6.1 Basic Concepts Maximum CPU utilization obtained

More information