Online Energy-Aware I/O Device Scheduling for Hard Real-Time Systems with Shared Resources


Abstract

The challenge in conserving energy in embedded real-time systems is to reduce power consumption while preserving temporal correctness. Previous research has focused on power conservation for the processor, while power conservation for I/O devices has received little attention. In this paper, we analyze the problem of online energy-aware I/O scheduling for hard real-time systems based on the preemptive periodic task model with non-preemptive shared resources. We propose two online energy-aware I/O device scheduling algorithms: Conservative Energy-Aware EDF (CEA-EDF) and Enhanced Aggressive Shut Down (EASD). The CEA-EDF algorithm makes conservative predictions of device usage and guarantees that a device is in the active state at or before the time the job that requires it is released. The EASD algorithm utilizes device slack to perform device power state transitions to save energy, without jeopardizing temporal correctness. Both algorithms are preemptive but support non-preemptive shared critical regions. An evaluation of the two approaches shows that both yield significant energy savings compared to using no Dynamic Power Management (DPM) techniques. The actual savings depend on the task set, the shared devices, and the power requirements of the devices. EASD provides better power savings than CEA-EDF. However, CEA-EDF has very low overhead and performs comparably to EASD when the system workload is low.

1 Introduction

In recent years, many embedded real-time systems have emerged with energy conservation requirements. Most of these systems consist of a microprocessor with I/O devices and batteries with limited power capacity. Therefore, aggressive energy conservation techniques are needed to extend their lifetimes.
Traditionally, the research community has focused on processor-based power management techniques, with many articles published on processor energy conservation. On the other hand, research on energy conservation for I/O devices has received little

attention. In practice, however, I/O devices are also important power consumers, yet they typically support fewer power states than processors. At any given instant, most devices can only be in one of two states: active or idle. To increase energy savings, the time for which a device is idle must be increased, but I/O devices take much longer than processors to perform power state transitions; a DSP device can take as long as 500 ms to switch states [19]. Furthermore, energy consumption during state transitions is not negligible, so too many state switches may increase power consumption rather than decrease it. The problem of saving energy for I/O devices in hard real-time systems thus poses a dilemma: we want to shut down a device whenever it is not being used, but at the risk of turning devices back on too late, which makes some jobs miss their deadlines, or of causing unnecessary state switches that waste energy. We discuss this in detail in Section 3.

In this paper, we analyze the problem of energy-aware I/O scheduling for hard real-time systems based on the preemptive periodic task model with non-preemptive shared resources. Without knowing the actual job execution times a priori, an optimal solution is not possible for either online or offline scheduling algorithms; here we define optimal as achieving the maximum energy savings for a task set. The actual savings depend on the task set, the actual execution times, the shared devices, and the power requirements of the devices. Two online scheduling algorithms that support shared resources are proposed: Conservative Energy-Aware EDF (CEA-EDF) and Enhanced Aggressive Shut Down (EASD). Both algorithms use Earliest Deadline First (EDF) [6] to schedule jobs and the Stack Resource Policy (SRP) [2] to control access to shared resources, which are typically granted to jobs on a non-preemptive basis and used in a mutually exclusive manner.
When performing preemptive scheduling with I/O devices, the devices become important shared resources whose access must be carefully managed. For example, a job that performs an uninterruptible I/O operation can block the execution of all jobs with higher priorities; thus the time for the uninterruptible I/O operation needs to be treated as a non-preemptive resource access. Other resources besides I/O devices include critical sections of code, reader/writer buffers, etc. The rest of this paper is organized as follows. Section 2 discusses related work. The problem of energy-aware I/O device scheduling is analyzed in Section 3. Section 4 describes the proposed algorithms. Section 5 describes how we evaluated our system and presents the results. Section 6 presents our conclusions and describes future work.

2 Related Work

In the past decade, much research has been conducted on low-power design methodologies for real-time embedded systems. For hard real-time systems, the research has focused primarily on reducing the power consumption of the processor; research on power conservation technologies for I/O devices, though important, has received little attention. Most Dynamic Power Management (DPM) techniques for devices are based on switching a device to a low-power state (or shutting it down) during an idle interval. DPM techniques for I/O devices in non-real-time systems focus on switching the devices into low-power states based on various policies (e.g., [9, 10, 8, 5, 23]). These strategies cannot be directly applied to real-time systems because of their non-deterministic nature. Some energy-aware I/O scheduling algorithms [18, 19, 20, 21] have been developed for hard real-time systems. Among them, [18, 19, 20] are non-preemptive methods, which are known to have limitations. With non-preemptive scheduling, a higher-priority task that has been released might have to wait a long time to run (until the current task gives up the CPU). This reduces the set of tasks that the scheduler can support with hard temporal guarantees. For example, non-preemptive scheduling algorithms cannot support any task set in which one task has a period shorter than or equal to the Worst Case Execution Time (WCET) of another task. For this reason, most commercial real-time kernels support preemptive task scheduling. In [18], Swaminathan et al. presented the Low Energy Device Scheduler (LEDES) for energy-aware I/O device scheduling in hard real-time systems. LEDES takes as input a pre-determined task schedule and a device-usage list for each task and generates a sequence of sleep/working states for each device. LEDES determines this sequence such that the energy consumed by the devices is minimized while guaranteeing that no task misses its deadline.
However, LEDES differs from our work in that it assumes that scheduling points always occur at task start or completion times; in other words, it can only support non-preemptive task scheduling. Another assumption is that the execution times of all tasks are greater than the transition times of the required devices. This assumption may not hold if some required devices have relatively large transition delays, e.g., disks. An extension of LEDES to handle I/O devices with multiple power states is presented in [20] by Swaminathan and Chakrabarty. The Multi-state Constrained Low Energy Scheduler (MUSCLES) takes as input a pre-computed task schedule and a per-task device-usage list and generates a sequence of power state switching times for I/O devices while guaranteeing that real-time constraints are not violated. MUSCLES is also a non-preemptive method. The pruning-based scheduling algorithm, Energy-optimal Device Scheduler (EDS), is an offline method in

which jobs are rearranged to find the minimum-energy task schedule [19]. EDS generates a schedule tree by selectively pruning its branches; pruning is done based on both temporal and energy constraints. Like LEDES and MUSCLES, EDS can only support non-preemptive scheduling. The only known published energy-aware algorithm for preemptive schedules, Maximum Device Overlap (MDO), is an offline method proposed by the same authors in [21]. The MDO algorithm uses a real-time scheduling algorithm, e.g., EDF or RM, to generate a feasible real-time job schedule, and then iteratively swaps job segments to reduce the energy consumed in device power state transitions. After the heuristic-based job schedule is generated, the device schedule is extracted; that is, device power state transition actions and times are recorded prior to runtime and used at runtime. A deficiency of the MDO algorithm is that it does not explicitly address resource blocking. It is usually impossible to estimate in the offline phase when resource blocking will happen, so it is hard to integrate a resource access policy into MDO. Without considering resource blocking, a feasible offline heuristic job schedule may result in an invalid online schedule, especially with swapping of job segments. Another problem with MDO is that it does not consider the situation in which actual job execution times are less than the WCET; the schedule is generated using each job's WCET. Even without resource blocking, the actual job executions can be very different from the pre-generated job schedule, and a fixed device schedule cannot effectively adapt to actual job executions. This problem is further discussed in Section 5. The CEA-EDF and EASD algorithms proposed in this paper remove these drawbacks by providing energy-saving scheduling for periodic task sets that have feasible preemptive schedules with blocking for shared resources.
To the best of our knowledge, no previous publication has addressed this problem. Another advantage of CEA-EDF and EASD over existing algorithms is that they support actual execution times less than the WCET: unused WCET is dynamically reclaimed to increase energy savings.

3 Problem Description

Modern I/O devices usually have at least two power states: active and idle. To save energy, a device can be switched to the idle state when it is not in use. In a real-time system, in order to guarantee that jobs meet their deadlines, a device cannot be made idle without knowing when it will next be requested by a job; however, the precise time at which an application requests a device from the operating system is usually not known. Even without knowing the exact request times, we can safely assume that devices are requested within the execution of the job making the request. Throughout the paper, we assume that task scheduling is based on EDF and resource access is based on SRP.

The EDF algorithm is a well-known optimal scheduling algorithm. SRP has two advantages over other resource access policies: (1) it has low context switch overhead; no job is ever blocked once its execution starts, and no job ever suffers more than two context switches, whereas for other policies, such as the Priority-Ceiling Protocol (PCP) [14], four context switches can occur if a job requires one or more resources. (2) A job can be blocked for at most the duration of one critical section, so the blocking time is bounded. The non-preemptable segment of a job is called a critical section.

3.1 Preliminaries

Suppose that the set of devices and resources required by each task during its execution is specified along with the temporal parameters of a periodic task set. More formally, given a periodic task set with deadlines equal to periods, τ = {T_1, T_2, ..., T_n}, let task T_i be specified by the four-tuple (P(T_i), wcet(T_i), Dev(T_i), Res(T_i)), where P(T_i) is the period, wcet(T_i) is the WCET, Dev(T_i) = {λ_1, λ_2, ..., λ_m} is the set of devices required by task T_i, and Res(T_i) = {r_1, r_2, ..., r_n} is the set of resources required by the task. Note that Dev(T_i) specifies the physical devices required by a task T_i, while Res(T_i) specifies how these devices appear as shared resources to task T_i. A non-preemptive device may appear as a shared resource with different access times to different tasks, and a preemptive device may be included in Dev(T_i) but not in Res(T_i). Furthermore, if the I/O operation of a device λ_i ∈ Dev(T_i) is non-interruptible, then a shared resource representing λ_i should be put in the resource set of T_i and of all tasks with higher priorities, to prevent a possible preemption.
For example, suppose that task T_i performs a non-interruptible I/O operation for 10 ms during its execution. Then a resource with an access time of 10 ms should be put in the resource set of T_i, and the same resource with an access time of 0 should be put in the resource set of all tasks that may preempt T_i (e.g., tasks with shorter periods under EDF). T_i can be preempted by higher priority tasks at any time before or after this I/O operation. In summary, a shared resource is a general concept in this model: if a section of task T_i is non-preemptive with respect to a subset of tasks α, then this section should be treated as a shared resource by T_i and all tasks in α. A task is an infinite sequence of jobs released every P(T_i) time units. We refer to the j-th job of a task T_i as J_i,j. We let Dev(J_i,j) denote the set of devices required by J_i,j; throughout this paper, Dev(J_i,j) = Dev(T_i). We let et(J_i,j, [t, t']) denote the execution time of job J_i,j during the interval [t, t']. It follows that et(J_i,j, [0, t]) is the actual execution time of J_i,j if t is equal to or larger than the time at which J_i,j finishes its execution. Furthermore, the priorities of all jobs are assigned according to EDF. For any two jobs, the job with the earlier deadline has the higher priority. If two jobs have equal deadlines, the job with the earlier release time has
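As a concrete illustration of this task model, the four-tuple and the resource-set convention for non-interruptible I/O can be sketched as follows. This is a minimal sketch: the class and field names, and the example device name "disk", are our own illustrative choices, not from the paper.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Task:
    period: int                                    # P(T_i); deadlines equal periods
    wcet: int                                      # wcet(T_i), worst-case execution time
    devices: frozenset = frozenset()               # Dev(T_i): physical devices used
    resources: dict = field(default_factory=dict)  # Res(T_i): resource -> access time

# T_i performs a non-interruptible I/O operation on "disk" for 10 ms, so a
# resource standing for that operation gets access time 10 in Res(T_i) ...
t_i = Task(period=30, wcet=12,
           devices=frozenset({"disk"}),
           resources={"disk_io": 10})

# ... and access time 0 in the resource set of every task that may preempt
# T_i (under EDF, tasks with shorter periods).
t_high = Task(period=12, wcet=2, resources={"disk_io": 0})
```

The zero access time in `t_high` records only that the resource participates in preemption-ceiling computations for the higher-priority task, without contributing blocking time.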

a higher priority. The original assigned priority of a job J_i,j is denoted Org_Prio(J_i,j). Note that Org_Prio(J_i,j) is not changed during execution, though the actual priority may change due to priority inheritance under SRP. Associated with each device λ_i are the following parameters: the transition time from the idle state to the active state, t_wu(λ_i); the transition time from the active state to the idle state, t_sd(λ_i); the energy consumed per unit time in the active state, P_active(λ_i); the energy consumed per unit time in the idle state, P_idle(λ_i); the energy consumed per unit time during the transition from the active state to the idle state, P_sd(λ_i); and the energy consumed per unit time during the transition from the idle state to the active state, P_wu(λ_i). We assume that for any device, a state switch can only be initiated when the device is in a stable state, i.e., the idle state or the active state. We use these parameters in the problem discussion and algorithm descriptions.

3.2 Motivation

The generalized problem that we aim to solve in this paper can be stated as: given a periodic task set τ = {T_1, T_2, ..., T_n}, with T_i = (P(T_i), wcet(T_i), Dev(T_i), Res(T_i)), is there a preemptive schedule that meets all deadlines and also reduces the energy consumed by the I/O system? The total energy consumed by a device λ_i over the hyperperiod H is given by

    E_λi = E_active + E_idle + E_sw    (1)

where E_active is the energy consumed when λ_i is in the active state, E_idle is the energy consumed when it is in the idle state, and E_sw is the energy consumed during state transitions. Let T_active(λ_i), T_idle(λ_i), T_wu(λ_i), and T_sd(λ_i) denote the time that device λ_i spends active, idle, and in wake-up/shut-down state transitions, respectively.
Then we have E_active = P_active(λ_i) · T_active(λ_i), E_idle = P_idle(λ_i) · T_idle(λ_i), and E_sw = P_wu(λ_i) · T_wu(λ_i) + P_sd(λ_i) · T_sd(λ_i). In addition, for most devices,

    P_active(λ_i), P_wu(λ_i), P_sd(λ_i) > P_idle(λ_i)    (2)

From Equations (1) and (2), it can be seen that to increase energy savings, an energy-aware scheduler should make it its first priority to decrease T_active(λ_i) as well as the number of power state transitions. However, it is usually hard to decrease both at the same time without affecting temporal correctness. For example, consider the obvious approach of aggressively shutting down devices whenever they are not needed, which we call the Aggressive Shut Down (ASD) algorithm. ASD reduces T_active as much as possible, but may increase energy consumption because it may introduce too many device state switches.
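The energy accounting of Equations (1) and (2) can be sketched directly; the function and parameter names, and the numbers, are our own illustrative choices. The example shows why aggressive shutdown can backfire when transition power is high:

```python
def device_energy(p_active, p_idle, p_wu, p_sd,
                  t_active, t_idle, t_wu, t_sd):
    """E = E_active + E_idle + E_sw, per Equation (1)."""
    e_active = p_active * t_active
    e_idle = p_idle * t_idle
    e_sw = p_wu * t_wu + p_sd * t_sd   # wake-up plus shut-down energy
    return e_active + e_idle + e_sw

# A 20-unit gap with an expensive-to-switch device (illustrative numbers):
keep_active = device_energy(2.0, 0.1, 5.0, 5.0,
                            t_active=20, t_idle=0, t_wu=0, t_sd=0)
shut_down = device_energy(2.0, 0.1, 5.0, 5.0,
                          t_active=0, t_idle=4, t_wu=8, t_sd=8)
print(keep_active, shut_down)   # 40.0 80.4: shutting down wastes energy here
```

Here the two transitions alone cost twice what staying active would, which is exactly the second constraint of ASD discussed below.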

Figure 1. The device state transition delay can cause system failure even when the system utilization is low. T_1 = {12, 2, ∅, ∅}; T_2 = {30, 5, {λ_k}, ∅}; t_wu(λ_k) = t_sd(λ_k) = 8. The system utilization is 33.3%.

The optimal solution to this problem would arrange job executions so that energy consumption is minimized while all tasks are still guaranteed to meet their deadlines. However, this is an NP-hard problem, and an efficient optimal solution is not possible for online scheduling due to its huge overhead. At first thought, an offline approach seems better because it can use pre-calculated task schedules. However, it is difficult to integrate a resource access policy into an offline approach because it is hard to predict, in the offline phase, the exact points at which jobs access resources. It is possible that a seemingly feasible offline job schedule causes jobs to miss their deadlines at runtime. Moreover, an offline approach alone can be inefficient, since it can only use the worst-case execution time of each task and cannot easily adapt to actual job executions. Compared to offline methods, it is simple to integrate a resource access policy into an online algorithm. Moreover, online algorithms can better adapt to run-time conditions: unused worst-case execution time can be exploited to increase T_idle for devices. In this paper, we seek online solutions to this problem. As discussed before, the ASD algorithm is a straightforward online method. However, it cannot be directly applied to hard real-time systems due to two inherent constraints:

1. To ensure timing constraints are met with ASD, device switch times must be included in each task's WCET, which may compromise the system's schedulability. Figure 1 shows an example where a system is not schedulable with the ASD algorithm, even though the CPU utilization is only 33.3%.
2. The ASD algorithm does not consider the energy consumption associated with device state transitions. Since the energy consumed by a state switch can be high, ASD may waste energy. Consider a device with very high switch power costs: it is easy to find scenarios where the ASD algorithm consumes more energy than keeping the device active all the time.

3.3 Approach

Despite these two constraints, the ASD algorithm can achieve excellent energy savings in many systems, since it reduces the T_active of devices as much as possible. The starting point of this work is to overcome the two constraints of the ASD algorithm and make it applicable to hard real-time systems. To this end, our solutions have two objectives: (1) the algorithms are applicable to all task sets schedulable with EDF and SRP; (2) the algorithms account for the energy consumed by device state switches. Therefore, our algorithms can guarantee energy savings in any feasible situation. Two online preemptive scheduling algorithms that support shared resources are proposed: Conservative Energy-Aware EDF (CEA-EDF) and Enhanced Aggressive Shut Down (EASD). For the first problem, CEA-EDF guarantees that a device is in the active state when a job requiring the device is released. Although this seems overly conservative, the algorithm achieves significant energy savings and can be implemented with very little overhead, yielding a good performance/cost ratio. EASD employs a different approach: it keeps track of the amount of time a device can be kept inactive without causing a job to miss its deadline, called device slack in this paper. A device is allowed to switch its state only when the device slack is large enough. A detailed discussion is presented in Section 4. Both algorithms utilize the concept of break-even time [3], which represents the minimum inactivity time required to compensate for the cost of entering and exiting the idle state. For example, suppose a device λ_k is active at time t and it is known that no job requires it during [t, t + Δt]. To save energy, λ_k can be switched to the idle state at time t and switched back to the active state at time t + Δt, provided Δt is larger than the transition time t_sd(λ_k) + t_wu(λ_k).
The energy expended by the device during this period is the sum of the energy expended during the transitions, E_wu and E_sd, and the energy expended in the idle state, E_idle. However, for a device λ_k that expends considerable energy during state transitions, it is possible that the device consumes less energy if it is kept active during this period; that is, E_wu + E_sd + E_idle > P_active(λ_k) · Δt. Therefore, λ_k needs to stay in the idle state long enough to save energy. Let BE(λ_i) denote the break-even time of device λ_i. Knowing the energy expended during transitions, E_wu(λ_k) = P_wu(λ_k) · t_wu(λ_k) and E_sd(λ_k) = P_sd(λ_k) · t_sd(λ_k), as well as the transition delay t_sw(λ_k) = t_wu(λ_k) + t_sd(λ_k), we can calculate the break-even time BE(λ_k) as

follows. Setting the energy consumed while staying active equal to the energy consumed by shutting down, idling, and waking up gives

    P_active(λ_k) · BE(λ_k) = E_wu(λ_k) + E_sd(λ_k) + P_idle(λ_k) · (BE(λ_k) - t_sw(λ_k))

which yields

    BE(λ_k) = (E_wu(λ_k) + E_sd(λ_k) - P_idle(λ_k) · t_sw(λ_k)) / (P_active(λ_k) - P_idle(λ_k))

Note that the break-even time also has to be at least the transition delay t_sw(λ_k). So the break-even time is given by

    BE(λ_k) = max(t_sw(λ_k), (E_wu(λ_k) + E_sd(λ_k) - P_idle(λ_k) · t_sw(λ_k)) / (P_active(λ_k) - P_idle(λ_k)))    (3)

It is clear that if a device is idle for less than the break-even time, it is not worth performing the state switch. Therefore, our approach bases device state transition decisions on the break-even time rather than on the device state transition delay alone. At this point, an obvious improvement to the ASD algorithm can be made so that it uses the break-even time to overcome the second constraint. The Switch-Aware Aggressive Shut Down (SA-ASD) algorithm, which makes this enhancement to ASD, and a sufficient schedulability condition are presented in Appendix D. Systems that satisfy the schedulability condition for SA-ASD should use SA-ASD rather than the proposed CEA-EDF and EASD algorithms, because of its lower scheduling overhead. Note, however, that SA-ASD still suffers from the first constraint, which is overcome by both CEA-EDF and EASD.

4 Algorithms

This section introduces two OS-directed, real-time DPM techniques applicable to I/O devices, based on EDF [6] and SRP [2]: CEA-EDF and EASD. We first briefly review how SRP works with EDF.

4.1 Review of EDF and SRP

Each task T_i is assigned a preemption level PL(T_i), which is the reciprocal of the task's period. The preemption ceiling of any resource r_i is the highest preemption level of all the tasks that require r_i. We use Π(t) to denote the current ceiling of the system, which is the highest preemption-level ceiling of all the resources in use at time t.
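Equation (3) can be sketched as follows; this is a minimal sketch with illustrative parameter values, and the function name is our own:

```python
def break_even(p_active, p_idle, p_wu, p_sd, t_wu, t_sd):
    """Break-even time BE(λ_k) per Equation (3)."""
    t_sw = t_wu + t_sd                 # total transition delay
    e_wu = p_wu * t_wu                 # wake-up energy
    e_sd = p_sd * t_sd                 # shut-down energy
    be = (e_wu + e_sd - p_idle * t_sw) / (p_active - p_idle)
    return max(t_sw, be)               # never shorter than the transition delay

# With P_active = 2, P_idle = 1, P_wu = P_sd = 4 and 1-unit transitions:
# (4 + 4 - 1*2) / (2 - 1) = 6, which exceeds t_sw = 2.
print(break_even(2.0, 1.0, 4.0, 4.0, 1.0, 1.0))   # 6.0
```

When transitions are cheap, the `max` makes the transition delay itself the binding constraint, matching the note above that the break-even time cannot be smaller than t_sw(λ_k).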
Φ is a nonexistent preemption level that is lower than the lowest preemption level of all tasks. The rules can be stated as follows [7].

1. Update of the Current Ceiling: Whenever all resources are free, the preemption ceiling of the system is Φ; otherwise, the preemption ceiling Π(t) is the highest preemption-level ceiling of all the resources

in use at time t. The preemption ceiling of the system is thus updated whenever a resource is allocated or freed (at the end of a critical section).

2. Scheduling Rule: After a job is released, it is blocked from starting execution until its preemption level is higher than both the current ceiling Π(t) of the system and the preemption level of the executing job. At any time t, jobs that are not blocked are scheduled on the processor according to their deadlines.

3. Allocation Rule: Whenever a job requests a resource, the resource is allocated to it.

4. Priority-Inheritance Rule: When some job is blocked from starting, the blocking job inherits the highest priority of all blocked jobs.

When scheduling with EDF and SRP, a job can be blocked for at most the duration of one critical section, which includes regions of shared resource access. The calculation of the maximal blocking duration can be found in [7]; the computation is done offline and the result is used at runtime. We use B(T_i) to denote the blocking duration for task T_i. In addition, the maximal blocking duration B(J_i,j) of each job J_i,j of task T_i is equal to B(T_i).

4.2 CEA-EDF

CEA-EDF is a simple, low-overhead energy-aware scheduling algorithm for hard real-time systems. All devices that a job needs are active at or before the job's release. Thus devices can be safely shut down without affecting the schedulability of tasks. Before describing CEA-EDF, we define the next device request time and the time to next device request, which are used to keep track of the earliest time at which a device is required.

Definition 4.1. Next Device Request Time. The next device request time, denoted NextDevReqTime(λ_k, t), is the earliest time at which device λ_k is requested by an uncompleted job.
Since a job can only use a device after the job is released, the next device request time of a device λ_k is given by

    NextDevReqTime(λ_k, t) = min{ R(J_i,j) : λ_k ∈ Dev(J_i,j) and J_i,j is not completed at time t }

where R(J_i,j) is the release time of job J_i,j and Dev(J_i,j) is the set of devices required by J_i,j.

Definition 4.2. Time To Next Device Request. The time to next device request for device λ_k at time t, denoted TimeToNextDevReq(λ_k, t), is the time from the current time t to the next device request time of λ_k:

    TimeToNextDevReq(λ_k, t) = NextDevReqTime(λ_k, t) - t
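Definitions 4.1 and 4.2 translate directly into code. This is a minimal sketch; the tuple representation of jobs is our own assumption:

```python
def next_dev_req_time(device, jobs):
    """NextDevReqTime(λ_k, t): earliest release time among uncompleted jobs
    that require `device` (Definition 4.1). `jobs` is an iterable of
    (release_time, required_devices, completed) tuples."""
    releases = [r for r, devs, done in jobs if device in devs and not done]
    return min(releases) if releases else None

def time_to_next_dev_req(device, jobs, t):
    """TimeToNextDevReq(λ_k, t) = NextDevReqTime(λ_k, t) - t (Definition 4.2)."""
    nxt = next_dev_req_time(device, jobs)
    return None if nxt is None else nxt - t

# Current jobs requiring λ_1 with release times 160, 200, 220 (cf. Table 1):
jobs = [(160, {"λ1", "λ2"}, False), (200, {"λ1"}, False), (220, {"λ1"}, False)]
print(next_dev_req_time("λ1", jobs))          # 160
print(time_to_next_dev_req("λ1", jobs, 150))  # 10
```

Returning `None` for a device no uncompleted job requires is our own convention for the case where the minimum is taken over an empty set.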

Device | Tasks requiring λ_k | Current jobs | R(J_i,j) | NextDevReqTime(λ_k, t) | Power-up time Up(λ_k)
λ_1 | {T_1, T_3, T_5} | {J_1,20, J_3,15, J_5,78} | {160, 200, 220} | 160 | -
λ_2 | {T_1, T_2} | {J_1,20, J_2,25} | {160, 210} | 160 | -
λ_3 | {T_3, T_4, T_6} | {J_3,15, J_4,25, J_6,18} | {200, 215, 207} | 200 | -

Table 1. Device usage table. The CEA-EDF scheduler uses this table to keep track of the next device request time for each device, and to power up an idle device λ_k based on the maintained power-up time Up(λ_k).

Function TimeToNextDevReq()
Input: current system time t and the currently executing job J_i,j
Output: updated time to next device request for all devices
If (t is the instant when job J_i,j is completed)
    α ← α - J_i,j + J_i,j+1                                   // α is the set of current jobs requiring λ_k
    For each λ_k ∈ Dev(J_i,j):
        NextDevReqTime(λ_k, t) ← min(R(J_m,n)), J_m,n ∈ α     // update the next device request time
        TimeToNextDevReq(λ_k, t) ← NextDevReqTime(λ_k, t) - t
Else
    // do nothing
End

Figure 2. Pseudocode for the TimeToNextDevReq algorithm, which updates the time to next device request for devices.

The CEA-EDF scheduler maintains a table for each device, as shown in Table 1. The current job of a task is the uncompleted job with the earliest deadline among all jobs of the task. For example, suppose J_1,1 is the first job of a task T_1, released at time 0 and completed at time 10. Then the current job of T_1 is J_1,1 during [0, 10). Suppose the second job of T_1, J_1,2, is released at time 40 and finishes at time 50. Then J_1,2 is the current job of T_1 during [10, 50). By Definition 4.1, the next device request time NextDevReqTime(λ_k, t) is the minimal release time of all current jobs that require device λ_k. Under CEA-EDF, a device λ_i is switched to the low-power state at time t when TimeToNextDevReq(λ_i, t) > BE(λ_i), where BE(λ_i) is the break-even time for device λ_i, computed using Equation (3).
CEA-EDF sets a power-up time, Up(λ_i), for device λ_i when λ_i is switched to the idle state. Any idle device is switched back to the active state when the power-up time Up(λ_i) equals the current time t. The CEA-EDF scheduling algorithm is described in Figure 3; it is invoked at scheduling points and when a power-up time is reached. We define scheduling points as the time instants at which jobs are released, complete, or exit critical sections. An example of CEA-EDF scheduling is illustrated in Figure 4. Our experiments in Section 5 show that CEA-EDF scheduling is effective at saving energy, especially when the system workload is low. Meanwhile, the implementation of CEA-EDF is simple.
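The device-management part of this policy (shut a device down only when the time to its next request exceeds its break-even time, and arm a power-up timer so it is active again by the release) can be sketched as follows. This is a minimal sketch; the dictionary-based device state is our own representation, not the paper's data structures:

```python
def cea_edf_device_step(t, states, up_timers, ttn, ndr, be, t_wu):
    """One CEA-EDF device-scheduling step at time t.
    states: device -> "active" | "idle"; up_timers: device -> power-up time;
    ttn/ndr: TimeToNextDevReq / NextDevReqTime per device;
    be/t_wu: break-even time and wake-up delay per device."""
    for dev in states:
        # Shut down only if the device can stay idle past its break-even time.
        if states[dev] == "active" and ttn[dev] > be[dev]:
            states[dev] = "idle"
            up_timers[dev] = ndr[dev] - t_wu[dev]   # active again by the release
        # Wake a device whose power-up time has arrived.
        elif states[dev] == "idle" and up_timers.get(dev) == t:
            states[dev] = "active"
            up_timers[dev] = None                   # clear the power-up timer

states = {"λ1": "active"}
up = {}
cea_edf_device_step(100, states, up, ttn={"λ1": 60}, ndr={"λ1": 160},
                    be={"λ1": 18}, t_wu={"λ1": 8})
print(states["λ1"], up["λ1"])   # idle 152
```

Because the timer is set to NextDevReqTime minus the wake-up delay, the device finishes its transition exactly when the requesting job is released, which is the guarantee CEA-EDF relies on.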

Preprocessing:
    Compute the break-even time BE(λ_k) (1 ≤ k ≤ m) for each device, as shown in Equation (3).
    Initialize the next device request time NextDevReqTime(λ_k, 0) (1 ≤ k ≤ m) for each device, as defined in Definition 4.1.
Device scheduling at time t:
    If (t is the instant when job J_i,j is completed)
        For each active device λ_k with TimeToNextDevReq(λ_k, t) > BE(λ_k):
            λ_k ← idle
            Up(λ_k) ← NextDevReqTime(λ_k, t) - t_wu(λ_k)   // set the power-up timer for λ_k
    End
    For each idle device λ_k with Up(λ_k) = t:             // switch λ_k to active when the current time is its power-up time
        λ_k ← active
        Up(λ_k) ← -1                                       // clear the power-up timer for λ_k
    End
    Schedule jobs by EDF (with SRP).

Figure 3. The CEA-EDF algorithm. BE(λ_k) is the break-even time for λ_k; t_wu(λ_k) is the transition delay from the idle state to the active state; Up(λ_k) is the power-up time set for λ_k, at which the device will be powered up.

Figure 4. CEA-EDF scheduling example: (a) the job schedule from EDF. J_1,1 is released at 6 and uses device λ_1; J_2,1 is released at 2 and uses device λ_2; J_1,1 has a higher priority than J_2,1. (b) The device state transitions under the CEA-EDF algorithm.

4.3 EASD

As discussed in Section 3, an energy-aware I/O device scheduler should reduce T_active(λ_i), the time that a device λ_i is in the active state, since a device usually has the highest energy consumption rate in the active state. The ASD algorithm reduces T_active(λ_i) for all λ_i. However, some task sets cannot use ASD because they cannot satisfy the schedulability conditions with WCETs that include device transition times. On the other hand, the CEA-EDF algorithm can be applied to any schedulable task set, but it is not as efficient as possible, since it conservatively keeps some devices active while the jobs requiring those devices are not currently executing.
The EASD algorithm, which is based on the ASD algorithm, addresses these limitations by keeping track of device slack, which is defined as follows.

Figure 5. Device slack examples. T_1 = {10, 4, ∅, ∅}; T_2 = {30, 4, {λ_k}, ∅}; that is, λ_k ∈ Dev(T_2). For device λ_k, t_sd(λ_k) = t_wu(λ_k) = 8 and BE(λ_k) = 18. (a) Job schedule and device schedule. (b) The device access delay for λ_k. (c) The device-dependent system slack for λ_k. (d) The device slack for λ_k, which is the sum of the device access delay shown in (b) and the device-dependent system slack shown in (c).

Definition 4.3. Device slack. The device slack is the maximal length of time that a device λ_i can be inactive without causing any job to miss its deadline. We let DevSlack(λ_i, t) denote the device slack of device λ_i at time t.

The energy savings provided by EASD are closely related to the amount of available device slack: the more device slack is exploited, the more opportunities there are to put devices in the idle state to save energy. Thus exploiting available device slack is the focus of EASD. Device slack for a device λ_k comes from different sources. As discussed for the CEA-EDF algorithm, the time to next device request is one source of available device slack, and CEA-EDF uses it to keep devices idle to save energy. However, other sources of device slack exist. For example, another source of device slack for a device λ_k is the execution of higher-priority jobs that do not require λ_k. In this case, jobs requiring λ_k cannot execute, which creates slack for the device. As shown in Figure 5, job J_1,1 occupies the CPU during [0, 4], and the interval [0, 4] becomes part of the device slack for device λ_k. This

kind of slack does not introduce idle intervals and thus does not jeopardize the temporal correctness of the system. In a system whose utilization is less than 1, the execution of a job can be postponed without jeopardizing temporal correctness, thus creating additional device slack. As shown in Figure 5, idle intervals are inserted in the interval [0, 22] because J_{2,1} is delayed by the state transition of device λ_k. However, this kind of slack must be carefully managed to maintain system schedulability. The EASD algorithm utilizes this kind of device slack while still guaranteeing that every job meets its deadline. In summary, three sources of device slack are identified in this paper: device access delay, device dependent system slack, and time to next device request. The time to next device request is defined in Definition 4.2.

Definition 4.4. Device access delay. The device access delay for a device λ_k is the time during which jobs requiring λ_k cannot execute because of the execution of higher priority jobs that do not need λ_k. The device access delay for a device λ_k at time t is denoted DevAccessDelay(λ_k, t).

Definition 4.5. Device dependent system slack. The device dependent system slack is the maximum amount of time that the CPU can be idle before the execution of any job requiring device λ_k without causing any job to miss its deadline. The device dependent system slack for a device λ_k at time t is denoted DevDepSysSlack(λ_k, t).

As shown in Figure 5, the device access delay and the device dependent system slack for device λ_k can be combined to form the device slack for λ_k because they represent non-overlapping slack. In the example shown in Figure 5, the device dependent system slack for device λ_k at time 0 is 14. That is, idle intervals with a total length of 14 time units can be inserted before the execution of J_{2,1} without causing any job to miss its deadline.
Note that the 14 units of idle time are separated into two intervals, namely [4, 10] and [14, 22], as shown in Figure 5(c). If a single idle interval of length 14 were inserted at time 4, job J_{1,2} would miss its deadline. An additional 8 time units of device slack come from the execution of jobs J_{1,1} and J_{1,2}, as shown in Figure 5(b). These two kinds of device slack do not decrease at the same time. In contrast, the time to next device request cannot be combined with either the device access delay or the device dependent system slack, because they might overlap. Therefore, the device slack of a device λ_k is given by

    DevSlack(λ_k, t) = max(TimeToNextDevReq(λ_k, t), DevAccessDelay(λ_k, t) + DevDepSysSlack(λ_k, t))    (4)

Given the device slack for each device at time t, it is straightforward to implement the EASD algorithm, which is presented in Figure 6. The algorithm contains three parts: (1) update the device slack for all devices; (2) perform device state transitions; and (3) schedule jobs with EDF and SRP.
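Equation (4) is a single max over the combined CPU-contention slack and the time to the next device request; a small sketch, where the numbers 8 and 14 echo the device access delay and device dependent system slack of the Figure 5 example and the rest are illustrative:

```python
# Device slack per Equation (4): the device access delay and the device
# dependent system slack add (they never overlap), and the sum competes
# with the time to the next device request. Names are illustrative.

def dev_slack(time_to_next_req, access_delay, dep_sys_slack):
    return max(time_to_next_req, access_delay + dep_sys_slack)

# Contention-driven slack dominates (8 + 14 = 22 > 5):
print(dev_slack(time_to_next_req=5, access_delay=8, dep_sys_slack=14))   # 22
# A distant next request dominates instead:
print(dev_slack(time_to_next_req=30, access_delay=8, dep_sys_slack=14))  # 30
```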

The EASD algorithm
// J_exec is the job that is selected to occupy the CPU at time t.
Update the device slack for all devices at time t:
  If (t is a scheduling point)
    ∀λ_k, DevSlack(λ_k, t) ← max(TimeToNextDevReq(λ_k, t), DevAccessDelay(λ_k, t) + DevDepSysSlack(λ_k, t));
  Else   // t is not a scheduling point
    ∀λ_k, DevSlack(λ_k, t) ← DevSlack(λ_k, t − 1) − 1;
  End
Perform device state transitions at time t:
  If (∃λ_k, λ_k ∉ Dev(J_exec) and λ_k = active and DevSlack(λ_k, t) > BE(λ_k))
    λ_k ← idle;
    t_enterIdle(λ_k) ← t;   // the time that λ_k starts the state transition to idle
  End
  // The next condition ensures that λ_k has been idle long enough to compensate for the energy consumed in the state transition.
  If (∃λ_k, λ_k = idle and t − t_enterIdle(λ_k) ≥ BE(λ_k) − t_wu(λ_k))
    If (λ_k ∈ Dev(J_exec) or DevSlack(λ_k, t) ≤ t_wu(λ_k))
      λ_k ← active;
    End
  End
Schedule jobs with EDF(SRP).

Figure 6. The pseudocode for the EASD algorithm.

As shown in Equation (4), updating the device slack requires updating the next device request time, the device access delay, and the device dependent system slack. Note that these three quantities are computed only at scheduling points, i.e., the time instants of job completion, job release, and exits from critical sections. At time instants other than scheduling points, the device slack of every device is simply decreased by 1 per time unit. The algorithms to compute the device access delay and the device dependent system slack are provided in Appendix A. In those computations, we assume that there are n tasks in the system and that the current job of each task T_i at time t is J_ci, whose absolute deadline is denoted D(J_ci). Without loss of generality, suppose the current jobs J_c1 and J_cn have the earliest deadline D(J_c1) and the latest deadline D(J_cn) among all current jobs, respectively.
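The per-instant logic of Figure 6 splits into a slack update and a pair of transition rules. A hedged Python sketch of that structure, in which the function names and the exact form of the wake-up conditions are reconstructions of the figure rather than the authors' code:

```python
# Sketch of EASD's per-instant logic. `recompute(dev, t)` stands for a full
# Equation (4) evaluation; all names are illustrative.

def update_slack(slacks, t, is_sched_point, recompute):
    """slacks: dict mapping device -> slack value."""
    for dev in slacks:
        if is_sched_point:
            slacks[dev] = recompute(dev, t)   # full update at scheduling points
        else:
            slacks[dev] -= 1                  # otherwise just decay by one unit

def transition(dev, t, slack, exec_devs, be, t_wu, state, t_enter_idle):
    """Return the new (state, t_enter_idle) for one device."""
    if state == "active" and dev not in exec_devs and slack > be:
        return "idle", t                      # enough slack to amortize the switch
    if state == "idle" and t - t_enter_idle >= be - t_wu:
        # idle long enough to pay back the transition energy; wake up if the
        # running job needs the device or the remaining slack is nearly gone
        if dev in exec_devs or slack <= t_wu:
            return "active", t_enter_idle
    return state, t_enter_idle

slacks = {"d1": 25, "d2": 9}
update_slack(slacks, t=1, is_sched_point=False, recompute=None)
print(slacks)                                             # {'d1': 24, 'd2': 8}
print(transition("d1", t=0, slack=25, exec_devs=set(),
                 be=18, t_wu=8, state="active", t_enter_idle=None))  # ('idle', 0)
```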
In the example illustrated in Figure 5(a), we have n = 2, J_c1 = J_{1,2}, and J_cn = J_{2,1} at time 5. As shown in Appendix A, the device slack for all devices can be updated by examining each job J_i with Org_Prio(J_cn) ≤ Org_Prio(J_i) ≤ Org_Prio(J_c1). The worst case computational complexity of this algorithm at scheduling points is O(m + n + K), where m is the number of uncompleted jobs with priorities within [Org_Prio(J_cn), Org_Prio(J_c1)], n is the total number of tasks in the system, and K is the total number of devices. Let T_l be the task with the longest period and T_s be the task with the shortest period. In the worst case, m is a function of 2⌈P(T_l)/P(T_s)⌉(n − 1).

4.4. Schedulability

This section presents a sufficient schedulability condition for the CEA-EDF and EASD scheduling algorithms. The condition is the same as that used for the EDF algorithm with SRP [2]:

Theorem 4.1. Suppose n periodic tasks are sorted by their periods. They are schedulable by CEA-EDF and EASD if, for all k, 1 ≤ k ≤ n,

    Σ_{i=1}^{k} wcet(T_i)/P(T_i) + B(T_k)/P(T_k) ≤ 1,    (5)

where B(T_k) is the maximal length of time that a job of T_k can be blocked by accesses to non-preemptive resources, including both I/O device resources and non-I/O-device resources. Note that the device state transition delay is not included.

With the CEA-EDF algorithm, a device λ_k is guaranteed to be in the active state when any job requiring λ_k is released. Therefore, CEA-EDF does not affect the schedulability of any system; in other words, Theorem 4.1 holds for CEA-EDF. The problem for EASD is much more complex. With the EASD algorithm, a device is switched to the idle state when its device slack is larger than its break-even time. Therefore, there may be intervals in which the CPU is idle while pending jobs wait for required devices to be switched to the active state. Since the proof for the EASD algorithm requires knowledge of job slack and device dependent system slack, it is presented in Appendix B.

5 Evaluation

In this section, we present evaluation results for the CEA-EDF and EASD algorithms. Section 5.1 describes the evaluation methodology used in this study. Section 5.2 evaluates the algorithms under various system utilizations. Section 5.3 evaluates the ability of CEA-EDF and EASD to reclaim unused WCETs to save energy, and compares the performance of MDO with CEA-EDF and EASD. Section 5.4 compares the CEA-EDF and EASD algorithms.

5.1. Methodology

We evaluated the CEA-EDF and EASD algorithms using an event-driven simulator.
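The condition of Theorem 4.1 can be checked in one pass over the tasks in period order; a sketch, where the task tuples are illustrative:

```python
# Sufficient schedulability test of Theorem 4.1 (EDF with SRP): for every k,
# the cumulative utilization of the k shortest-period tasks plus the blocking
# term B(T_k)/P(T_k) must not exceed 1. Task tuples are illustrative.

def is_schedulable(tasks):
    """tasks: list of (period, wcet, blocking) tuples."""
    tasks = sorted(tasks, key=lambda t: t[0])  # sort by period, as the theorem assumes
    util = 0.0
    for period, wcet, blocking in tasks:
        util += wcet / period
        if util + blocking / period > 1.0:
            return False
    return True

print(is_schedulable([(10, 4, 0), (30, 4, 3)]))  # True  (0.4; then 0.53 + 0.1)
print(is_schedulable([(10, 6, 0), (20, 8, 2)]))  # False (0.6 + 0.4 + 0.1 > 1)
```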
This approach is consistent with the evaluation approaches adopted by other researchers for energy-aware I/O scheduling [18, 19, 20]. To better evaluate the two algorithms, we compute the minimal energy requirement, LOW-BOUND, for each simulation. The

LOW-BOUND is acquired by assuming that the time and energy overhead of device state transitions is 0: a device is shut off whenever it is not required by the currently executing job, and is powered up as soon as a job requiring it is executing. Therefore, the LOW-BOUND represents an energy consumption level that is not achievable by any scheduling algorithm. The power requirements and state switching times for devices were obtained from data sheets provided by the manufacturers. The devices used in the experiments are listed in Table 2.

Table 2. Device specifications, giving P_active (W), P_idle (W), P_wu/P_sd (W), and t_wu/t_sd (ms)¹ for each device: (1) Realtek Ethernet Chip [13]; (2) MaxStream Wireless module [11]; (3) IBM Microdrive [17]; (4) SST Flash SST39LF020 [16]; (5) SimpleTech Flash Card [15].

The normalized energy savings is used to evaluate the energy savings of the algorithms. It is the amount of energy saved under a DPM algorithm relative to the case in which no DPM technique is used, wherein all devices remain in the active state for the entire simulation. The normalized energy savings is computed using Equation (6):

    Normalized Energy Savings = 1 − (Energy with CEA-EDF or EASD) / (Energy with No DPM)    (6)

In all experiments, we used randomly generated task sets to evaluate the performance of the CEA-EDF and EASD algorithms. All task sets were pretested to satisfy the feasibility condition shown in Equation (5). Each generated task set contained 1 to 10 tasks, and each task required a random number (0 to 3) of devices from Table 2. Critical sections of all jobs were randomly generated. Other characteristics, including task periods and the best/worst case execution time ratio, are specified in each experiment. We repeated each experiment 500 times and report the mean value. Throughout the experiments, all jobs met their deadlines under both the CEA-EDF and EASD algorithms.
Although the worst case computational complexity of EASD is briefly discussed in Section 4.3, it may still be a concern that EASD has too much scheduling overhead in practice. We did not measure scheduling overhead on real systems, since all algorithms were evaluated in simulation. Instead, we compared the scheduling overhead of CEA-EDF and EASD with that of EDF(SRP) in our simulations, using the relative scheduling overhead:

    relative scheduling overhead = (scheduling overhead with CEA-EDF or EASD) / (scheduling overhead with EDF(SRP)) − 1

¹ Most vendors report only a single switching time; we therefore used this time for both t_wu and t_sd.
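Both evaluation metrics are simple ratios; a sketch, where the 59.3% figure is the mean EASD overhead reported later in this section and the other numbers are illustrative:

```python
# Normalized energy savings (Equation (6)) and relative scheduling overhead
# as straightforward ratios. Input values here are illustrative.

def normalized_energy_savings(e_dpm, e_no_dpm):
    """1 minus the energy under DPM relative to no-DPM energy."""
    return 1.0 - e_dpm / e_no_dpm

def relative_overhead(o_alg, o_edf_srp):
    """Extra scheduling cost relative to plain EDF(SRP)."""
    return o_alg / o_edf_srp - 1.0

print(normalized_energy_savings(40.0, 100.0))  # 0.6
# An EDF(SRP) scheduler spending 1 of every 1000 time units on scheduling,
# combined with EASD's 59.3% relative overhead, spends 1.593 of every 1000:
print(1.0 * (1.0 + 0.593))                     # 1.593
```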

Figure 7. Normalized energy savings with multiple devices: (a) mean normalized energy savings for different system utilization settings for task sets with periods in [50, 200]; (b) for task sets with periods in [200, 2000]; (c) for task sets with periods in [2000, 8000].

The mean value of the relative scheduling overhead of CEA-EDF is 3.1%, verifying that CEA-EDF is a low-overhead algorithm. The mean value of the relative scheduling overhead of EASD is 59.3%. Considering that the scheduling overhead of EDF(SRP) is very low, a relative overhead of 59.3% is affordable. For example, if a system spends 1 time unit out of every 1000 performing scheduling with EDF(SRP), then it will spend only about 1.6 time units out of every 1000 performing scheduling with EASD. Therefore, the scheduling overhead of EASD is very low with respect to the whole system.

5.2. Average energy savings

In this experiment, we measured the overall performance of CEA-EDF and EASD. Task periods were chosen from three groups: [50, 200], [200, 2000], and [2000, 8000], which represent the short-period, mid-period, and long-period groups, respectively. The intention of experimenting with different ranges of task periods is to evaluate the relation between energy savings and the ratio of task periods to device state transition times. Within each group, task periods and WCETs were randomly selected such that the resulting task sets are schedulable according to Theorem 4.1.
We first focus on the relationship between normalized energy savings and the system utilization, which is the sum of the worst case utilizations of all tasks. In this experiment, we set the best/worst case execution time ratio to 1. Figure 7 shows the mean normalized energy savings of CEA-EDF and EASD under different system utilizations. On average, EASD saves more energy than CEA-EDF. In most cases, as the system utilization increases, the normalized energy savings decrease; the rationale is that as tasks execute more, the amount of time devices can be kept in the idle state decreases. It can also be seen from the figure that the performance of EASD is comparable to the LOW-BOUND.

Figure 8. Reclaiming unused WCETs to save energy: (a) normalized energy savings for various ratios of the best case execution time to the worst case execution time; (b) the same comparison including MDO; note that there are no shared resources in this experiment, since MDO does not address the issue of resource blocking.

An important, albeit intuitive, finding is that the ratio of device state transition times to task periods greatly affects the energy savings. Both algorithms perform worst in the experiment with short periods, as shown in Figure 7(a). This is consistent with our expectations: if a device λ_k takes a long time to perform a state transition and is used in a system in which tasks have very short periods, then λ_k has little chance to be switched to the idle state. Furthermore, the performance of EASD is close to optimal when tasks are in the long-period group, as shown in Figure 7(c). It can be seen that EASD is more sensitive to task periods than CEA-EDF. This is also consistent with our discussion of device slack in Appendix A, since device slack is closely related to task periods: at the same system utilization, the mean device slack in a system with longer periods should be larger than in a system with shorter periods.

5.3. Reclaiming unused WCETs to save energy

In practice, actual job execution times can be less than their WCETs, and the unused portion can be reclaimed to save energy. First, we evaluate the ability of CEA-EDF and EASD to save energy by dynamically reclaiming the slack coming from unused WCETs.
Figure 8(a) shows the normalized energy savings of CEA-EDF and EASD for increasing best/worst case execution time ratios. In this experiment, the system utilization is set between 90% and 100%, and task periods are chosen from the mid-period group, i.e., [200, 2000]. As in the first experiment, critical sections of all jobs were randomly generated. As shown in Figure 8(a), both CEA-EDF and EASD save more energy when the best/worst case execution time ratio is smaller, showing that both algorithms can

dynamically reclaim unused WCETs to save energy. Moreover, the difference between EASD and LOW-BOUND remains almost unchanged across different best/worst case execution time ratios, which means that EASD is able to fully reclaim the slack created by unused WCETs. Similar results were obtained for the short-period and long-period groups, and are therefore omitted here. In addition, we compare the energy savings of CEA-EDF and EASD with MDO, which is the only published energy-aware device scheduling algorithm for preemptive schedules. This comparison is intended to evaluate the advantage of online algorithms (CEA-EDF and EASD) over an offline-only algorithm (MDO) in utilizing unused WCETs to save energy. As discussed in Section 2, MDO cannot support shared non-preemptive resources; therefore, no critical sections were generated for any job in this experiment. That is, the task sets used in this experiment are fully preemptive. As shown in Figure 8(b), MDO achieves only an additional 0.74% energy savings over EASD when job execution times are equal to their WCETs, i.e., when the runtime job execution is exactly as computed by MDO in the offline phase. However, the energy savings of MDO do not increase when the best/worst case execution time ratio decreases, because MDO does not utilize unused WCETs to save energy [4]. It can be seen from Figure 8(b) that MDO even saves less energy than CEA-EDF when the best/worst case execution time ratio is small.

5.4. Comparison of CEA-EDF and EASD

The last experiment compares the energy savings of EASD to those of CEA-EDF. We use the normalized additional energy savings to evaluate the additional energy savings of the EASD algorithm. The normalized additional energy savings is the amount of energy saved under the EASD algorithm relative to the CEA-EDF algorithm, computed using Equation (7).
    Normalized Additional Energy Savings = 1 − (Energy with EASD) / (Energy with CEA-EDF)    (7)

In this experiment, task periods are chosen from the mid-period group, i.e., [200, 2000], and the best/worst case execution time ratio is set to 1. The distribution of normalized additional energy savings over three ranges of system utilization is presented in Figure 9. The results are consistent with the previous experiments. CEA-EDF performs well when the system workload is low: when the system utilization is less than 10%, CEA-EDF performs almost the same as EASD, and when the system utilization is less than 60%, CEA-EDF performs close to EASD. There are a few instances in which the EASD algorithm actually results in more energy being consumed than the CEA-EDF algorithm. This is because EASD tries to reduce the time that devices spend in the active state, but in doing so it causes more device state switches. A remarkable result from these experiments is that CEA-EDF performs well, on average, compared to EASD

Figure 9. Comparison of CEA-EDF and EASD: (a) the distribution of normalized additional energy savings with system utilization of 0-10%; (b) with system utilization of 40-50%; (c) in the highest utilization range. The X axis represents the normalized additional energy savings of EASD over CEA-EDF; the Y axis represents the percentage of simulations attaining each level of normalized additional energy savings.

when the system workload is low. Even in cases where the system utilization is near 100%, the CEA-EDF algorithm can still achieve nearly 40% energy savings for I/O devices. Moreover, CEA-EDF can be used together with an energy-aware processor scheduler without any modification, because CEA-EDF has no influence on processor scheduling while offering an excellent performance/cost ratio. We will investigate the integration of CEA-EDF with energy-aware processor scheduling in future research.

6 Conclusion

Two hard real-time scheduling algorithms were presented for conserving energy in device subsystems. Both algorithms support the preemptive scheduling of periodic tasks with non-preemptive shared resources. The CEA-EDF algorithm, though a relatively simple extension to EDF scheduling, provides remarkable power savings when the system workload is low. On the other hand, EASD can produce more energy savings than CEA-EDF. Ultimately, the choice of which energy saving algorithm to use, if any, depends on the temporal parameters of the task set and the devices utilized.
Although power management of the processor is not addressed in this paper, our work can be applied to reduce the leakage power consumption of the processor, which is expected to become an increasingly large fraction of processor energy consumption. Leakage power is reduced by disabling all or part of the processor whenever possible. Therefore, by treating the CPU as a shared device required by all tasks, our algorithms can be applied without any modification. In general, CEA-EDF and EASD do not produce the minimum-energy schedule when multiple devices are

shared. The problem of finding a feasible schedule that consumes minimum I/O device energy is NP-hard. Hence, our focus was not to find the optimal solution but to create algorithms that reduce the energy consumption of multiple shared devices and that can be executed online to adapt to the workload. This work provides the foundation for a family of general, online energy saving algorithms that can be applied to systems with hard temporal constraints.

References

[1] Advanced Configuration & Power Interface Specification, August 2003.
[2] Baker, T.P., Stack-Based Scheduling of Real-Time Processes, Real-Time Systems, 3(1):67-99, March 1991.
[3] Benini, L., Bogliolo, A., and De Micheli, G., A Survey of Design Techniques for System-Level Dynamic Power Management, IEEE Transactions on VLSI Systems, vol. 8, June 2000.
[4] Chakrabarty, K., correspondence with the author of the MDO algorithm, May.
[5] Golding, R.A., Bosch, P., Staelin, C., Sullivan, T., and Wilkes, J., Idleness Is Not Sloth, Proceedings of the Winter USENIX Conference, 1995.
[6] Liu, C.L., and Layland, J.W., Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment, Journal of the ACM, 20(1), January 1973.
[7] Liu, J., Real-Time Systems, Prentice Hall, 2000.
[8] Lu, Y.H., and Benini, L., Power-Aware Operating Systems for Interactive Systems, IEEE Transactions on Very Large Scale Integration Systems, 10(2), April 2002.
[9] Lu, Y.H., Benini, L., and De Micheli, G., Operating-System Directed Power Reduction, International Symposium on Low Power Electronics and Design, 2000.
[10] Lu, Y.H., Benini, L., and De Micheli, G., Requester-Aware Power Reduction, International Symposium on System Synthesis, Stanford University, pages 18-23, September 2000.
[11] MaxStream 9XStream 900 MHz wireless OEM module, xstreammanual.pdf.
[12] Microsoft OnNow power management architecture, whdc/hwdev/tech/onnow/onnowapp Print.mspx.
[13] Realtek ISA full-duplex Ethernet controller RTL8019AS, ftp:// /cn/nic/rtl8019as/spec-8019as.zip.
[14] Sha, L., Rajkumar, R., and Lehoczky, J.P., Priority Inheritance Protocols: An Approach to Real-Time Synchronization, IEEE Transactions on Computers, 1990.
[15] SimpleTech compact flash card, prox.php.
[16] SST multi-purpose flash SST39LF020, datasheet/s71150.pdf.
[17] IBM Microdrive, DSCM F532791CA062C38F87256AC00060DD49 /file/ibm md datasheet.pdf.
[18] Swaminathan, V., Chakrabarty, K., and Iyengar, S.S., Dynamic I/O Power Management for Hard Real-Time Systems, Proceedings of the Ninth International Symposium on Hardware/Software Codesign, April 2001, Copenhagen, Denmark.
[19] Swaminathan, V., and Chakrabarty, K., Pruning-Based Energy-Optimal Device Scheduling for Hard Real-Time Systems, Proceedings of the Tenth International Symposium on Hardware/Software Codesign, 2002.
[20] Swaminathan, V., and Chakrabarty, K., Energy-Conscious, Deterministic I/O Device Scheduling in Hard Real-Time Systems, IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems, vol. 22, July 2003.

[21] Swaminathan, V., and Chakrabarty, K., Pruning-Based, Energy-Optimal, Deterministic I/O Device Scheduling for Hard Real-Time Systems, ACM Transactions on Embedded Computing Systems, 4(1), February 2005.
[22] Tia, T.S., Utilizing Slack Time for Aperiodic and Sporadic Requests Scheduling in Real-Time Systems, Ph.D. thesis, University of Illinois at Urbana-Champaign, Department of Computer Science, 1995.
[23] Weiser, M., Welch, B., Demers, A.J., and Shenker, S., Scheduling for Reduced CPU Energy, Operating Systems Design and Implementation, 1994.

A Appendix

In this appendix, we provide the algorithms used to compute the device access delay and the device dependent system slack. To reduce computational overhead, the original priorities of all jobs are computed offline. We assume that there are N jobs in the hyperperiod and that all jobs in a hyperperiod are ordered by their priorities: Org_Prio(J_1) > Org_Prio(J_2) > Org_Prio(J_3) > ... > Org_Prio(J_N). Without loss of generality, suppose that the current job of each task T_i at time t is J_ci, whose absolute deadline is denoted D(J_ci). Jobs J_c1 and J_cn have the earliest deadline D(J_c1) and the latest deadline D(J_cn) among all current jobs, respectively. In Appendix A.1 and Appendix A.2, we show that updating the device access delay and the device dependent system slack can be done by examining all jobs with priorities between Org_Prio(J_c1) and Org_Prio(J_cn). J_c1 and J_cn change at runtime, so the set of jobs included in the computations behaves like a sliding window, called the computation window hereafter. For example, suppose a system consists of two tasks, T_1 = {10, 2, ∅, ∅} and T_2 = {24, 2, ∅, ∅}; then the computation window at time 0 is {J_{1,1}, J_{1,2}, J_{2,1}}. When J_{1,1} completes at time 2, the computation window becomes {J_{1,2}, J_{2,1}}.

A.1. Device access delay

As defined in Definition 4.4, the device access delay for a device λ_k is the time during which jobs requiring λ_k cannot execute because of the execution of higher priority jobs that do not need λ_k. The computation requires knowledge of actual job execution times, which are unknown a priori; therefore, WCETs are used in the computation as an approximation. The device access delay for all devices at time t can be computed by constructing the schedule of all current jobs from time t using their WCETs. The device access delay for a device λ_k at time t, i.e., DevAccessDelay(λ_k, t), can then be read from that schedule.
For example, if there is at least one uncompleted job requiring λ_k that was released at or before time t, then DevAccessDelay(λ_k, t) = t′ − t, where t′ is the first time instant after t at which any job requiring λ_k occupies the CPU in the schedule. However, computing the device access delay in this way incurs significant overhead. To reduce the computational overhead, we adopt a simplified algorithm in which we only consider higher priority jobs that have been released and cannot be blocked by any current job requiring λ_k. The pseudocode for the simplified algorithm to compute the device access delay for a device λ_k is shown in Figure 10. An optimized algorithm with lower computational complexity is shown in Figure 11.
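The simplified computation reduces to summing the remaining WCETs of released jobs whose priority exceeds that of the first device request job; a sketch, in which the job records and the priority encoding (larger number means higher priority) are illustrative:

```python
# Sketch of the simplified device-access-delay idea: every released job with
# priority above the first device request job's priority contributes its
# remaining WCET to the delay. Job records are illustrative.

def dev_access_delay(jobs, first_req_prio, t):
    """jobs: list of dicts with 'prio', 'release', 'wcet', 'executed'."""
    delay = 0
    for j in jobs:
        if j["prio"] > first_req_prio and j["release"] <= t:
            delay += j["wcet"] - j["executed"]  # remaining WCET delays λ_k's jobs
    return delay

jobs = [
    {"prio": 3, "release": 0, "wcet": 4, "executed": 1},  # higher priority, released
    {"prio": 2, "release": 6, "wcet": 4, "executed": 0},  # not yet released at t=5
    {"prio": 1, "release": 0, "wcet": 4, "executed": 0},  # the job requiring λ_k
]
print(dev_access_delay(jobs, first_req_prio=1, t=5))  # 3
```

Only the first job counts: the second has not been released by t = 5, and the third is the device-requesting job itself.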

1  Function DevAccessDelay(λ_k, t)
2    Output: (1) the device access delay for λ_k, i.e., DevAccessDelay(λ_k, t); (2) the first device request job, i.e., J_{λ_k,t}.
3    sum ← 0;   // records the device access delay for device λ_k
4    // α denotes the set of jobs that require device λ_k
5    J_x ← null;   // the job with the highest priority among all jobs in α
6    J_y ← null;   // the job with the highest priority among all jobs not in α that can be blocked by some job in α
7    ∀J_i, Org_Prio(J_cn) ≤ Org_Prio(J_i) ≤ Org_Prio(J_c1)   // examine each job within the computation window
8      If (λ_k ∈ Dev(J_i) and Org_Prio(J_i) > Org_Prio(J_x))
9        J_x ← J_i;
10       // σ(λ_k, t) denotes the highest preemption ceiling of resources being held by any job requiring λ_k
11     Else If (λ_k ∉ Dev(J_i) and Org_Prio(J_i) > Org_Prio(J_y) and Res(J_i) ≠ ∅ and PL(J_i) ≤ σ(λ_k, t) and et(J_i, [0, t]) = 0)
12       J_y ← J_i;
13     End
14   End
15   J_{λ_k,t} ← J_i, J_i ∈ {J_x, J_y} and Org_Prio(J_i) = max(Org_Prio(J_x), Org_Prio(J_y));
16   ∀J_i, Org_Prio(J_cn) ≤ Org_Prio(J_i) ≤ Org_Prio(J_c1)
17     If (Org_Prio(J_i) > Org_Prio(J_{λ_k,t}) and R(J_i) ≤ t)
18       sum ← sum + wcet(J_i) − et(J_i, [0, t]);
19     End
20   End
21   DevAccessDelay(λ_k, t) ← sum;
22 End

Figure 10. The pseudocode for the simplified DevAccessDelay algorithm. The algorithm is reformatted for readability. The optimized algorithm that reduces computational overhead is shown in Figure 11.

The DevAccessDelay algorithm works as follows. Suppose α is the set of jobs that require λ_k, J_x is the job with the highest priority in α, and J_y is the highest priority job among all jobs that can possibly be blocked by some job(s) in α.
Then the device access delay for device λ_k consists of the remaining WCETs of all jobs J_i, Org_Prio(J_cn) ≤ Org_Prio(J_i) ≤ Org_Prio(J_c1), satisfying the following two conditions: (1) Org_Prio(J_i) > Org_Prio(J_x) and Org_Prio(J_i) > Org_Prio(J_y); and (2) the release time of J_i is less than or equal to the current time t, which ensures that J_i can occupy the CPU before any job in α. In the DevAccessDelay algorithm, the computation is done by examining all jobs that have higher priorities than J_x and J_y. Intuitively, any job that has a higher priority than both J_x and J_y can delay the execution of any job requiring device λ_k once it is released. In the remainder of this paper, the term first device request job refers to the higher-priority job of J_x and J_y, defined as follows.

Definition A.1. First device request job. The first device request job of a device λ_k at time t, denoted J_{λ_k,t}, is computed by

    J_{λ_k,t} = J_x if Org_Prio(J_x) ≥ Org_Prio(J_y); J_y if Org_Prio(J_x) < Org_Prio(J_y),    (8)

where J_x is the job with the highest priority of all current jobs requiring device λ_k, and J_y is the job with the

highest priority of all jobs that (1) have not started execution and do not require device λ_k; (2) require shared resources; and (3) have preemption levels lower than or equal to the preemption ceiling of some resource being held by a job requiring device λ_k.

1  Function DevAccessDelay()
2    // Update the device access delay and the first device request job for every device λ_k, 1 ≤ k ≤ K
3    sum ← 0;   // records the device access delay
4    ToCompute ← ~(0);   // ToCompute is a bit-array indicating the devices whose computation is not yet completed
5    HoldingResJob ← Head(HoldingResJobList);   // HoldingResJobList is a list of jobs that are holding resources
6    ∀J_i, J_i ∈ J_c1 : J_cn   // browse the computation window from the highest priority job to the lowest priority job
7      D ← ToCompute & DevBits(J_i);   // D is the set of devices that still need to be computed and are required by J_i
8      ∀λ_k, λ_k ∈ D
9        J_{λ_k,t} ← J_i;   // J_i is the first device request job for λ_k
10       DevAccessDelay(λ_k, t) ← sum;
11       ToCompute ← ToCompute & (~(1 << (k − 1)));
12     End
13     If (Res(J_i) ≠ ∅ and et(J_i, [0, t]) = 0)
14       While (PL(J_i) ≤ σ(HoldingResJob))   // σ(J_x) is the highest preemption ceiling of resources being held by J_x
15         D ← ToCompute & DevBits(HoldingResJob);
16         ∀λ_k, λ_k ∈ D
17           J_{λ_k,t} ← J_i;
18           DevAccessDelay(λ_k, t) ← sum;
19           ToCompute ← ToCompute & (~(1 << (k − 1)));
20         End
21         HoldingResJob ← Next(HoldingResJob);
22       End
23     End
24     If (ToCompute > 0 and R(J_i) ≤ t)
25       sum ← sum + wcet(J_i) − et(J_i, [0, t]);
26     Else If (ToCompute = 0)
27       break;   // break the loop
28     End
29   End

Figure 11. The pseudocode for the optimized DevAccessDelay algorithm.

The DevAccessDelay algorithm

Figure 10 shows the simplified algorithm for computing the device access delay of one device at time t. In practice, the device access delay for all devices is computed over the same computation window, so the computational complexity can be reduced by combining common computations.
Figure 11 shows the algorithm that updates the device access delay for all devices at time t. In this algorithm, three data structures are used to facilitate the computation:

1. ToCompute. ToCompute is a bit-array representing the devices whose device access delay still needs to be computed. For example, suppose the total number of devices is 8. The initial value of ToCompute is set to 11111111 (line 4), indicating that the device access delay for all devices needs to be computed.

2. DevBits. DevBits is a bit-array representing the devices required by each job. For example, suppose job J_1 requires devices λ_1, λ_3 and λ_4; then DevBits(J_1) is 00001101.

3. HoldingResJobList. HoldingResJobList is a list of jobs that are holding resources at time t. HoldingResJob is initialized to the first job in HoldingResJobList (line 5), and σ(HoldingResJob) denotes the highest preemption level ceiling of the resources being held by HoldingResJob (line 14). Suppose the jobs in HoldingResJobList are J_b1, J_b2, ..., J_bm, with σ(J_b1) > σ(J_b2) > ... > σ(J_bm). Then a job J_i that requires some resources can start its execution only when its preemption level is higher than σ(J_b1). If a resource r_i is allocated to J_i at time t, then σ(J_i) > σ(J_b1) if J_i ≠ J_b1, so J_i is placed at the head of HoldingResJobList at time t. With SRP, HoldingResJobList works like a stack: a new job always joins HoldingResJobList at the head of the list, and the job that releases a resource must also be the first job of the list. Therefore, jobs join and leave the list in FILO (First In Last Out) order. For a system of n tasks, the length of the list is at most n, and the runtime maintenance of HoldingResJobList takes O(1) time.

The computation is done by examining each job J_i in the computation window (line 6) from the highest priority job to the lowest priority job. For any device λ_k that is required by J_i and whose computation is not yet completed (line 7), J_i is the first device request job of λ_k (lines 8-12), according to Definition A.1. If J_i can be blocked by some jobs requiring devices (lines 13-14) and the computations for these devices are not completed, then J_i is the first device request job of these devices (lines 16-20), according to Definition A.1. HoldingResJob is then advanced to the next job in HoldingResJobList (line 21).
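The ToCompute and DevBits bookkeeping above (lines 4, 7, and 11 of Figure 11) is ordinary bitmask manipulation; the following Python fragment is purely illustrative, with hypothetical device indices, and is not taken from the authors' implementation.

```python
K = 8  # total number of devices

# Line 4 of Figure 11: initially every device still needs its access
# delay computed; for K = 8 this is 0b11111111.
to_compute = (1 << K) - 1

# DevBits(J_i): one bit per required device, bit (k - 1) for device k.
# Suppose a (hypothetical) job requires devices 1, 3, and 4.
dev_bits = (1 << 0) | (1 << 2) | (1 << 3)  # 0b00001101

# Line 7: devices required by the job AND still uncomputed.
d = to_compute & dev_bits

# Line 11: mark each device in d as computed by clearing its bit.
for k in range(1, K + 1):
    if d & (1 << (k - 1)):
        to_compute &= ~(1 << (k - 1))

print(bin(to_compute))  # devices 1, 3 and 4 cleared
```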
sum records the cumulative unused worst case execution time of all released jobs (line 25). Once the computations for all devices are done (line 26), the DevAccessDelay algorithm is completed (line 27). Note that only a lower priority job can block a higher priority job. However, the DevAccessDelay algorithm does not compare job priorities (line 14), because if HoldingResJob has a higher priority than J_i, then the device set D (line 15) must be empty at this point. The worst case computational complexity of this algorithm is O(m + n + K), where m is the number of jobs in the computation window, n is the total number of tasks in the system, and K is the total number of devices: lines 7, 13 and 24-28 execute at most m times; lines 8-12 and lines 16-20 execute at most K times, since each execution of these lines completes the computation for at least one device; and lines 14, 15, 21 and 22 execute at most n times, since the maximum length of HoldingResJobList is n.
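The main delay-accumulation loop of Figure 11 can be sketched as follows. This is a minimal sketch that deliberately omits the resource-blocking branch (lines 13-22); the job fields and helper names are ours, not the authors'.

```python
from dataclasses import dataclass

@dataclass
class Job:
    wcet: int        # worst case execution time
    executed: int    # et(J_i, [0, t])
    released: bool   # R(J_i) <= t
    dev_bits: int    # DevBits(J_i)

def dev_access_delay(window, K):
    """window: jobs ordered from highest to lowest priority.
    Returns (delay, first_job) per device index, following lines 6-12
    and 24-28 of Figure 11 (resource blocking omitted)."""
    delay, first_job = {}, {}
    to_compute = (1 << K) - 1
    total = 0  # 'sum' in the pseudocode
    for i, job in enumerate(window):
        d = to_compute & job.dev_bits
        for k in range(1, K + 1):
            if d & (1 << (k - 1)):
                first_job[k] = i          # first device request job
                delay[k] = total          # DevAccessDelay(λ_k, t)
                to_compute &= ~(1 << (k - 1))
        if to_compute == 0:
            break                         # all devices computed
        if job.released:
            total += job.wcet - job.executed  # unused WCET of J_i

    return delay, first_job

# Hypothetical two-job window: the higher priority job requires device 1
# (wcet 3, 1 unit already executed), the lower requires device 2.
window = [Job(wcet=3, executed=1, released=True, dev_bits=0b01),
          Job(wcet=4, executed=0, released=True, dev_bits=0b10)]
delays, first_jobs = dev_access_delay(window, K=2)
```

Device 2 inherits a delay of 2 time units, the remaining WCET of the higher priority job, matching the role of sum at line 25.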

A.2. Device dependent system slack

Before presenting the algorithm to compute the device dependent system slack, we first introduce several concepts used in the computation.

Definition A.2. Initial Job Slack. The initial slack of a job J_k, at time t = 0, is denoted JobSlack(J_k, 0) and is computed by subtracting from the total time available to execute J_k the total time required to execute J_k and all periodic requests with higher priority, together with the maximum blocking duration for J_k. That is, the slack of a job J_k at t = 0 is given by

JobSlack(J_k, 0) = D(J_k) − Σ_{D(J_i) ≤ D(J_k)} wcet(J_i) − B(J_k)    (9)

where D(J_k) is the absolute deadline of J_k, and B(J_k) is the maximal blocking time for J_k, as discussed in Section 4.1.

Definition A.3. Job Slack. The slack of a job J_k at time t, t > 0, is denoted JobSlack(J_k, t). JobSlack(J_k, t) decreases as it is consumed by CPU idling and by the execution of lower priority jobs, and increases as jobs complete sooner than their WCETs. That is, the job slack of a job J_k at time t is given by

JobSlack(J_k, t) = JobSlack(J_k, 0) − Idle(0, t) − Σ_{D(J_i) > D(J_k), R(J_i) < t} et(J_i, [0, t]) + Σ_{D(J_i) ≤ D(J_k)} U_rem(J_i)    (10)

where R(J_i) is the release time of job J_i; Idle(0, t) is the amount of time the CPU has been idle up to t; and Σ_{D(J_i) > D(J_k), R(J_i) < t} et(J_i, [0, t]) is the amount of time that jobs with deadlines greater than D(J_k) have executed up to t (such jobs must have been released before t). Thus Idle(0, t) + Σ_{D(J_i) > D(J_k), R(J_i) < t} et(J_i, [0, t]) is the total amount of slack consumed up to t. Σ_{D(J_i) ≤ D(J_k)} U_rem(J_i) is the amount of unused WCET of completed jobs with deadlines equal to or less than D(J_k), which is reclaimed as job slack for J_k. Intuitively, the job slack of a job J_i at time t is the maximum amount of time that the CPU can be idle at time t without causing J_i itself to miss its deadline.
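Equations (9) and (10) can be exercised on a toy job; the numbers below are illustrative only and are not taken from the paper's examples.

```python
def initial_job_slack(deadline, wcets_at_or_before, blocking):
    """Equation (9): JobSlack(J_k, 0) = D(J_k) minus the sum of wcet(J_i)
    over all jobs with D(J_i) <= D(J_k), minus the blocking B(J_k)."""
    return deadline - sum(wcets_at_or_before) - blocking

def job_slack(initial, idle, later_deadline_exec, reclaimed):
    """Equation (10): slack is consumed by CPU idling and by execution of
    jobs with later deadlines, and replenished by unused WCETs of
    completed jobs with deadlines <= D(J_k)."""
    return initial - idle - later_deadline_exec + reclaimed

# A hypothetical job with deadline 20, WCET demand {3, 4} at or before
# its deadline, and maximal blocking 2:
s0 = initial_job_slack(20, [3, 4], 2)                 # 20 - 7 - 2
# After 5 units of idling, 1 unit of later-deadline execution, and 2
# units of reclaimed unused WCET:
st = job_slack(s0, idle=5, later_deadline_exec=1, reclaimed=2)
```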
Suppose β = {J_j | Org_Prio(J_j) ≤ Org_Prio(J_i)}. Then minslack = min(JobSlack(J_j, t)), J_j ∈ β, is the maximum amount of time that the CPU can be idle at time t without causing any job with priority equivalent to or lower than Org_Prio(J_i) to miss its deadline. If the idle time is inserted only before the execution of the jobs in β, without delaying the execution of higher priority jobs, then minslack becomes the maximum amount of time that the CPU can be idle before the execution of any job in β without causing any job to miss its deadline. As shown in Figure 5(a), JobSlack(J_2,1, 0) = JobSlack(J_1,3, 0) = 14. If an idle interval of 14 time units is inserted at time 0, i.e., the CPU is idle during [0, 14], then J_1,1 will miss its deadline. However, if J_1,1 preempts the idle interval as shown in Figure 5(a), then every job meets its deadline, and the total idle time before the execution of J_2,1 and J_1,3 is still 14 time units.

However, the above discussion holds only in the absence of resource blocking. If a job J_j ∈ β is holding a shared resource and thus may block a higher priority job J_y ∉ β, then an idle interval inserted before the execution of J_j can delay the execution of J_y and of all jobs with priority lower than J_y. In this case, min(JobSlack(J_j, t)), ∀J_j, Org_Prio(J_j) ≤ Org_Prio(J_y), is the maximum amount of time that the CPU can be idle before the execution of any job in β without causing any job to miss its deadline.

Figure 12. λ ∈ Dev(J_3); t_sd(λ) = t_wu(λ) = 4; D(J_1) = 10; D(J_2) = 16; D(J_3) = 24. At time 4, JobSlack(J_2, 4) = 4 and JobSlack(J_3, 4) = 8. The shaded regions represent shared resource access. In this example, if the CPU is idle for 8 time units before the execution of J_3, then J_2 misses its deadline because it is blocked by J_3. Therefore, J_{λ,4} = J_2 and DevDepSysSlack(λ, 4) = JobSlack(J_2, 4) = 4.

As shown in Figure 12, J_3 can block J_2, and thus the execution of J_2 depends on the execution of J_3. As a result, the maximum amount of time that the CPU can be idle before the execution of J_3 is the job slack of J_2. Note that EASD schedules jobs according to EDF and SRP. If there is a released higher priority job in the system, a lower priority job can execute only if it is the current ceiling task and blocks all released higher priority jobs. In other words, the EASD algorithm does not allow low priority jobs to utilize the idle intervals caused by device transitions for high priority jobs. The reason is that allowing job executions to be reordered may cause unexpected blocking and thus jeopardize temporal correctness; for example, a job could be blocked more than once with job reordering.
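The Figure 12 example reduces to taking a minimum over the job slacks of the first device request job and all lower priority jobs (Equation (11)); a minimal sketch using the figure's numbers, where the helper name is ours:

```python
def dev_dep_sys_slack(job_slacks, first_req_index):
    """Equation (11): the minimum JobSlack(J_x, t) over all jobs with
    priority equivalent to or lower than that of the first device
    request job. job_slacks is ordered from highest to lowest priority,
    and first_req_index locates the first device request job in it."""
    return min(job_slacks[first_req_index:])

# Figure 12, at time t = 4: JobSlack(J_2, 4) = 4, JobSlack(J_3, 4) = 8,
# and J_2 is the first device request job for λ since J_3 can block it.
slack = dev_dep_sys_slack([4, 8], first_req_index=0)
```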
Recall that the device dependent system slack is defined to be the maximum amount of time that the CPU can be idle before the execution of any job requiring device λ_k without causing any job to miss its deadline. It is now clear that the device dependent system slack is the maximum amount of idle time that can be inserted before the execution of J_{λ_k,t} and all lower priority jobs, which is given by

DevDepSysSlack(λ_k, t) = min(JobSlack(J_x, t)), ∀J_x, Org_Prio(J_x) ≤ Org_Prio(J_{λ_k,t})    (11)

To compute DevDepSysSlack(λ_k, t) for each device, we need to compute the job slack of all uncompleted jobs. The complexity of this computation would be O(N), where N is the number of jobs in a hyperperiod. To reduce the computational overhead, the initial slack of all jobs in a hyperperiod is computed offline and kept in a job slack list ordered by deadlines. Let MinInitSlack(J_k) denote the minimal initial job slack over all jobs with original priorities equivalent to or lower than Org_Prio(J_k), which is given by

MinInitSlack(J_k) = min(JobSlack(J_x, 0)), ∀J_x, Org_Prio(J_x) ≤ Org_Prio(J_k)

Then the minimum job slack of J_cn and all jobs with priorities lower than Org_Prio(J_cn) is MinInitSlack(J_cn) + Σ U_rem(J_i) − Idle(0, t), where J_i ranges over the jobs completed at or before time t. Note that J_cn has a deadline no less than that of any completed job. In this way, the computation of the device dependent system slack can be done by examining each job in the computation window. The simplified algorithm for the computation of the device dependent system slack is presented in Figure 13.

 1  Function DevDepSysSlack(λ_k, t)
 2  Output: the device dependent system slack for λ_k at time t.
 3  MinSlack ← +∞;  // records the minimal dynamic job slack
 4  ∀ J_x, Org_Prio(J_c1) ≥ Org_Prio(J_x) ≥ Org_Prio(J_cn)  // each job in the computation window
 5      JobSlack(J_x, t) ← JobSlack(J_x, 0) − Idle(0, t) − Σ_{D(J_i) > D(J_x), R(J_i) < t} et(J_i, [0, t]) + Σ_{D(J_i) ≤ D(J_x)} U_rem(J_i);
 6      If (Org_Prio(J_x) ≤ Org_Prio(J_{λ_k,t}) and JobSlack(J_x, t) < MinSlack)
 7          MinSlack ← JobSlack(J_x, t);
 8      End
 9  End
10  DevDepSysSlack(λ_k, t) ← Min(MinSlack, MinInitSlack(J_cn) + Σ_{completed J_i} U_rem(J_i) − Idle(0, t));

Figure 13. The pseudocode for the simplified DevDepSysSlack algorithm. An algorithm with lower computational complexity is presented in Figure 15.
A detailed algorithm is presented next.

The DevDepSysSlack algorithm. We are now ready to describe the DevDepSysSlack algorithm. As discussed before, our method involves an off-line phase, in which the initial slack of all jobs in a hyperperiod is computed and kept in a job slack list ordered by priorities. Each entry of the job slack list contains a job's ID, say J_k; the corresponding initial job slack, JobSlack(J_k, 0); and MinInitSlack(J_k). An example of the job slack list of a task set is shown in Figure 14.
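The offline precomputation of MinInitSlack is a running minimum taken over the priority-ordered job slack list, i.e., a suffix minimum; a minimal sketch, with hypothetical slack values:

```python
def min_init_slacks(initial_slacks):
    """Given JobSlack(J_x, 0) for the jobs of a hyperperiod, ordered from
    highest to lowest priority, return MinInitSlack for each entry: the
    minimum initial slack over that job and all lower priority jobs."""
    out = [0] * len(initial_slacks)
    running = float("inf")
    # Walk from the lowest priority job upward, keeping a running minimum.
    for i in range(len(initial_slacks) - 1, -1, -1):
        running = min(running, initial_slacks[i])
        out[i] = running
    return out

# Hypothetical initial slacks, highest priority first:
mins = min_init_slacks([7, 5, 9, 6])
```

Storing this suffix minimum with each list entry is what lets the online algorithm summarize all jobs below the computation window in O(1).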

Figure 14. An example of a job slack list: J_1,1, J_2,1, J_3,1, J_1,2, J_2,2, J_1,3, J_3,2, J_1,4, J_2,3, J_1,5, J_3,3, J_2,4, J_1,6. Jobs are ordered by priorities. T_1 = {10, 3, Dev(T_1), Res(T_1)}; T_2 = {15, 4, Dev(T_2), Res(T_2)}; T_3 = {20, 5, Dev(T_3), Res(T_3)}.

The DevDepSysSlack algorithm contains two parts: (1) update the job slack of all jobs in the computation window; and (2) update the device dependent system slack of all devices. The algorithm is invoked at the instants when the currently executing job is completed (line 7) or is preempted (line 29). After the job slack of all jobs in the computation window has been updated, the device dependent system slack of all devices can be acquired (line 39), according to Equation (11). The computation of job slack is done by examining each job in the computation window from the lowest priority job to the highest priority job (lines 16, 30). Thus the computational complexity of updating the job slack of all jobs in the computation window is O(m), where m is the number of jobs in the computation window. The computational complexity of updating the device dependent system slack of all devices is O(K), where K is the total number of devices. Therefore, the computational complexity of the DevDepSysSlack algorithm is O(m + K).

 1  Function DevDepSysSlack()
 2  Initialize at time t = 0:
 3      c ← 0;          // c records the cumulative unused WCET of all completed jobs
 4      t′ ← 0;         // t′ is the last instance at which the algorithm was invoked
 5      MinSlack ← +∞;  // minimum job slack of all jobs examined so far
 6  Update dynamic job slack at time t:
 7  If (t is an instance at which job J_i,j is completed)
 8      c ← c + wcet(J_i,j) − et(J_i,j, [0, t]);
 9      If (J_i,j = J_c1)  // need to update J_c1
10          J_c1 ← the second job in the computation window;
11      End
12      If (Org_Prio(J_i,j+1) < Org_Prio(J_cn))  // need to update J_cn
13          J′_cn ← J_cn;  // J′_cn records the old J_cn
14          J_cn ← J_i,j+1;
15      End
16      ∀ J_x, J_x ∈ J_cn : J_c1  // browse the computation window from lowest to highest priority
17          If (D(J_x) < D(J_i,j))  // the execution time of J_i,j is not included in the initial job slack of J_x
18              JobSlack(J_x, t) ← JobSlack(J_x, t′) − et(J_i,j, [t′, t]);
19          Else If (D(J_i,j) ≤ D(J_x) ≤ D(J′_cn))
20              JobSlack(J_x, t) ← JobSlack(J_x, t′) + wcet(J_i,j) − et(J_i,j, [0, t]);  // reclaim the unused WCET of J_i,j
21          Else  // D(J′_cn) < D(J_x)
22              JobSlack(J_x, t) ← JobSlack(J_x, 0) + c − Idle(0, t);  // the dynamic job slack of J_x at time t
23          End
24          MinSlack ← Min(MinSlack, JobSlack(J_x, t), MinInitSlack(J_cn) + c − Idle(0, t));
25          J_x.MinSlack ← MinSlack;  // minimum job slack of all jobs with equivalent or lower priorities
26      End
27      Remove J_i,j from the job slack list; t′ ← t;
28  End
29  If (t is an instance at which job J_i,j is preempted)  // J_i,j can be the idle job: D(idle job) = +∞, et(idle job, [t′, t]) = Idle(t′, t)
30      ∀ J_x, J_x ∈ J_cn : J_c1  // browse the computation window from lowest to highest priority
31          If (D(J_x) < D(J_i,j))
32              JobSlack(J_x, t) ← JobSlack(J_x, t′) − et(J_i,j, [t′, t]);
33              MinSlack ← Min(MinSlack, JobSlack(J_x, t), MinInitSlack(J_cn) + c − Idle(0, t));
34              J_x.MinSlack ← MinSlack;
35          End
36      End
37      t′ ← t;
38  End
39  Update device dependent system slack:
40  ∀ λ_k, DevDepSysSlack(λ_k, t) ← J_{λ_k,t}.MinSlack;

Figure 15. The pseudocode for the DevDepSysSlack algorithm.

B Appendix

This section shows that Theorem 4.1 holds for EASD. With the EASD algorithm, a device is switched to the idle state when its device slack is larger than its break-even time. Therefore, there might be intervals in which the CPU is idle while pending jobs wait for required devices to be switched back to the active state. As discussed before, these idle intervals are device dependent system slack; an example is shown in Figure 5(a). We first consider the relationship between the device slack at time t and the device slack at time t + 1. The device slack at time t means the device slack at the time instant t, while the time unit t means the duration [t, t + 1). Suppose device λ_inact is not active during time unit t (including the case that λ_inact begins its state transition from the active state to the idle state at time t). With the following lemmas, we show that the device slack of this device at time t + 1 is at most 1 time unit less than the device slack at time t. Let α denote the set of all uncompleted jobs that require device λ_inact, and let J_exec be the job that occupies the CPU during time unit t. Therefore, J_exec ∉ α.

Lemma B.1. The first device request job for device λ_inact is the same at time t and at time t + 1. That is, J_{λ_inact,t} = J_{λ_inact,t+1}.

Proof: Suppose that J_x is the highest priority job in α, and J_y is the highest priority job among all jobs that do not require λ_inact and can possibly be blocked by some job(s) in α at time t. According to Definition A.1, J_{λ_inact,t} is either J_x or J_y. Firstly, J_exec ≠ J_x because J_x ∈ α and J_exec ∉ α; thus J_x is still the highest priority job in α at time t + 1. Secondly, J_exec ≠ J_y because J_exec is not blocked by any job. Since J_exec ∉ α, no new resources are acquired by jobs in α, nor are resources held by jobs in α released.
Therefore, J_y is still the highest priority job among all jobs that do not require λ_inact and can possibly be blocked by some job(s) in α at time t + 1. Therefore, J_{λ_inact,t} = J_{λ_inact,t+1}.

Lemma B.2. The device access delay and the device dependent system slack for device λ_inact cannot decrease at the same time. That is, DevAccessDelay(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) or DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t).

Proof: Suppose that DevAccessDelay(λ_inact, t+1) < DevAccessDelay(λ_inact, t). This means that J_exec has a higher priority than all jobs in α, and that the WCET of J_exec is included in DevAccessDelay(λ_inact, t). It follows

that D(J_exec) ≤ D(J_{λ_inact,t}). Thus the execution time of J_exec has already been subtracted from the dynamic job slack of J_{λ_inact,t} and of all lower priority jobs, and these slacks cannot decrease in this case. With Lemma B.1, we have J_{λ_inact,t+1} = J_{λ_inact,t}. Therefore, the device dependent system slack for device λ_inact does not decrease when the device access delay decreases.

Lemma B.3. The device dependent system slack decreases by at most 1 per time unit. That is, DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t) − 1.

Proof: From Lemma B.1, we know that J_{λ_inact,t+1} = J_{λ_inact,t}. The device dependent system slack at time t + 1 is the minimum dynamic job slack of all jobs with priorities equivalent to or lower than Org_Prio(J_{λ_inact,t+1}). The dynamic job slack of any job can decrease by at most one during a time unit. It follows that DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t) − 1.

Lemma B.4. DevAccessDelay(λ_inact, t+1) + DevDepSysSlack(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) + DevDepSysSlack(λ_inact, t) − 1.

Proof: We show this in the following two cases: (1) the WCET of J_exec is not included in DevAccessDelay(λ_inact, t); and (2) the WCET of J_exec is included in DevAccessDelay(λ_inact, t).

Case 1: The WCET of J_exec is not included in DevAccessDelay(λ_inact, t). In this case, DevAccessDelay(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t). Also DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t) − 1, according to Lemma B.3. Therefore, DevAccessDelay(λ_inact, t+1) + DevDepSysSlack(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) + DevDepSysSlack(λ_inact, t) − 1.

Case 2: The WCET of J_exec is included in DevAccessDelay(λ_inact, t). In this case, if J_exec is not completed at time t + 1, then DevAccessDelay(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) − 1. According to Lemma B.2, the device dependent system slack cannot decrease at the same time as the device access delay. Therefore, Lemma B.4 is true in this case.
On the other hand, if J_exec is completed at time t + 1, then DevAccessDelay(λ_inact, t+1) ≥ DevAccessDelay(λ_inact, t) − (wcet(J_exec) − et(J_exec, [0, t+1])) − 1. The unused WCET of J_exec, i.e., wcet(J_exec) − et(J_exec, [0, t+1]), becomes additional job slack for J_{λ_inact,t} and all lower priority jobs. Therefore, DevDepSysSlack(λ_inact, t+1) ≥ DevDepSysSlack(λ_inact, t) + (wcet(J_exec) − et(J_exec, [0, t+1])). It follows that Lemma B.4 is true in this case, too.
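The bounds of Lemmas B.4 and B.5 combine into the device slack bound of Lemma B.6 below, since the device slack is the larger of TimeToNextDevReq and DevAccessDelay + DevDepSysSlack. A small numeric illustration, with hypothetical values:

```python
def dev_slack(time_to_next_req, access_delay, dep_sys_slack):
    """Device slack: the larger of TimeToNextDevReq and
    DevAccessDelay + DevDepSysSlack (recalled in Lemma B.6)."""
    return max(time_to_next_req, access_delay + dep_sys_slack)

# At time t: next request in 6 units, access delay 2, dependent slack 5.
s_t = dev_slack(6, 2, 5)
# One time unit later each quantity in the max may drop, but the request
# distance drops by at most 1 (Lemma B.5) and the delay+slack sum drops
# by at most 1 (Lemma B.4), so the device slack drops by at most 1:
s_t1 = dev_slack(5, 2, 4)
```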

Lemma B.5. The time to the next device request at time t + 1 is not less than the time to the next device request at time t minus 1. That is, TimeToNextDevReq(λ_inact, t+1) ≥ TimeToNextDevReq(λ_inact, t) − 1.

Proof: By Definition 4.2, TimeToNextDevReq(λ_inact, t) = NextDevReqTime(λ_inact, t) − t. Since the release time of any job is fixed, we have NextDevReqTime(λ_inact, t+1) ≥ NextDevReqTime(λ_inact, t). Therefore, TimeToNextDevReq(λ_inact, t+1) ≥ TimeToNextDevReq(λ_inact, t) − 1.

Lemma B.6. The device slack of device λ_inact at time t + 1 is not less than the device slack of the device at time t minus 1. That is, DevSlack(λ_inact, t+1) ≥ DevSlack(λ_inact, t) − 1.

Proof: Recall that the device slack of device λ_inact at time t is the larger of TimeToNextDevReq(λ_inact, t) and DevAccessDelay(λ_inact, t) + DevDepSysSlack(λ_inact, t). The correctness of Lemma B.6 then follows directly from Lemma B.4 and Lemma B.5.

We have now shown that the device slack of an inactive device decreases by at most 1 per time unit. We provide one more lemma before the proof of Theorem 4.1.

Lemma B.7. At any time t, jobs with equal deadlines have the same dynamic job slack. That is, JobSlack(J_i, t) = JobSlack(J_j, t), ∀J_i, J_j, D(J_i) = D(J_j).

Proof: According to Definition A.2, JobSlack(J_i, 0) = JobSlack(J_j, 0), ∀J_i, J_j, D(J_i) = D(J_j). Moreover, the update of dynamic job slack depends only on a job's deadline, as shown in Equation (10); that is, the same amount of slack is added to or subtracted from all jobs with equal deadlines at any time t. Therefore, JobSlack(J_i, t) = JobSlack(J_j, t), ∀J_i, J_j, D(J_i) = D(J_j).

Proof of Theorem 4.1: Assume Equation (5) holds but a job misses its deadline when scheduled with EASD. Let J_k be the first job that misses its deadline D(J_k), and let t_0 be the last time before D(J_k) such that there are no pending jobs with release times before t_0 and deadlines before or at D(J_k).
Since no job can be released before the system start time, t_0 is well defined. Let ρ be the set of jobs that are released in [t_0, D(J_k)] and have deadlines in [t_0, D(J_k)]. By the choice of t_0 and D(J_k), the jobs that execute in [t_0, D(J_k)] are the jobs in ρ and possibly one job that blocks a job in ρ. Since there are transition delays for devices, there might be some idle periods in [t_0, D(J_k)].

First of all, there can be at most one job J_b ∉ ρ that blocks any job in ρ, and the blocking job J_b must be released before t_0 and have a deadline larger than D(J_k). This conclusion follows directly from SRP, and the proof can be found in [2]. Next, we proceed with our proof in two cases: (1) there are idle intervals during [t_0, D(J_k)]; and (2) there is no idle interval during [t_0, D(J_k)].

Case 1: There are some idle intervals during [t_0, D(J_k)]. By the choice of t_0 and D(J_k), these idle intervals arise only from jobs waiting for required devices to become active. An example is shown in Figure 5. If several devices are required at the same time, we consider the device that takes the longest time to perform the state transition. Suppose an idle interval [t, t′] is caused by the state transition delay of a device λ_k. It is obvious that TimeToNextDevReq(λ_k, t) ≤ 0 and DevAccessDelay(λ_k, t) = 0. Thus DevSlack(λ_k, t) is equal to DevDepSysSlack(λ_k, t). With the EASD algorithm, t′ − t ≤ DevSlack(λ_k, t). Moreover, DevSlack(λ_k, t′) is no less than DevSlack(λ_k, t) − (t′ − t), according to Lemma B.6. Therefore DevSlack(λ_k, t′) = DevDepSysSlack(λ_k, t′) ≥ 0.

Assume that [t, t′] is the last idle interval during [t_0, D(J_k)]. It follows that DevDepSysSlack(λ_k, t′) ≥ 0 for at least one device λ_k that is required by some job J_x at some time in [t, t′]. Now we show that J_{λ_k,t′} ∈ ρ. We discuss two cases: (i) J_x ∈ ρ; and (ii) J_x ∉ ρ.

Case i: J_x ∈ ρ. According to Definition A.1, we have Org_Prio(J_{λ_k,t′}) ≥ Org_Prio(J_x). It follows that D(J_{λ_k,t′}) ≤ D(J_x) ≤ D(J_k). Therefore, J_{λ_k,t′} ∈ ρ.

Case ii: J_x ∉ ρ. As discussed before, the only job that is not in ρ but can execute during [t_0, D(J_k)] is the blocking job J_b. Thus J_x = J_b, and at least one job in ρ, say J_y, must be blocked by J_x. According to Definition A.1, Org_Prio(J_{λ_k,t′}) ≥ Org_Prio(J_y). It follows that J_{λ_k,t′} ∈ ρ, since D(J_{λ_k,t′}) ≤ D(J_y) ≤ D(J_k).

Recall that DevDepSysSlack(λ_k, t′) = min(JobSlack(J_i, t′)), where J_i ranges over all jobs with priority equivalent to or lower than Org_Prio(J_{λ_k,t′}). Since DevDepSysSlack(λ_k, t′) ≥ 0, the job slack of J_{λ_k,t′} and of all lower priority jobs is at least 0. Next we show that JobSlack(J_k, t′) ≥ 0, where J_k is the first job that misses its deadline. We discuss two cases: (i) D(J_{λ_k,t′}) < D(J_k); and (ii) D(J_{λ_k,t′}) = D(J_k).

Case i: D(J_{λ_k,t′}) < D(J_k). In this case, JobSlack(J_k, t′) ≥ 0 because Org_Prio(J_{λ_k,t′}) > Org_Prio(J_k).

Case ii: D(J_{λ_k,t′}) = D(J_k). According to Lemma B.7, JobSlack(J_k, t′) = JobSlack(J_{λ_k,t′}, t′) ≥ 0.

According to Definition A.3, JobSlack(J_k, D(J_k)) is no less than JobSlack(J_k, t′) − B(J_k), since there are no idle intervals after time t′. Therefore, at time D(J_k),

JobSlack(J_k, D(J_k)) ≥ JobSlack(J_k, t′) − B(J_k) ≥ −B(J_k),

that is, JobSlack(J_k, D(J_k)) + B(J_k) ≥ 0. Expanding with Definitions A.2 and A.3,

JobSlack(J_k, D(J_k)) + B(J_k) = D(J_k) − Σ_{D(J_i) ≤ D(J_k)} wcet(J_i) − Idle(0, D(J_k)) − Σ_{D(J_i) > D(J_k), R(J_i) < t} et(J_i, [0, D(J_k)]) + Σ_{D(J_i) ≤ D(J_k)} U_rem(J_i) ≥ 0.

This contradicts the assumption that J_k misses its deadline at D(J_k).

Case 2: There is no idle period during [t_0, D(J_k)]. In this case, the proof is the same as the one presented in [2], and a contradiction is obtained in the same way.

Thus, in conclusion, each case leads to a contradiction of the assumption that Equation (5) holds but a job misses a deadline. Therefore, Theorem 4.1 holds for EASD.

C Appendix

This section presents a sufficient schedulability condition for the ASD algorithm. Suppose n periodic tasks are sorted by their periods, P(T_1) ≤ P(T_2) ≤ ... ≤ P(T_n), and let λ̄_i be the device with the longest combined state transition delay, i.e., t_sw(λ̄_i) = t_sd(λ̄_i) + t_wu(λ̄_i), among all devices required by T_i. With EDF and SRP, each job suffers two context switches: one when the job starts its execution and another when it finishes its execution. Note that the context switch when a job is preempted is attributed to the preempting job; otherwise more than two context switches could be attributed to a job. The context switch cost cannot be ignored with ASD because of the state transition delays of devices, so context switch costs need to be included in each task's WCET.

We first consider the worst case context switch cost for a job starting its execution. For any job J_i,j that is selected to execute at time t, the worst case is that λ̄_i just started switching to the idle state at time t − 1 and thus needs to be switched back to the active state. This procedure takes at most t_sw(λ̄_i) − 1 time units, which is the worst case context switch overhead for a job starting its execution, as shown in Figure 16(a).

We next consider the worst case context switch cost for a job finishing its execution. Suppose that J_i,j starts its execution at time t and finishes at time t′. If J_i,j did not preempt any job using devices at time t, then the context switch cost for finishing execution is 0; otherwise, the devices required by the preempted job, say J_m,n, must be switched back to the active state, which takes at most t_sw(λ̄_m) − 1 time units. This time is included in the context switch cost for J_i,j finishing its execution, as shown in Figure 16(b). With EDF, a task can only preempt tasks with longer periods. Let λ̂_i be the device with the longest transition time among all devices required by tasks that have longer periods than T_i.
A sufficient schedulability condition for the ASD scheduling algorithm is given in Theorem C.1.

Theorem C.1. Suppose n periodic tasks are sorted by their periods. They are schedulable by the ASD algorithm if, for all k, 1 ≤ k ≤ n,

Σ_{i=1}^{k} (wcet(T_i) + t_sw(λ̄_i) + t_sw(λ̂_i) − 2) / P(T_i) + B(T_k) / P(T_k) ≤ 1,

where B(T_k) is the maximal length of time that a job of T_k can be blocked.

In fact, this is the same condition used for the EDF algorithm with SRP proposed in [2]. Since the context switch costs are included in the WCETs, the proof of Theorem C.1 follows directly from the proof presented in [2] and is omitted here. Note that tighter sufficient schedulability conditions may exist; however, addressing other scheduling conditions is beyond the scope of this work.

Figure 16. Context switch costs in ASD: (a) the context switch cost of a job starting its execution; (b) the context switch cost of a job finishing its execution. In (a), λ ∈ Dev(J_2,1); t_wu(λ) = t_sd(λ) = 4. At time 0, J_1,1 starts its execution and λ begins its state transition to the idle state. J_1,1 finishes its execution at time 1. The context switch cost for J_2,1 starting its execution is 7. In (b), λ ∈ Dev(J_1,1) and λ ∈ Dev(J_3,1); t_wu(λ) = t_sd(λ) = 2. At time 2, J_3,1 is preempted by J_2,1, which is in turn preempted by J_1,1. At time 15, when J_2,1 finishes its execution, λ is performing its state transition to the idle state; it is switched back to the active state at time 16 because J_3,1 requires λ. The context switch cost for J_2,1 finishing its execution is 3.
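The condition of Theorem C.1 is a cumulative utilization check over the period-sorted tasks; a minimal sketch, with hypothetical task parameters:

```python
def asd_schedulable(tasks):
    """tasks: list of (wcet, period, t_sw_start, t_sw_finish, blocking)
    tuples sorted by period, where t_sw_start stands for t_sw(λ̄_i) and
    t_sw_finish for t_sw(λ̂_i). Implements the test of Theorem C.1."""
    for k in range(len(tasks)):
        # Utilization of the first k+1 tasks, with context switch costs
        # folded into each WCET as in the theorem.
        total = sum((w + s1 + s2 - 2) / p
                    for (w, p, s1, s2, _b) in tasks[:k + 1])
        # Add the blocking term B(T_k) / P(T_k) for the k-th task.
        total += tasks[k][4] / tasks[k][1]
        if total > 1:
            return False
    return True

# Three hypothetical tasks (wcet, period, t_sw(λ̄), t_sw(λ̂), B):
ok = asd_schedulable([(3, 10, 1, 1, 0), (4, 15, 1, 1, 1), (5, 20, 1, 1, 0)])
# A set whose second prefix exceeds a utilization of 1:
too_loaded = asd_schedulable([(6, 10, 3, 3, 0), (5, 12, 3, 3, 0)])
```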


More information

Lecture 6. Real-Time Systems. Dynamic Priority Scheduling

Lecture 6. Real-Time Systems. Dynamic Priority Scheduling Real-Time Systems Lecture 6 Dynamic Priority Scheduling Online scheduling with dynamic priorities: Earliest Deadline First scheduling CPU utilization bound Optimality and comparison with RM: Schedulability

More information

Process Scheduling for RTS. RTS Scheduling Approach. Cyclic Executive Approach

Process Scheduling for RTS. RTS Scheduling Approach. Cyclic Executive Approach Process Scheduling for RTS Dr. Hugh Melvin, Dept. of IT, NUI,G RTS Scheduling Approach RTS typically control multiple parameters concurrently Eg. Flight Control System Speed, altitude, inclination etc..

More information

CSE 380 Computer Operating Systems

CSE 380 Computer Operating Systems CSE 380 Computer Operating Systems Instructor: Insup Lee & Dianna Xu University of Pennsylvania, Fall 2003 Lecture Note 3: CPU Scheduling 1 CPU SCHEDULING q How can OS schedule the allocation of CPU cycles

More information

Real-Time and Embedded Systems (M) Lecture 5

Real-Time and Embedded Systems (M) Lecture 5 Priority-driven Scheduling of Periodic Tasks (1) Real-Time and Embedded Systems (M) Lecture 5 Lecture Outline Assumptions Fixed-priority algorithms Rate monotonic Deadline monotonic Dynamic-priority algorithms

More information

A 2-Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value

A 2-Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value A -Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value Shuhui Li, Miao Song, Peng-Jun Wan, Shangping Ren Department of Engineering Mechanics,

More information

Resource Sharing Protocols for Real-Time Task Graph Systems

Resource Sharing Protocols for Real-Time Task Graph Systems Resource Sharing Protocols for Real-Time Task Graph Systems Nan Guan, Pontus Ekberg, Martin Stigge, Wang Yi Uppsala University, Sweden Northeastern University, China Abstract Previous works on real-time

More information

EDF Feasibility and Hardware Accelerators

EDF Feasibility and Hardware Accelerators EDF Feasibility and Hardware Accelerators Andrew Morton University of Waterloo, Waterloo, Canada, arrmorton@uwaterloo.ca Wayne M. Loucks University of Waterloo, Waterloo, Canada, wmloucks@pads.uwaterloo.ca

More information

Simulation of Process Scheduling Algorithms

Simulation of Process Scheduling Algorithms Simulation of Process Scheduling Algorithms Project Report Instructor: Dr. Raimund Ege Submitted by: Sonal Sood Pramod Barthwal Index 1. Introduction 2. Proposal 3. Background 3.1 What is a Process 4.

More information

Energy-Constrained Scheduling for Weakly-Hard Real-Time Systems

Energy-Constrained Scheduling for Weakly-Hard Real-Time Systems Energy-Constrained Scheduling for Weakly-Hard Real-Time Systems Tarek A. AlEnawy and Hakan Aydin Computer Science Department George Mason University Fairfax, VA 23 {thassan1,aydin}@cs.gmu.edu Abstract

More information

Task Models and Scheduling

Task Models and Scheduling Task Models and Scheduling Jan Reineke Saarland University June 27 th, 2013 With thanks to Jian-Jia Chen at KIT! Jan Reineke Task Models and Scheduling June 27 th, 2013 1 / 36 Task Models and Scheduling

More information

Aperiodic Task Scheduling

Aperiodic Task Scheduling Aperiodic Task Scheduling Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 12 Germany Springer, 2010 2017 年 11 月 29 日 These slides use Microsoft clip arts. Microsoft copyright

More information

Real-Time Systems. Lecture #14. Risat Pathan. Department of Computer Science and Engineering Chalmers University of Technology

Real-Time Systems. Lecture #14. Risat Pathan. Department of Computer Science and Engineering Chalmers University of Technology Real-Time Systems Lecture #14 Risat Pathan Department of Computer Science and Engineering Chalmers University of Technology Real-Time Systems Specification Implementation Multiprocessor scheduling -- Partitioned

More information

CPU SCHEDULING RONG ZHENG

CPU SCHEDULING RONG ZHENG CPU SCHEDULING RONG ZHENG OVERVIEW Why scheduling? Non-preemptive vs Preemptive policies FCFS, SJF, Round robin, multilevel queues with feedback, guaranteed scheduling 2 SHORT-TERM, MID-TERM, LONG- TERM

More information

Lecture Note #6: More on Task Scheduling EECS 571 Principles of Real-Time Embedded Systems Kang G. Shin EECS Department University of Michigan

Lecture Note #6: More on Task Scheduling EECS 571 Principles of Real-Time Embedded Systems Kang G. Shin EECS Department University of Michigan Lecture Note #6: More on Task Scheduling EECS 571 Principles of Real-Time Embedded Systems Kang G. Shin EECS Department University of Michigan Note 6-1 Mars Pathfinder Timing Hiccups? When: landed on the

More information

Non-preemptive Fixed Priority Scheduling of Hard Real-Time Periodic Tasks

Non-preemptive Fixed Priority Scheduling of Hard Real-Time Periodic Tasks Non-preemptive Fixed Priority Scheduling of Hard Real-Time Periodic Tasks Moonju Park Ubiquitous Computing Lab., IBM Korea, Seoul, Korea mjupark@kr.ibm.com Abstract. This paper addresses the problem of

More information

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2 Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2 Lecture Outline Scheduling periodic tasks The rate monotonic algorithm Definition Non-optimality Time-demand analysis...!2

More information

Andrew Morton University of Waterloo Canada

Andrew Morton University of Waterloo Canada EDF Feasibility and Hardware Accelerators Andrew Morton University of Waterloo Canada Outline 1) Introduction and motivation 2) Review of EDF and feasibility analysis 3) Hardware accelerators and scheduling

More information

Real-Time Scheduling. Real Time Operating Systems and Middleware. Luca Abeni

Real-Time Scheduling. Real Time Operating Systems and Middleware. Luca Abeni Real Time Operating Systems and Middleware Luca Abeni luca.abeni@unitn.it Definitions Algorithm logical procedure used to solve a problem Program formal description of an algorithm, using a programming

More information

Dynamic I/O Power Management for Hard Real-time Systems 1

Dynamic I/O Power Management for Hard Real-time Systems 1 Dynamic I/O Power Management for Hard Real-time Systems 1 Vishnu Swaminathan y, Krishnendu Chakrabarty y and S. S. Iyengar z y Department of Electrical & Computer Engineering z Department of Computer Science

More information

Priority-driven Scheduling of Periodic Tasks (1) Advanced Operating Systems (M) Lecture 4

Priority-driven Scheduling of Periodic Tasks (1) Advanced Operating Systems (M) Lecture 4 Priority-driven Scheduling of Periodic Tasks (1) Advanced Operating Systems (M) Lecture 4 Priority-driven Scheduling Assign priorities to jobs, based on their deadline or other timing constraint Make scheduling

More information

Real-Time Scheduling

Real-Time Scheduling 1 Real-Time Scheduling Formal Model [Some parts of this lecture are based on a real-time systems course of Colin Perkins http://csperkins.org/teaching/rtes/index.html] Real-Time Scheduling Formal Model

More information

AS computer hardware technology advances, both

AS computer hardware technology advances, both 1 Best-Harmonically-Fit Periodic Task Assignment Algorithm on Multiple Periodic Resources Chunhui Guo, Student Member, IEEE, Xiayu Hua, Student Member, IEEE, Hao Wu, Student Member, IEEE, Douglas Lautner,

More information

Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems 1

Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems 1 Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems 1 Vishnu Swaminathan and Krishnendu Chakrabarty Department of Electrical and Computer Engineering Duke University 130 Hudson

More information

Scheduling Periodic Real-Time Tasks on Uniprocessor Systems. LS 12, TU Dortmund

Scheduling Periodic Real-Time Tasks on Uniprocessor Systems. LS 12, TU Dortmund Scheduling Periodic Real-Time Tasks on Uniprocessor Systems Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 08, Dec., 2015 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 38 Periodic Control System Pseudo-code

More information

Efficient TDM-based Arbitration for Mixed-Criticality Systems on Multi-Cores

Efficient TDM-based Arbitration for Mixed-Criticality Systems on Multi-Cores Efficient TDM-based Arbitration for Mixed-Criticality Systems on Multi-Cores Florian Brandner with Farouk Hebbache, 2 Mathieu Jan, 2 Laurent Pautet LTCI, Télécom ParisTech, Université Paris-Saclay 2 CEA

More information

CMSC 451: Lecture 7 Greedy Algorithms for Scheduling Tuesday, Sep 19, 2017

CMSC 451: Lecture 7 Greedy Algorithms for Scheduling Tuesday, Sep 19, 2017 CMSC CMSC : Lecture Greedy Algorithms for Scheduling Tuesday, Sep 9, 0 Reading: Sects.. and. of KT. (Not covered in DPV.) Interval Scheduling: We continue our discussion of greedy algorithms with a number

More information

3. Scheduling issues. Common approaches 3. Common approaches 1. Preemption vs. non preemption. Common approaches 2. Further definitions

3. Scheduling issues. Common approaches 3. Common approaches 1. Preemption vs. non preemption. Common approaches 2. Further definitions Common approaches 3 3. Scheduling issues Priority-driven (event-driven) scheduling This class of algorithms is greedy They never leave available processing resources unutilized An available resource may

More information

On-line scheduling of periodic tasks in RT OS

On-line scheduling of periodic tasks in RT OS On-line scheduling of periodic tasks in RT OS Even if RT OS is used, it is needed to set up the task priority. The scheduling problem is solved on two levels: fixed priority assignment by RMS dynamic scheduling

More information

Probabilistic Preemption Control using Frequency Scaling for Sporadic Real-time Tasks

Probabilistic Preemption Control using Frequency Scaling for Sporadic Real-time Tasks Probabilistic Preemption Control using Frequency Scaling for Sporadic Real-time Tasks Abhilash Thekkilakattil, Radu Dobrin and Sasikumar Punnekkat Mälardalen Real-Time Research Center, Mälardalen University,

More information

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University Che-Wei Chang chewei@mail.cgu.edu.tw Department of Computer Science and Information Engineering, Chang Gung University } 2017/11/15 Midterm } 2017/11/22 Final Project Announcement 2 1. Introduction 2.

More information

Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems

Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems Song Han 1 Deji Chen 2 Ming Xiong 3 Aloysius K. Mok 1 1 The University of Texas at Austin 2 Emerson Process Management

More information

Exam Spring Embedded Systems. Prof. L. Thiele

Exam Spring Embedded Systems. Prof. L. Thiele Exam Spring 20 Embedded Systems Prof. L. Thiele NOTE: The given solution is only a proposal. For correctness, completeness, or understandability no responsibility is taken. Sommer 20 Eingebettete Systeme

More information

Energy-Efficient Real-Time Task Scheduling in Multiprocessor DVS Systems

Energy-Efficient Real-Time Task Scheduling in Multiprocessor DVS Systems Energy-Efficient Real-Time Task Scheduling in Multiprocessor DVS Systems Jian-Jia Chen *, Chuan Yue Yang, Tei-Wei Kuo, and Chi-Sheng Shih Embedded Systems and Wireless Networking Lab. Department of Computer

More information

Process Scheduling. Process Scheduling. CPU and I/O Bursts. CPU - I/O Burst Cycle. Variations in Bursts. Histogram of CPU Burst Times

Process Scheduling. Process Scheduling. CPU and I/O Bursts. CPU - I/O Burst Cycle. Variations in Bursts. Histogram of CPU Burst Times Scheduling The objective of multiprogramming is to have some process running all the time The objective of timesharing is to have the switch between processes so frequently that users can interact with

More information

CycleTandem: Energy-Saving Scheduling for Real-Time Systems with Hardware Accelerators

CycleTandem: Energy-Saving Scheduling for Real-Time Systems with Hardware Accelerators CycleTandem: Energy-Saving Scheduling for Real-Time Systems with Hardware Accelerators Sandeep D souza and Ragunathan (Raj) Rajkumar Carnegie Mellon University High (Energy) Cost of Accelerators Modern-day

More information

Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3

Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3 Real-time Scheduling of Periodic Tasks (2) Advanced Operating Systems Lecture 3 Lecture Outline The rate monotonic algorithm (cont d) Maximum utilisation test The deadline monotonic algorithm The earliest

More information

EECS 571 Principles of Real-Time Embedded Systems. Lecture Note #7: More on Uniprocessor Scheduling

EECS 571 Principles of Real-Time Embedded Systems. Lecture Note #7: More on Uniprocessor Scheduling EECS 571 Principles of Real-Time Embedded Systems Lecture Note #7: More on Uniprocessor Scheduling Kang G. Shin EECS Department University of Michigan Precedence and Exclusion Constraints Thus far, we

More information

Real-Time Dynamic Power Management through Device Forbidden Regions

Real-Time Dynamic Power Management through Device Forbidden Regions IEEE Real-Time and Embedded Technology and Applications Symposium Real-Time Dynamic Power Management through Device Forbidden Regions Vinay Devadas Hakan Aydin Department of Computer Science George Mason

More information

Resource Sharing in an Enhanced Rate-Based Execution Model

Resource Sharing in an Enhanced Rate-Based Execution Model In: Proceedings of the 15th Euromicro Conference on Real-Time Systems, Porto, Portugal, July 2003, pp. 131-140. Resource Sharing in an Enhanced Rate-Based Execution Model Xin Liu Steve Goddard Department

More information

Shedding the Shackles of Time-Division Multiplexing

Shedding the Shackles of Time-Division Multiplexing Shedding the Shackles of Time-Division Multiplexing Farouk Hebbache with Florian Brandner, 2 Mathieu Jan, Laurent Pautet 2 CEA List, LS 2 LTCI, Télécom ParisTech, Université Paris-Saclay Multi-core Architectures

More information

Scheduling Lecture 1: Scheduling on One Machine

Scheduling Lecture 1: Scheduling on One Machine Scheduling Lecture 1: Scheduling on One Machine Loris Marchal October 16, 2012 1 Generalities 1.1 Definition of scheduling allocation of limited resources to activities over time activities: tasks in computer

More information

System Model. Real-Time systems. Giuseppe Lipari. Scuola Superiore Sant Anna Pisa -Italy

System Model. Real-Time systems. Giuseppe Lipari. Scuola Superiore Sant Anna Pisa -Italy Real-Time systems System Model Giuseppe Lipari Scuola Superiore Sant Anna Pisa -Italy Corso di Sistemi in tempo reale Laurea Specialistica in Ingegneria dell Informazione Università di Pisa p. 1/?? Task

More information

TEMPORAL WORKLOAD ANALYSIS AND ITS APPLICATION TO POWER-AWARE SCHEDULING

TEMPORAL WORKLOAD ANALYSIS AND ITS APPLICATION TO POWER-AWARE SCHEDULING TEMPORAL WORKLOAD ANALYSIS AND ITS APPLICATION TO POWER-AWARE SCHEDULING Ye-In Seol 1, Jeong-Uk Kim 1 and Young-Kuk Kim 2, 1 Green Energy Institute, Sangmyung University, Seoul, South Korea 2 Dept. of

More information

Feedback EDF Scheduling of Real-Time Tasks Exploiting Dynamic Voltage Scaling

Feedback EDF Scheduling of Real-Time Tasks Exploiting Dynamic Voltage Scaling Feedback EDF Scheduling of Real-Time Tasks Exploiting Dynamic Voltage Scaling Yifan Zhu and Frank Mueller (mueller@cs.ncsu.edu) Department of Computer Science/ Center for Embedded Systems Research, North

More information

arxiv: v1 [cs.os] 6 Jun 2013

arxiv: v1 [cs.os] 6 Jun 2013 Partitioned scheduling of multimode multiprocessor real-time systems with temporal isolation Joël Goossens Pascal Richard arxiv:1306.1316v1 [cs.os] 6 Jun 2013 Abstract We consider the partitioned scheduling

More information

RUN-TIME EFFICIENT FEASIBILITY ANALYSIS OF UNI-PROCESSOR SYSTEMS WITH STATIC PRIORITIES

RUN-TIME EFFICIENT FEASIBILITY ANALYSIS OF UNI-PROCESSOR SYSTEMS WITH STATIC PRIORITIES RUN-TIME EFFICIENT FEASIBILITY ANALYSIS OF UNI-PROCESSOR SYSTEMS WITH STATIC PRIORITIES Department for Embedded Systems/Real-Time Systems, University of Ulm {name.surname}@informatik.uni-ulm.de Abstract:

More information

Lightweight Real-Time Synchronization under P-EDF on Symmetric and Asymmetric Multiprocessors

Lightweight Real-Time Synchronization under P-EDF on Symmetric and Asymmetric Multiprocessors Consistent * Complete * Well Documented * Easy to Reuse * Technical Report MPI-SWS-216-3 May 216 Lightweight Real-Time Synchronization under P-EDF on Symmetric and Asymmetric Multiprocessors (extended

More information

Generalized Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems

Generalized Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems Generalized Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems Vishnu Swaminathan and Krishnendu Chakrabarty Department of Electrical & Computer Engineering Duke University Durham,

More information

The Concurrent Consideration of Uncertainty in WCETs and Processor Speeds in Mixed Criticality Systems

The Concurrent Consideration of Uncertainty in WCETs and Processor Speeds in Mixed Criticality Systems The Concurrent Consideration of Uncertainty in WCETs and Processor Speeds in Mixed Criticality Systems Zhishan Guo and Sanjoy Baruah Department of Computer Science University of North Carolina at Chapel

More information

CPU scheduling. CPU Scheduling

CPU scheduling. CPU Scheduling EECS 3221 Operating System Fundamentals No.4 CPU scheduling Prof. Hui Jiang Dept of Electrical Engineering and Computer Science, York University CPU Scheduling CPU scheduling is the basis of multiprogramming

More information

Rate Monotonic Analysis (RMA)

Rate Monotonic Analysis (RMA) Rate Monotonic Analysis (RMA) ktw@csie.ntu.edu.tw (Real-Time and Embedded System Laboratory) Major References: An Introduction to Rate Monotonic Analysis Tutorial Notes SEI MU* Distributed Real-Time System

More information

System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation

System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation XILIANG ZHONG and CHENG-ZHONG XU Wayne State University We present a dynamic voltage scaling (DVS) technique that minimizes

More information

Scheduling. Uwe R. Zimmer & Alistair Rendell The Australian National University

Scheduling. Uwe R. Zimmer & Alistair Rendell The Australian National University 6 Scheduling Uwe R. Zimmer & Alistair Rendell The Australian National University References for this chapter [Bacon98] J. Bacon Concurrent Systems 1998 (2nd Edition) Addison Wesley Longman Ltd, ISBN 0-201-17767-6

More information

Scheduling of Frame-based Embedded Systems with Rechargeable Batteries

Scheduling of Frame-based Embedded Systems with Rechargeable Batteries Scheduling of Frame-based Embedded Systems with Rechargeable Batteries André Allavena Computer Science Department Cornell University Ithaca, NY 14853 andre@cs.cornell.edu Daniel Mossé Department of Computer

More information

Load Regulating Algorithm for Static-Priority Task Scheduling on Multiprocessors

Load Regulating Algorithm for Static-Priority Task Scheduling on Multiprocessors Technical Report No. 2009-7 Load Regulating Algorithm for Static-Priority Task Scheduling on Multiprocessors RISAT MAHMUD PATHAN JAN JONSSON Department of Computer Science and Engineering CHALMERS UNIVERSITY

More information

A Response-Time Analysis for Non-preemptive Job Sets under Global Scheduling

A Response-Time Analysis for Non-preemptive Job Sets under Global Scheduling A Response-Time Analysis for Non-preemptive Job Sets under Global Scheduling Mitra Nasri 1, Geoffrey Nelissen 2, and Björn B. Brandenburg 1 1 Max Planck Institute for Software Systems (MPI-SWS), Germany

More information

Real Time Operating Systems

Real Time Operating Systems Real Time Operating ystems Luca Abeni luca.abeni@unitn.it Interacting Tasks Until now, only independent tasks... A job never blocks or suspends A task only blocks on job termination In real world, jobs

More information

Real Time Operating Systems

Real Time Operating Systems Real Time Operating ystems hared Resources Luca Abeni Credits: Luigi Palopoli, Giuseppe Lipari, and Marco Di Natale cuola uperiore ant Anna Pisa -Italy Real Time Operating ystems p. 1 Interacting Tasks

More information

Task Reweighting under Global Scheduling on Multiprocessors

Task Reweighting under Global Scheduling on Multiprocessors ask Reweighting under Global Scheduling on Multiprocessors Aaron Block, James H. Anderson, and UmaMaheswari C. Devi Department of Computer Science, University of North Carolina at Chapel Hill March 7 Abstract

More information

Tardiness Bounds under Global EDF Scheduling on a Multiprocessor

Tardiness Bounds under Global EDF Scheduling on a Multiprocessor Tardiness ounds under Global EDF Scheduling on a Multiprocessor UmaMaheswari C. Devi and James H. Anderson Department of Computer Science The University of North Carolina at Chapel Hill Abstract This paper

More information

A Theory of Rate-Based Execution. A Theory of Rate-Based Execution

A Theory of Rate-Based Execution. A Theory of Rate-Based Execution Kevin Jeffay Department of Computer Science University of North Carolina at Chapel Hill jeffay@cs cs.unc.edu Steve Goddard Computer Science & Engineering University of Nebraska Ð Lincoln goddard@cse cse.unl.edu

More information

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Presenter: Tong Shu Authors: Tong Shu and Prof. Chase Q. Wu Big Data Center Department of Computer Science New Jersey Institute

More information

Controlling Preemption for Better Schedulability in Multi-Core Systems

Controlling Preemption for Better Schedulability in Multi-Core Systems 2012 IEEE 33rd Real-Time Systems Symposium Controlling Preemption for Better Schedulability in Multi-Core Systems Jinkyu Lee and Kang G. Shin Dept. of Electrical Engineering and Computer Science, The University

More information

Runtime feasibility check for non-preemptive real-time periodic tasks

Runtime feasibility check for non-preemptive real-time periodic tasks Information Processing Letters 97 (2006) 83 87 www.elsevier.com/locate/ipl Runtime feasibility check for non-preemptive real-time periodic tasks Sangwon Kim, Joonwon Lee, Jinsoo Kim Division of Computer

More information

TDDB68 Concurrent programming and operating systems. Lecture: CPU Scheduling II

TDDB68 Concurrent programming and operating systems. Lecture: CPU Scheduling II TDDB68 Concurrent programming and operating systems Lecture: CPU Scheduling II Mikael Asplund, Senior Lecturer Real-time Systems Laboratory Department of Computer and Information Science Copyright Notice:

More information

UC Santa Barbara. Operating Systems. Christopher Kruegel Department of Computer Science UC Santa Barbara

UC Santa Barbara. Operating Systems. Christopher Kruegel Department of Computer Science UC Santa Barbara Operating Systems Christopher Kruegel Department of Computer Science http://www.cs.ucsb.edu/~chris/ Many processes to execute, but one CPU OS time-multiplexes the CPU by operating context switching Between

More information

On the Soft Real-Time Optimality of Global EDF on Multiprocessors: From Identical to Uniform Heterogeneous

On the Soft Real-Time Optimality of Global EDF on Multiprocessors: From Identical to Uniform Heterogeneous On the Soft Real-Time Optimality of Global EDF on Multiprocessors: From Identical to Uniform Heterogeneous Kecheng Yang and James H. Anderson Department of Computer Science, University of North Carolina

More information

Schedulability analysis of global Deadline-Monotonic scheduling

Schedulability analysis of global Deadline-Monotonic scheduling Schedulability analysis of global Deadline-Monotonic scheduling Sanjoy Baruah Abstract The multiprocessor Deadline-Monotonic (DM) scheduling of sporadic task systems is studied. A new sufficient schedulability

More information

Real-time Systems: Scheduling Periodic Tasks

Real-time Systems: Scheduling Periodic Tasks Real-time Systems: Scheduling Periodic Tasks Advanced Operating Systems Lecture 15 This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of

More information

Scheduling I. Today Introduction to scheduling Classical algorithms. Next Time Advanced topics on scheduling

Scheduling I. Today Introduction to scheduling Classical algorithms. Next Time Advanced topics on scheduling Scheduling I Today Introduction to scheduling Classical algorithms Next Time Advanced topics on scheduling Scheduling out there You are the manager of a supermarket (ok, things don t always turn out the

More information

The FMLP + : An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis

The FMLP + : An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis The FMLP + : An Asymptotically Optimal Real-Time Locking Protocol for Suspension-Aware Analysis Björn B. Brandenburg Max Planck Institute for Software Systems (MPI-SWS) Abstract Multiprocessor real-time

More information

Optimal Utilization Bounds for the Fixed-priority Scheduling of Periodic Task Systems on Identical Multiprocessors. Sanjoy K.

Optimal Utilization Bounds for the Fixed-priority Scheduling of Periodic Task Systems on Identical Multiprocessors. Sanjoy K. Optimal Utilization Bounds for the Fixed-priority Scheduling of Periodic Task Systems on Identical Multiprocessors Sanjoy K. Baruah Abstract In fixed-priority scheduling the priority of a job, once assigned,

More information

Schedule Table Generation for Time-Triggered Mixed Criticality Systems

Schedule Table Generation for Time-Triggered Mixed Criticality Systems Schedule Table Generation for Time-Triggered Mixed Criticality Systems Jens Theis and Gerhard Fohler Technische Universität Kaiserslautern, Germany Sanjoy Baruah The University of North Carolina, Chapel

More information

Paper Presentation. Amo Guangmo Tong. University of Taxes at Dallas January 24, 2014

Paper Presentation. Amo Guangmo Tong. University of Taxes at Dallas January 24, 2014 Paper Presentation Amo Guangmo Tong University of Taxes at Dallas gxt140030@utdallas.edu January 24, 2014 Amo Guangmo Tong (UTD) January 24, 2014 1 / 30 Overview 1 Tardiness Bounds under Global EDF Scheduling

More information

Module 5: CPU Scheduling

Module 5: CPU Scheduling Module 5: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation 5.1 Basic Concepts Maximum CPU utilization obtained

More information

Maximizing Rewards for Real-Time Applications with Energy Constraints

Maximizing Rewards for Real-Time Applications with Energy Constraints Maximizing Rewards for Real-Time Applications with Energy Constraints COSMIN RUSU, RAMI MELHEM, and DANIEL MOSSÉ University of Pittsburgh New technologies have brought about a proliferation of embedded

More information

Non-Work-Conserving Scheduling of Non-Preemptive Hard Real-Time Tasks Based on Fixed Priorities

Non-Work-Conserving Scheduling of Non-Preemptive Hard Real-Time Tasks Based on Fixed Priorities Non-Work-Conserving Scheduling of Non-Preemptive Hard Real-Time Tasks Based on Fixed Priorities Mitra Nasri, Gerhard Fohler Chair of Real-time Systems, Technische Universität Kaiserslautern, Germany {nasri,

More information

Supplement of Improvement of Real-Time Multi-Core Schedulability with Forced Non- Preemption

Supplement of Improvement of Real-Time Multi-Core Schedulability with Forced Non- Preemption 12 Supplement of Improvement of Real-Time Multi-Core Schedulability with Forced Non- Preemption Jinkyu Lee, Department of Computer Science and Engineering, Sungkyunkwan University, South Korea. Kang G.

More information

Polynomial Time Algorithms for Minimum Energy Scheduling

Polynomial Time Algorithms for Minimum Energy Scheduling Polynomial Time Algorithms for Minimum Energy Scheduling Philippe Baptiste 1, Marek Chrobak 2, and Christoph Dürr 1 1 CNRS, LIX UMR 7161, Ecole Polytechnique 91128 Palaiseau, France. Supported by CNRS/NSF

More information

Periodic scheduling 05/06/

Periodic scheduling 05/06/ Periodic scheduling T T or eriodic scheduling, the best that we can do is to design an algorithm which will always find a schedule if one exists. A scheduler is defined to be otimal iff it will find a

More information

Energy-aware Scheduling on Multiprocessor Platforms with Devices

Energy-aware Scheduling on Multiprocessor Platforms with Devices Energy-aware Scheduling on Multiprocessor Platforms with Devices Dawei Li, Jie Wu Keqin Li Dept. of Computer and Information Sciences Dept. of Computer Science Temple Univ., PA State Univ. of NY at New

More information

Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm

Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm Yves MOUAFO Supervisors A. CHOQUET-GENIET, G. LARGETEAU-SKAPIN OUTLINES 2 1. Context and Problematic 2. State of the art

More information

Chapter 6: CPU Scheduling

Chapter 6: CPU Scheduling Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation 6.1 Basic Concepts Maximum CPU utilization obtained

More information