Two-Phase Low-Energy N-Modular Redundancy for Hard Real-Time Multi-Core Systems


Mohammad Salehi, Alireza Ejlali, and Bashir M. Al-Hashimi, Fellow, IEEE

Abstract: This paper proposes an N-modular redundancy (NMR) technique with low energy overhead for hard real-time multi-core systems. NMR is well-suited for multi-core platforms as they provide multiple processing units and low-overhead communication for voting. However, it can impose considerable energy overhead, and hence its energy overhead must be controlled, which is the primary consideration of this paper. For this purpose the system operation is divided into two phases: an indispensable phase and an on-demand phase. In the indispensable phase only half-plus-one copies of each task are executed. When no fault occurs during this phase, the results must be identical and hence the remaining copies are not required. Otherwise, the remaining copies must be executed in the on-demand phase to perform a complete majority voting. In this paper, for such a two-phase NMR, an energy-management technique is developed where two new concepts are considered: i) block-partitioned scheduling, which enables parallel task execution during the on-demand phase, thereby leaving more slack for energy saving, and ii) pseudo-dynamic slack, which results when a task has no faulty execution during the indispensable phase and hence the time reserved for its copies in the on-demand phase is reclaimed for energy saving. The energy-management technique has an offline part that manages static and pseudo-dynamic slacks at design time and an online part that mainly manages dynamic slacks at run-time. Experimental results show that the proposed NMR technique provides up to 29% energy saving and is 6 orders of magnitude more reliable as compared to a recent previous work.

Index Terms: Energy minimization, multi-core systems, real-time and embedded systems, reliability, scheduling.

1 INTRODUCTION

Multi-core platforms have emerged as popular and powerful computing engines for many recent embedded systems [1], [2], [3], [4], [5].
While such architectures have been employed for embedded applications that require high-performance computing, we believe they also offer considerable new opportunities for designing embedded systems where hard real-time operation, high reliability in the presence of transient faults, and low energy consumption are required [6], [7], [8]. In this paper, we address the use of multi-core platforms to achieve high reliability with low energy overhead for hard real-time embedded systems. To achieve reliability against transient faults, we consider N-modular redundancy (NMR) [9], [10], where multiple processing units execute identical copies of each task and their results are voted on to produce a single output. NMR is well-suited for multi-core platforms as they satisfy NMR requirements such as multiple processing units and low-overhead communication for voting [3]. An NMR system can mask faults while less than half of its units are faulty. Fault-tolerant real-time systems that have been considered in previous works require fault-detection mechanisms (e.g., [5], [6], [7], [8], [11]), and these works have assumed (usually implicitly) that they have perfect detection mechanisms (i.e., they can detect all faulty task executions). However, common fault-detection mechanisms are far less effective than what is required for highly reliable systems, whereas NMR does not require any specific fault-detection mechanism and uses result comparison (majority voting) for fault detection and masking [9], [10]. Since it is very unlikely that all modules in NMR become faulty at the same time and produce the same erroneous results, comparing the results can provide almost perfect fault detection and masking [9], [10].

Mohammad Salehi and Alireza Ejlali are with the Department of Computer Engineering, Sharif University of Technology, Tehran 14588, Iran (e-mail: mohammad_salehi@ce.sharif.edu, ejlali@sharif.edu). Bashir M. Al-Hashimi is with the School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K. (e-mail: bmah@ecs.soton.ac.uk).
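The result comparison that NMR relies on can be illustrated with a minimal majority voter. This is only a sketch: the function name `majority_vote` and the convention of returning `None` when no strict majority exists are assumptions made for illustration, not part of the proposed technique.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value produced by more than half of the copies, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    # A strict majority is required: faults are masked only while fewer
    # than half of the copies are faulty.
    return value if count > len(results) // 2 else None
```

For a TMR system, `majority_vote([7, 7, 9])` masks the single faulty result, whereas three pairwise different results yield no majority.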
Also, result comparison can be combined with hash-based detection mechanisms, e.g., Fingerprinting [31], to achieve very high detection coverage [31]. Therefore, in our experiments in Section 5 we assume this detection coverage for our system. Like all other fault-tolerance and fault-masking techniques, NMR can impose considerable energy overhead [9], [10], which is an important concern in embedded systems where energy consumption is prominent. To reduce the energy overhead, we propose an energy-management technique that carries the major contributions of this work and is specifically developed for NMR when used for hard real-time multi-core systems (Sections 3 and 4). The main contributions of this work are:
i) Considering that fault-free executions dominate faulty executions [6], [8], [38], a two-phase NMR is proposed that achieves minimized energy consumption in the absence of faults while guaranteeing reliability and deadline requirements.
ii) A specific type of slack time, called pseudo-dynamic slack, is considered in this work. As explained in Section 4, this type of slack time is different from conventional slack times, i.e., static and dynamic slack [6], [8], [20].

iii) An energy-management technique is proposed that exploits the pseudo-dynamic and static slacks through offline optimization (Section 4.2). This is different from previous works, which have not proposed a mechanism to manage the pseudo-dynamic slack. Also, an online energy-management technique is proposed to exploit dynamic slacks at run-time (Section 4.3).
iv) A specific scheduling technique, called block-partitioned scheduling, is developed (Section 3) that provides the ability of in-advance parallel task execution (Section 3) to exploit pseudo-dynamic slacks more effectively.

The remainder of this paper is organized as follows. In Section 2 we review the related work. The proposed technique is presented in Section 3. Section 4 describes the energy-management method which is used for the proposed technique. The experimental results are presented in Section 5. Finally, Section 6 concludes the paper.

2 RELATED WORK

Some research works, e.g., [6], [7], [11], have addressed both fault tolerance and low energy consumption in fault-tolerant real-time systems with two processors. These works have not considered multiple faults per task execution, and they also assume perfect fault-detection mechanisms. [13] has proposed voltage-scaling techniques to reduce the energy consumption of triple-modular redundancy (TMR). However, this work has only considered single-task applications. Many previous works in the context of multi-processor systems either propose energy-reduction management techniques without considering reliability (e.g., [14], [15], [32], [33]) or consider reliability without considering energy consumption (e.g., [4], [30], [34]). [14] has considered variation in execution times to propose a scheduling algorithm based on dynamic voltage scaling (DVS) [12] for multi-processor systems. [15] has studied the energy efficiency of multi-core platforms that use multiple voltage islands. [32] has proposed a technique to minimize chip-level peak power consumption in multi-core systems running sporadic real-time tasks.
[33] has proposed an adaptive task partitioning for multi-core systems running independent periodic real-time tasks. [4] has evaluated scheduling heuristics for tasks with different criticality. [30] has proposed a mapping optimization technique for mixed-critical multi-core systems with different reliability requirements. [34] has proposed software transformations to increase reliability through reducing instruction vulnerabilities and the executions of critical instructions. Recently, research works have also focused on both energy and reliability considerations in multi-core systems. Some works, e.g., [35], [36], [37], have proposed multi-core architectures that exploit redundancy at different levels of abstraction to target low energy consumption and reliability. [35] has proposed an adaptive multi-core architecture that selectively adjusts pipeline-level redundancy to satisfy a reliability target with low energy consumption. [36] has proposed a customizable chip-level redundancy technique for multi-core systems that utilizes power-efficient hardware fault-detection mechanisms along with forward recovery to reduce overheads in case of fault-free executions. [37] has considered the effects of DVS on the soft error rate and proposed a flexible dual modular redundancy (DMR) mechanism that selectively enables per-core DMR to increase reliability. However, these works require hardware modification or redesign, and hence cannot be used with current commercial off-the-shelf processors, while our proposed technique is general and can be exploited by any multi-core processor that supports DVS. Some works, e.g., [5], [16], [17], [18], [38], have proposed energy-management techniques for task-level redundancy in multi-core systems. [5] and [16] have considered only one faulty execution for each task to preserve the original system reliability, while for many applications (e.g., the applications that are used in harsh environments) a high level of reliability cannot be achieved without tolerating multiple faulty tasks [9], [10], [17], [18]. Some works have considered different application models, e.g., periodic independent real-time tasks in [17] and [38] and parallel independent applications in [18]. However, these works cannot be applied to tasks with precedence constraints (e.g., task graphs [5], [6], [7]), while we consider hard real-time applications with task precedence constraints and propose a scheduling and energy-management technique for these applications.

3 PROPOSED TWO-PHASE NMR TECHNIQUE

In this paper, we consider frame-based applications [5], [6], [7] with hard timing requirements and task precedence constraints, where $n$ dependent tasks $\{T_1, T_2, \ldots, T_n\}$ are executed within each execution frame and must be completed as a whole before the end of the frame (specified by a deadline $D$). We also consider that the task precedence constraints (dependences between the tasks) are depicted as a directed acyclic graph (DAG) [5], [6], [7]. For example, Fig. 1a shows an example application task graph with six tasks, where the number placed above each task is its worst-case execution time at the maximum supply voltage $V_{max}$ and the maximum operational frequency $f_{max}$ (denoted by $W_i$ for each task $T_i$). For this type of application we propose a two-phase NMR technique with low energy consumption running on multi-core platforms. To do this, a new scheduling technique is proposed and a new type of slack time (which is specific to the proposed two-phase NMR) is exploited to manage energy consumption. In this section we describe the two-phase operation of the system and the proposed scheduling technique, and in the next section we explain the energy-management technique. The two operation phases of the proposed NMR are:

1. Indispensable phase: At first the system operates in its indispensable phase, where it executes a multi-core schedule containing half-plus-one copies of each task. For each task, the results of the task copies are compared. If no fault occurs, the task results must be identical, and in this case they are used as the result of the system. However, when the results are not identical (when some faults have occurred during the indispensable phase), the system temporarily switches to the on-demand phase, where it executes the remaining copies of the task to perform a complete majority voting.

2. On-demand phase: In this phase, the system executes a part of a multi-core schedule that contains the remaining copies of the task which had faulty executions in the indispensable phase. As half-plus-one copies of the task have already been executed in the indispensable phase, in the on-demand phase we execute the remaining copies of the same task to obtain N results for performing a complete majority voting to mask the faults.

[Figure 1: (a) task graph G; (b) ListSchedulingLTF(G, 2) for the indispensable phase, steps 1 to 6, on Core1 to Core4; (c) ListSchedulingLTF(G, 1) for the on-demand phase, steps 1 to 6; (d) block-partitioned schedule with blocks $B_1$, $B_2$, $B_3$. Legend: indispensable phase, on-demand phase, majority voting (result comparison)/saving results.]
Fig. 1. Synthesizing a TMR system (i.e., NMR with N=3) on a quad-core platform. a) An example task graph, b) Creating a schedule with two copies of each task for the indispensable phase, c) Creating a schedule with one copy of each task for the on-demand phase, and d) A block-partitioned version of the on-demand phase schedule.

Therefore, each of the two operation phases of the proposed NMR technique requires its own schedule, so that we need to synthesize two schedules from the same application task graph. These two schedules are: i) a multi-core schedule containing half-plus-one copies of each task for the indispensable phase, and ii) a multi-core schedule containing the remaining copies of each task for the on-demand phase. It is known that finding the optimal multi-core schedule to maximize parallelism (i.e., minimizing the schedule time length) is an NP-hard problem [5].
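The two-phase operation described above can be sketched for a single task as follows. This is a simplified illustration under stated assumptions: `task` is a hypothetical callable standing in for one copy of the task, N is assumed to be odd, and the copies are run one after another here, whereas the proposed technique runs them in parallel on different cores.

```python
from collections import Counter
from typing import Callable, Hashable, List, Tuple


def run_two_phase(task: Callable[[int], Hashable], n: int) -> Tuple[Hashable, int]:
    """Run one task under two-phase NMR; return (result, copies_executed)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch assumes an odd N >= 3")
    first = n // 2 + 1  # half-plus-one copies in the indispensable phase
    results: List[Hashable] = [task(i) for i in range(first)]
    if len(set(results)) == 1:
        # Identical results: the remaining copies are not required.
        return results[0], first
    # Mismatch: execute the remaining copies (on-demand phase) and vote.
    results += [task(i) for i in range(first, n)]
    value, count = Counter(results).most_common(1)[0]
    if count <= n // 2:
        raise RuntimeError("no majority: too many faulty copies")
    return value, n
```

In the fault-free case of a TMR system only two copies are executed; the third copy is executed only after a result mismatch.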
Indeed, multi-core schedules are typically obtained by the list scheduling algorithm [19], a simple heuristic that also provides parallelism. Similarly, in this paper we use list scheduling to synthesize the multi-core schedules of the indispensable and on-demand phases. Also, in the list scheduling, whenever several tasks can be scheduled (these are the tasks all of whose predecessors are scheduled), we use the longest task first (LTF) policy to determine the execution order. We will discuss in Section 4 why the LTF policy is effective for our proposed technique. For example, considering a TMR system (i.e., NMR with N=3), Fig. 1 shows the step-by-step generation of the two multi-core schedules for a given task graph (Fig. 1a) using list scheduling with the LTF policy. Fig. 1b shows the schedule with two copies of each task for the indispensable phase and Fig. 1c shows the schedule with one copy of each task for the on-demand phase. For the schedule which is used in the on-demand phase, we require that each task can overlap (in time) with at most one other task on each of the other cores. For example, the multi-core schedule of Fig. 1c (step 6) does not satisfy this condition, as in this schedule $T_2$ overlaps with both $T_3$ and $T_5$ on Core2 and also overlaps with both $T_4$ and $T_6$ on Core3. Indeed, we need the schedule of Fig. 1c (step 6) to be transformed to a schedule like the one in Fig. 1d, which satisfies the condition as each task overlaps with at most one other task on each of the other cores. We require this condition to be satisfied because it lets us partition the multi-core schedule into time blocks, so that in each block only one single task or multiple parallel tasks exist. For example, in Fig. 1d the block $B_1$ only consists of the task $T_1$, the block $B_2$ consists of the parallel tasks $T_2$, $T_3$, and $T_4$, and the block $B_3$ consists of the parallel tasks $T_5$ and $T_6$. In this paper, we call such schedules block-partitioned (BP) schedules.
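The list scheduling with the LTF policy used above to build both schedules can be sketched as follows. This is a minimal sketch under assumptions that are not taken from the paper: the `q` copies of a task are placed on the `q` cores that become idle first, and all copies of a task start together once its predecessors have finished. The function and parameter names are hypothetical.

```python
from typing import Dict, List, Tuple

Schedule = List[Tuple[str, int, int, List[int]]]  # (task, start, finish, cores)


def list_scheduling_ltf(wcet: Dict[str, int], preds: Dict[str, List[str]],
                        q: int, num_cores: int) -> Schedule:
    """Schedule q parallel copies of every task, longest ready task first."""
    if q > num_cores:
        raise ValueError("this sketch needs at least q cores")
    free = [0] * num_cores          # time at which each core becomes idle
    finish: Dict[str, int] = {}     # finish time of each scheduled task
    schedule: Schedule = []
    while len(finish) < len(wcet):
        # Ready tasks: unscheduled tasks whose predecessors are all scheduled.
        ready = [t for t in wcet if t not in finish
                 and all(p in finish for p in preds.get(t, []))]
        if not ready:
            raise ValueError("task graph has a cycle")
        task = max(ready, key=lambda t: wcet[t])      # LTF policy
        cores = sorted(range(num_cores), key=lambda c: free[c])[:q]
        start = max([free[c] for c in cores] +
                    [finish[p] for p in preds.get(task, [])])
        finish[task] = start + wcet[task]
        for c in cores:
            free[c] = finish[task]
        schedule.append((task, start, finish[task], cores))
    return schedule
```

As a usage example with a hypothetical four-task graph (not the graph of Fig. 1a), calling the function with `wcet = {"T1": 20, "T2": 60, "T3": 40, "T4": 30}`, `preds = {"T2": ["T1"], "T3": ["T1"], "T4": ["T1"]}`, `q = 2`, and `num_cores = 4` schedules the tasks in the order T1, T2, T3, T4, because after T1 the longest ready task is always chosen first.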
As we will show later in this section, whenever a fault occurs during the indispensable phase, we switch to the on-demand phase to execute exactly one block of the BP schedule and then we switch back to the indispensable phase to continue executing the schedule of the indispensable phase. As a multi-core schedule which has been synthesized using the list scheduling technique with the LTF policy (e.g., the schedule of Fig. 1c) may not be BP, we use a simple technique to convert ordinary schedules to BP schedules. Suppose that in a multi-core schedule a task $T_A$ overlaps with two other tasks $T_B$ and $T_C$ scheduled on another core (Fig. 2a). Assuming that the task $T_C$ comes after the task $T_B$, we simply shift the task $T_C$ (and all its successor tasks) to the right until there is no overlap between $T_A$ and $T_C$. As can be seen from Fig. 2a, the amount of this shift (denoted by $\sigma$ in the figure) is simply the difference between the finish time of $T_A$ and the start time of $T_C$. We start from the beginning of a multi-core schedule, move to the right, and apply this technique until we obtain a BP multi-core schedule. As an example, when we apply this technique to the schedule of Fig. 1c (step 6), we obtain the BP schedule of Fig. 2b (step 3). One point that should be noted here is that block-partitioning may increase the execution time of an application and hence it may cause the application to be unschedulable. Therefore, we use the proposed energy-management technique (Section 4) when the application total execution time is less than its deadline. This implies that the energy-management technique might not be used for some applications that have tight deadlines. Similar schedulability conditions are used by other techniques, e.g., [16], [17] and [38], to define infeasible solutions.

[Figure 2: (a) tasks $T_A$, $T_B$, $T_C$ and the shift $\sigma$; (b) BlockPartitioning(S) applied to a schedule S, steps 1 to 3, producing blocks $B_1$, $B_2$, $B_3$.]
Fig. 2. Block partitioning scheme. a) A technique to convert ordinary schedules to block-partitioned (BP) schedules and b) Block-partitioning a schedule that is not BP.

In the following, we describe how the proposed two-phase NMR technique works by means of the example of Fig. 1, where we have a TMR system running on a quad-core platform. When no fault has occurred, the system executes the schedule of the indispensable phase (Fig. 1b (step 6)), where two copies of each task $T_i$ are executed and their results are compared. If the results are identical, they are used as the result of the system. Whenever the results of a task $T_i$ are not identical (which indicates that some faults have occurred during the indispensable phase), we switch to the on-demand phase to execute the block of the BP schedule of the on-demand phase (Fig. 1d) that includes the same task $T_i$. After executing the third copy of $T_i$ in the on-demand phase, a majority voting is done over the three results to mask the faults. Then, we switch back to the indispensable phase to continue executing this schedule from the point where it was interrupted. Fig. 3 shows how the proposed technique operates when some faults occur while executing the application of Fig. 1. Note that, in this paper, whenever we say a fault occurs or a task becomes faulty, we mean that the task gives an incorrect result due to some errors (e.g., one or more transient faults). Assuming that the task $T_2$ becomes faulty, when comparing the results of $T_2$, they do not match, and hence the system temporarily switches to the on-demand phase. The result mismatch may happen due to a fault during the task execution or even due to a fault that corrupts the result comparison between the two phases.
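The shifting technique of Fig. 2a, which converts an ordinary schedule into a BP schedule, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the dictionary-based schedule representation is hypothetical, tasks on the same core are assumed not to overlap, and "$T_C$ and all its successors" is approximated by shifting every task that starts at or after $T_C$, which shifts at least the tasks the technique requires and possibly more.

```python
from typing import Dict

Task = Dict[str, int]  # keys: "core", "start", "finish"


def overlaps(a: Task, b: Task) -> bool:
    """True when two tasks on different cores share some time interval."""
    return (a["core"] != b["core"]
            and a["start"] < b["finish"] and b["start"] < a["finish"])


def block_partition(schedule: Dict[str, Task]) -> Dict[str, Task]:
    """Shift tasks right until each task overlaps at most one task per other core."""
    s = {name: dict(t) for name, t in schedule.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for a in sorted(s.values(), key=lambda t: t["start"]):
            for core in {t["core"] for t in s.values()} - {a["core"]}:
                hits = sorted((t for t in s.values()
                               if t["core"] == core and overlaps(a, t)),
                              key=lambda t: t["start"])
                if len(hits) > 1:
                    t_c = hits[1]                      # second overlapping task
                    sigma = a["finish"] - t_c["start"]
                    pivot = t_c["start"]
                    # Simplification (assumption): shift every task that
                    # starts at or after T_C, not only its true successors.
                    for t in s.values():
                        if t["start"] >= pivot:
                            t["start"] += sigma
                            t["finish"] += sigma
                    changed = True
                    break
            if changed:
                break  # restart the scan from the beginning of the schedule
    return s
```

For the situation of Fig. 2a with hypothetical times ($T_A$ on one core from 0 to 50, $T_B$ and $T_C$ on another core from 0 to 30 and from 30 to 60), the shift is $\sigma = 50 - 30 = 20$ and $T_C$ is moved to the interval from 50 to 80. The sketch does not include the final step of moving the tasks of each block to the end of the block.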
In the on-demand phase, as $T_2$ belongs to the block $B_2$ of the BP schedule of Fig. 1d, the block $B_2$ is executed (the highlighted tasks $T_2$ and $T_4$ in Fig. 3), and then a majority voting is done over the results of the three copies of $T_2$ to mask the fault (Fig. 3a). Here, $T_3$ is not executed in the block $B_2$ during the on-demand phase as it has already finished successfully before detecting the fault in $T_2$ and hence it is no longer required. The important point to be noted here is that when we execute $B_2$ during the on-demand phase we not only execute $T_2$ (whose result is required for majority voting as its execution in the indispensable phase has been faulty), but also execute $T_4$ in parallel with $T_2$, and its result is saved in memory so that it can be used later for possible majority voting. After executing the block $B_2$ the system switches back to the indispensable phase and continues executing the schedule from the point where it was interrupted. After switching back to the indispensable phase, two possible execution scenarios can be considered regarding the task $T_4$:
i) If a fault occurs during the execution of $T_4$ in the indispensable phase (Fig. 3b), when the result comparison indicates fault occurrence, the system does not need to switch to the on-demand phase as the results of three copies of $T_4$ are already available to be voted on (the results of two copies of $T_4$ are obtained in the indispensable phase and the result of another copy of $T_4$ already exists in the internal memory as it was executed in advance in the previous on-demand phase).
ii) If no fault occurs during the execution of $T_4$ in the indispensable phase (Fig. 3a), the results of the copy of $T_4$ that was executed in advance in the previous on-demand phase are no longer required and can be dropped from the memory.

One question that may arise here is what happens if the in-advance execution of $T_4$ becomes faulty. (Such a fault may occur during the in-advance execution of $T_4$ in the on-demand phase, or while saving the results of the in-advance execution of $T_4$ between the two phases, or even after the in-advance execution of $T_4$, in its stored results.) In this case, when the system executes $T_4$ in the indispensable phase, if no fault occurs (Fig. 3c), we will not use the results of the in-advance execution of $T_4$, and hence no problem occurs. However, if the execution of $T_4$ in the indispensable phase also becomes faulty (Fig. 3d), the system cannot mask this second fault as the stored values of the in-advance execution are also faulty. Indeed, a TMR system can mask only one faulty execution for each task (generally speaking, an NMR system can mask faulty executions for each task as long as fewer than half of the copies are faulty) [9], [10].

[Figure 3: panels (a) to (d) showing the indispensable-phase schedule interrupted by the on-demand block $B_2$. Legend: indispensable phase, on-demand phase, two identical results for $T_i$, result comparison/saving results, transient fault, result mismatch.]
Fig. 3. Operation of the proposed technique when faults occur during the execution of the application of Fig. 1.

Inputs:
  G: application task graph
  N: parameter N of NMR, e.g., 3 for TMR
Outputs:
  S_IND: schedule for the indispensable phase
  S_BP: BP schedule for the on-demand phase
1: S_IND = ListSchedulingLTF(G, <half-plus-one copies>); // Fig. 4b
2: S_TMP = ListSchedulingLTF(G, <remaining copies>); // Fig. 4b
3: S_BP = BlockPartitioning(S_TMP); // Fig. 4c
(a)

function ListSchedulingLTF(G, q)
// G: input task graph, q: number of copies for each task in the schedule, S: the output schedule
1: S = Null; // Initialize S with an empty schedule
2: while not all tasks in G are scheduled do
3:   T = the largest unscheduled task in G whose predecessors have all been scheduled;
4:   Add q parallel copies of T to S;
5: endwhile;
6: return S;
(b)

function BlockPartitioning(S)
// S: the input multi-core schedule
1: for each task T_A from the beginning of S do
2:   if T_A overlaps with more than one task, T_B and T_C (where T_C comes after T_B on the same core) then
3:     σ = (finish time of T_A) - (start time of T_C);
4:     shift T_C and all its successors in S to the right by σ;
5:   endif;
6: endfor;
7: for each block B in S do
8:   shift all tasks in B to the right and place them at the end of B;
9: endfor;
10: return S;
(c)
Fig. 4. The proposed scheduling technique.

The in-advance executions of tasks (e.g., $T_4$ in the block $B_2$ in Fig. 3) in the on-demand phase are useful because:
i) Because of the use of parallel execution in the on-demand phase, in-advance executions do not impose any time overhead. For example, it can be seen in Fig. 3 that when the system has to execute $T_2$ in the on-demand phase, the in-advance execution of $T_4$ is performed in parallel with it. Also, because of the use of LTF scheduling, tasks that come later in the schedule (e.g., $T_4$) can never be longer than the tasks that come earlier (e.g., $T_2$), which means that the in-advance execution of $T_4$ cannot lengthen the execution of the block $B_2$ in Fig. 3.
Indeed, if we did not use in-advance executions, we would not have any parallel execution during the on-demand phase, which implies that the use of in-advance executions helps us reserve relatively less slack time for the on-demand phase, resulting in more slack being available for energy management.
ii) Although in-advance executions of tasks in the on-demand phase may turn out to be useless when no fault occurs later during the execution of the task in the indispensable phase, they have a negligible impact on the average energy consumption. This is because an in-advance parallel execution is performed only when a fault occurs in the indispensable phase (for example, in Fig. 3 the in-advance execution of $T_4$ has been performed because a fault has occurred in $T_2$ during the indispensable phase). Note that while from a reliability point of view the consideration of faults is a must, from the average energy consumption point of view we do not need to consider the cases where the system tolerates a fault [6], [13]. As an example, consider $T_2$ and $T_4$ in Fig. 3. Suppose that the probability that a task execution becomes faulty is $10^{-4}$ and the energy consumptions of $T_2$ and $T_4$ are 10 mJ and 5 mJ. When no fault occurs, the system only executes $T_2$ in the indispensable phase and consumes $2 \times 10 = 20$ mJ. If a fault occurs during the execution of $T_2$ in the indispensable phase, the system will execute $T_2$ and $T_4$ in the on-demand phase and hence consumes $(10+5) = 15$ mJ more energy. Therefore, the average energy consumption for the execution of $T_2$ and $T_4$ is $(1 - 10^{-4}) \times 20 + 10^{-4} \times (20+15) = 20.0015$ mJ, which is very close to the energy consumption when no faults occur (20 mJ). This is also consistent with our experimental observations showing that the average energy consumption differs by less than 0.01% from the fault-free energy consumption. This example shows that the energy overhead of the in-advance executions is negligible from the viewpoint of average energy consumption.

Fig. 4 shows the pseudo-code of the proposed scheduling method used in our technique, which receives an application task graph (G) to make schedules for the indispensable and on-demand phases (i.e., $S_{IND}$ and $S_{BP}$ respectively). The pseudo-code of Fig. 4a is the main body of the scheduling technique that calls the functions presented in Figs. 4b and 4c. The function of Fig. 4b (ListSchedulingLTF(G, q)) implements the list scheduling algorithm with the LTF policy to make a schedule S containing q copies of each task from a task graph G. In this function, line 1 is for initialization. In line 2, we begin a while loop to apply the scheduling to all tasks. Line 3 implements LTF list scheduling, as it selects the largest unscheduled task T whose predecessors have all been scheduled. In line 4, q parallel copies of T are scheduled. Finally, line 6 returns the schedule S. As can be seen from Fig. 4a, this function is required for both the indispensable and on-demand phases. For the indispensable phase we need a schedule containing half-plus-one copies of each task, and for the on-demand phase we need a schedule containing the remaining copies of each task. We make these two schedules in lines 1 and 2 of Fig. 4a. In line 3 of Fig. 4a we use the function of Fig. 4c (BlockPartitioning(S)) to convert the schedule $S_{TMP}$ (the temporary schedule obtained from line 2 of Fig. 4a) to the BP schedule $S_{BP}$. The function of Fig. 4c receives a multi-core schedule S and starts from the beginning of the schedule (line 1). In line 2 we check if each task, say $T_A$, overlaps with more than one task on another core in the schedule S, say $T_B$ and $T_C$ (where $T_C$ comes after $T_B$ on the same core). If so, through lines 3 and 4 we shift the task $T_C$ (and all its successor tasks in S) to the right until there is no overlap between $T_A$ and $T_C$. In line 4, when we shift $T_C$ to the right, we need to shift all the tasks that come after $T_C$ on the same core and the tasks that are dependent on $T_C$ (successors of $T_C$ in the task graph) but are scheduled on the other cores. After removing possible overlaps in the schedule (i.e., partitioning the schedule into blocks), through lines 7 to 9 we shift all tasks in each block to the right to place them at the end of the block. We will discuss in Section 4 why this is effective for our proposed technique. Finally, in line 10 the schedule S (i.e., a BP schedule) is returned.

It is noteworthy that although the proposed NMR technique needs a minimum number of cores for the parallel execution of the copies of each task in the indispensable and on-demand phases, the technique can still be used (with a slight change) but with less parallelism if fewer cores are available. Indeed, the technique can even be used on a single core, where for each task the system at first executes half-plus-one copies of the task one after another (in series) in its indispensable phase and then compares their results. If some faults occur during the indispensable phase, the system executes the remaining copies of the task (again in series) in the on-demand phase, and finally all the results are voted on to mask the faults. It should be noted that this reduced parallelism obviously takes more time and hence may not be suitable for real-time systems with tight deadlines. When more cores are available, more parallelism can be achieved, which results in a shorter schedule length and provides higher schedulability [19]. This can also release some static slack time that can be used for energy management.

4 ENERGY MANAGEMENT

For the proposed NMR technique we have implemented a specific energy-management technique which comprises offline (Section 4.2) and online (Section 4.3) stages and exploits different types of slack time to reduce the system energy consumption through DVS [12]. Let $W_{IND}$ and $W_{BP}$ be the worst-case times it takes to execute the schedules $S_{IND}$ and $S_{BP}$ in the indispensable and on-demand phases respectively.
We need to reserve not only the time $W_{IND}$ for the indispensable phase but also the time $W_{BP}$ for the on-demand phase. Hence, the proposed technique is feasible when $W_{IND} + W_{BP} \le D$ ($D$ is the application deadline), and the static slack $SS$ which is left over from the application and can be used for energy management is:

$$SS = D - W_{IND} - W_{BP} \tag{1}$$

where $W_{IND} + W_{BP}$ is the application total execution time. As the amount of static slack is known at design time, offline techniques (e.g., the even slack distribution technique in [20]) can be used at design time to distribute this slack among the tasks. However, in the proposed technique, there are also two other types of slack time that are created at run-time, and hence, unlike the static slack, cannot be allocated at design time and have to be allocated at run-time. These two types of slack are:

Dynamic slack: This slack results at run-time when a task consumes less than its worst-case execution time due to early completion [6], [8], [11]. It should be noted that the actual execution time of a task is not known at design time, and hence the dynamic slack time which is obtained from the task is also not known at design time.

Pseudo-dynamic slack: Although we always reserve enough time to execute the BP schedule completely, we do not usually need to execute the tasks of this schedule at run-time. This is because when the copies of a task finish successfully during the indispensable phase, this task no longer requires the additional copies in the on-demand phase. Therefore, the task copies can be dropped from the BP schedule, thereby releasing some slack. We have called this slack pseudo-dynamic slack because, just like dynamic slacks, it is created at run-time, but unlike dynamic slacks, its amount can be calculated offline at design time. When a task $T_i$ executes successfully in the indispensable phase and we drop its copies from the schedule of the on-demand phase, the pseudo-dynamic slack time $\delta_i$ is released, which can be exploited by DVS to reduce the energy consumption of the subsequent tasks in the indispensable phase.
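The feasibility condition and the static slack of Eq. (1) can be checked with a few lines of code. This is only a sketch; the function name and the choice to raise an exception for an infeasible application are assumptions.

```python
def static_slack(deadline: float, w_ind: float, w_bp: float) -> float:
    """Static slack SS = D - W_IND - W_BP; raises if the deadline cannot be met."""
    total = w_ind + w_bp  # application total execution time
    if total > deadline:
        raise ValueError("infeasible: W_IND + W_BP exceeds the deadline D")
    return deadline - total
```

For example, with hypothetical values $D = 500$, $W_{IND} = 250$, and $W_{BP} = 150$, the static slack is 100 time units.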
As the schedule of the on-demand phase s avalable at desgn tme, the amount of ths reclamed slack can be calculated offlne at desgn tme. To do ths, at desgn tme, we consder droppng the tasks from the schedule of the on-demand phase one after another n the order n whch they appear n the schedule of the ndspensable phase and the tme whch s released due to droppng a task T s the pseudo-dynamc slack δ. Fg. 5 shows n more detal how we calculate the pseudo-dynamc slack δ whch s released after droppng T from the schedule of the on-demand phase. To calculate the pseudo-dynamc slack δ the followng three cases can be consdered: 1. Case I (Fg. 5a): If there s no task except T n the block, when T s dropped from the schedule the released slack δ wll be W +c, where W s the worstcase executon tme of T and c s the maxmum tme whch s requred for comparng the results (majorty votng) or savng results. 2. Case II (Fg. 5b): If T s the largest task n the block (.e., W max{w j} for all the remanng tasks n the block), after droppng T from the schedule the value of pseudo-dynamc slack δ s W -max{w j}. 3. Case III (Fg. 5c): If there exsts at least one task T j n the block larger than T, after removng T from the schedule no pseudo-dynamc slack wll be released. Consderng the three cases n Fg. 5, δ s calculated as:... δ T W +c (a) δ... T W T j W j (b) T j W j T W (c)... Fg. 5. Pseudo-dynamc slack (δ ) calculaton. B 1 B 2 B 3 T 5 δ 1 δ 2 δ 3 δ 4 δ 5 δ 6 (d)

$$\delta_i = \begin{cases} W_i + c_i & \text{when only } T_i \text{ exists in the block (Case I)} \\ W_i - \max\{W_j\} & \text{when } W_i \ge \max\{W_j\} \text{ (Case II)} \\ 0 & \text{when } W_i < W_j \text{ for at least one task } T_j \text{ (Case III)} \end{cases} \tag{2}$$

In the following we illustrate how pseudo-dynamic slack is calculated by means of an example. Fig. 5d shows the pseudo-dynamic slack $\delta_i$ which will be released after dropping each task $T_i$ from the BP schedule. The worst-case execution times of the example tasks are shown in Fig. 1a. The tasks are dropped from Fig. 5d in the order in which they are scheduled. In this example, without loss of generality, we assume that comparing results (majority voting) and saving results for all the tasks consume 5 time units (i.e., $c_i = 5$ for all the tasks). It can be seen from the schedule of Fig. 5d that, as the task $T_1$ is a single task in the block $B_1$ (Case I), if we drop $T_1$ from the schedule, the released slack will be $\delta_1 = W_1 + c_1 = 25$. After dropping $T_1$, if we drop $T_2$ from the schedule, as $T_2$ is the largest task in the block $B_2$ (Case II), the released slack will be $\delta_2 = W_2 - W_3 = 20$. After dropping $T_2$, as the task $T_3$ is the largest task in $B_2$, the released slack will be $\delta_3 = W_3 - W_4 = 10$ (Case II). After dropping $T_3$, the task $T_4$ will be a single task in the block $B_2$, and if we drop $T_4$ from the schedule the released slack will be $\delta_4 = W_4 + c_4 = 35$ (Case I). Similarly, we obtain $\delta_5 = 20$ and $\delta_6 = 25$.

Although the amount of the pseudo-dynamic slacks can be calculated at design time, it should be noted that this slack is not available (and hence cannot be allocated) from the beginning of the application execution and is created at run-time when a task finishes successfully in the indispensable phase. This is why we call it pseudo-dynamic slack.

It is noteworthy that the proposed scheduling technique (Section 3) helps to distribute pseudo-dynamic slacks evenly among the tasks. It is known that even slack distribution results in more energy saving as compared to uneven slack distribution [6], [20].
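The three-case rule for $\delta_i$ can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pseudo_dynamic_slacks` and the dictionary-based data layout are hypothetical, and the worst-case execution times used below ($W_1 = 20$, $W_2 = 60$, $W_3 = 40$, $W_4 = 30$) are inferred from the slack values quoted in the example with $c_i = 5$, since Fig. 1a is not reproduced here.

```python
from typing import Dict, List


def pseudo_dynamic_slacks(
    blocks: List[List[str]],
    wcet: Dict[str, float],
    vote: Dict[str, float],
    drop_order: List[str],
) -> Dict[str, float]:
    """Return delta_i for every task, dropping tasks one by one in drop_order."""
    # Work on a copy so the caller's block layout is left untouched.
    remaining = [list(block) for block in blocks]
    block_of = {task: b for b, block in enumerate(remaining) for task in block}
    deltas: Dict[str, float] = {}
    for task in drop_order:
        block = remaining[block_of[task]]
        others = [wcet[t] for t in block if t != task]
        if not others:
            # Case I: the task is alone, so its whole slot (W_i + c_i) is released.
            deltas[task] = wcet[task] + vote[task]
        elif wcet[task] >= max(others):
            # Case II: the block shrinks to the length of the next-largest task.
            deltas[task] = wcet[task] - max(others)
        else:
            # Case III: a larger task still determines the block length.
            deltas[task] = 0.0
        block.remove(task)
    return deltas


# Blocks B1 and B2 of the example (WCETs inferred, see the note above).
example_blocks = [["T1"], ["T2", "T3", "T4"]]
example_wcet = {"T1": 20, "T2": 60, "T3": 40, "T4": 30}
example_vote = {name: 5 for name in example_wcet}
example = pseudo_dynamic_slacks(
    example_blocks, example_wcet, example_vote, ["T1", "T2", "T3", "T4"]
)
print(example)
```

Under these assumed inputs the sketch yields 25, 20, 10 and 35 for $T_1$ to $T_4$, matching the values derived in the text. Because a task's slack depends on which tasks are still present in its block, the drop order is an explicit argument.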
Indeed, pseudo-dynamic slack is prone to be distributed unevenly among the tasks. This is because a pseudo-dynamic slack which is obtained from a task cannot be exploited by the same task or by its previous tasks; it can only be exploited by its subsequent tasks. Therefore, those tasks that appear later in the schedule have more chance to gain larger pseudo-dynamic slacks as compared to the tasks that come earlier. This implies that when pseudo-dynamic slack becomes available sooner rather than later, it can be distributed more evenly. To achieve this, we use two policies in our proposed scheduling technique (Section 3): i) we move the tasks in each block of the BP multi-core schedule to the end of the block, thereby enabling the slacks to appear sooner in the block (see Fig. 5b), and ii) we use the LTF policy.

To give an insight into how the LTF policy works, consider the following example. Suppose that three tasks $T_1$, $T_2$ and $T_3$ with worst-case execution times $W_1 = 6$, $W_2 = 3$ and $W_3 = 2$ appear in the LTF order in the indispensable phase. Assuming that these tasks are in one block of the on-demand phase, using (2), the pseudo-dynamic slacks obtained from these tasks will be $\delta_1 = W_1 - W_2 = 3$, $\delta_2 = W_2 - W_3 = 1$ and $\delta_3 = W_3 = 2$ (in this example we assume that $c_i = 0$). However, if the tasks appear in the order $T_3$, $T_1$ and $T_2$, which is not LTF, the pseudo-dynamic slacks will be $\delta_3 = 0$, $\delta_1 = 3$ and $\delta_2 = 3$. Therefore, in the LTF order pseudo-dynamic slacks are available sooner and hence can be distributed among the tasks more evenly.

As we explained earlier, dynamic slack may result at run-time due to early completion of tasks [6], [8], [11]. However, as the actual execution time of a task is not known at design time, the amount of dynamic slack is also not known at design time. Hence, we provide an online energy-management technique to exploit dynamic slacks at run-time (Section 4.3).
With respect to pseudo-dynamic slacks, since unlike dynamic slack the amount of pseudo-dynamic slack is known at design time, we have developed a specific offline technique to manage pseudo-dynamic slacks (Section 3.2).

4.1 Energy and Reliability Models

The power consumption of each task $T_i$ mainly comprises dynamic power $P_{dyn}(T_i)$ and static power $P_{stat}(T_i)$. The dynamic power is determined by [6]:

$$P_{dyn}(T_i) = C_{eff} V_i^2 f_i \tag{3}$$

where $C_{eff}$ is the effective switched capacitance, and $V_i$ and $f_i$ are, respectively, the supply voltage and the operational frequency during the execution of $T_i$ [6], [7]. The static power is mainly comprised of sub-threshold leakage power and can be written as:

$$P_{stat}(T_i) = I_{sub} V_i = I_0 e^{-V_{th}/(\eta V_T)} V_i \tag{4}$$

where $V_i$ is the supply voltage, $I_{sub}$ is the sub-threshold leakage current, $I_0$ depends on technology parameters and device geometries, $\eta$ is a technology parameter, $V_{th}$ is the transistor threshold voltage, and $V_T$ is the thermal voltage [6].

When DVS is used, each task $T_i$ is executed at a voltage $V_i$, which may be less than $V_{max}$ (the maximum possible supply voltage). For each task $T_i$, we define the normalized supply voltage $\rho_i$ as follows:

$$\rho_i = \frac{V_i}{V_{max}} \tag{5}$$

When a task $T_i$ is executed at the scaled voltage $V_i = \rho_i V_{max}$, considering an almost linear relationship between voltage and frequency [6], [7], we have $f_i = \rho_i f_{max}$, where $f_i$ is the operational frequency corresponding to $V_i$ and $f_{max}$ is the maximum possible operational frequency (corresponding to $V_{max}$). Therefore, when DVS is used, the actual execution time of the task is prolonged from $t_i$ to $t_i/\rho_i$, and by substituting $V_i = \rho_i V_{max}$ and $f_i = \rho_i f_{max}$ in (3) and (4), the total energy which is consumed to execute the task $T_i$ is given by [6]:

$$E(T_i) = \left( I_{sub}\,\rho_i V_{max} + C_{eff}\,\rho_i^2 V_{max}^2\,\rho_i f_{max} \right) \frac{t_i}{\rho_i} = \left( P_S + \rho_i^2 P_D \right) t_i \tag{6}$$

where $P_S = I_{sub} V_{max}$ and $P_D = C_{eff} V_{max}^2 f_{max}$ are respectively the static and dynamic powers when the system operates at the maximum voltage and frequency.

Without considering the energy consumption of the on-demand phase (which commonly has a very low

probability of being performed as faults rarely occur [6], [8]), we focus on the indispensable phase and aim at minimizing the fault-free energy consumption (like the works [5], [6], [8]). Using the energy model of (6), which gives the energy consumption of a single task, the energy which is consumed to execute a task $T_i$ in the indispensable phase (i.e., executing $\lceil N/2 \rceil$ copies of the task and comparing the results) can be written as:

$$E_{NMR}(T_i) = \lceil N/2 \rceil \left( P_S + \rho_i^2 P_D \right) (t_i + c_i) \tag{7}$$

where $c_i$ is the result comparison time. Based on (7), the energy consumption of the fault-free execution of an application with $n$ tasks using the proposed NMR technique can be calculated as:

$$E_{app} = \sum_{i=1}^{n} E_{NMR}(T_i) = \sum_{i=1}^{n} \lceil N/2 \rceil \left( P_S + \rho_i^2 P_D \right) (t_i + c_i) \tag{8}$$

As explained in Section 3, the fault-free energy consumption is very close to the average energy consumption. Therefore, we use (8) in our offline energy management at design time (Section 4.2) to minimize the fault-free energy consumption. Also, in our experiments in Section 5 we report the fault-free energy consumption.

Transient faults are usually assumed to follow a Poisson distribution with an average rate $\lambda$ [5], [6]. Considering the effects of DVS on transient fault rates, the fault rate at the scaled supply voltage $V_i = \rho_i V_{max}$ ($\rho_{min} \le \rho_i \le \rho_{max} = 1$) is modeled as [5], [6]:

$$\lambda(\rho_i) = \lambda_0 \, 10^{\frac{d(1-\rho_i)}{1-\rho_{min}}} \tag{9}$$

where $\lambda_0 = \lambda(\rho_{max} = 1)$ is the fault rate at the maximum voltage $V_{max}$, $\rho_{min}$ is the ratio of the minimum supply voltage $V_{min}$ to $V_{max}$, and the exponent value $d$ is a technology-dependent constant [5], [6]. Considering (9) (i.e., the effect of voltage scaling on the transient fault rate), the probability of a task $T_i$ being executed correctly is written as [5], [6]:

$$R_i(\rho_i) = e^{-\lambda(\rho_i)\, t_i/\rho_i} \tag{10}$$

where $\lambda(\rho_i)$ is given by (9) and $t_i/\rho_i$ is the execution time of $T_i$ when executed at $V_i = \rho_i V_{max}$.
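The energy and fault-rate models above translate directly into a few small functions. The following is a minimal sketch, assuming that the indispensable phase runs $\lceil N/2 \rceil$ copies (the "half-plus-one" copies of the abstract for odd $N$) and that times are in seconds and powers in watts. All function names are hypothetical, and the parameter values used in the checks are illustrative only, except $\lambda_0 = 10^{-6}$ faults/s and $d = 3$, which are the values quoted later in Section 5.

```python
import math
from typing import Sequence


def task_energy(rho: float, t: float, p_s: float, p_d: float) -> float:
    """Eq. (6): energy of one copy running for t (at V_max) scaled by rho."""
    return (p_s + rho ** 2 * p_d) * t


def nmr_task_energy(n: int, rho: float, t: float, c: float,
                    p_s: float, p_d: float) -> float:
    """Eq. (7): ceil(N/2) copies, each paying execution plus comparison time."""
    return math.ceil(n / 2) * task_energy(rho, t + c, p_s, p_d)


def app_energy(n: int, rhos: Sequence[float], times: Sequence[float],
               votes: Sequence[float], p_s: float, p_d: float) -> float:
    """Eq. (8): fault-free energy summed over all tasks of the application."""
    return sum(nmr_task_energy(n, r, t, c, p_s, p_d)
               for r, t, c in zip(rhos, times, votes))


def fault_rate(rho: float, lam0: float, d: float, rho_min: float) -> float:
    """Eq. (9): the fault rate grows exponentially as the voltage is scaled down."""
    return lam0 * 10 ** (d * (1 - rho) / (1 - rho_min))


def copy_reliability(rho: float, t: float, lam0: float, d: float,
                     rho_min: float) -> float:
    """Eq. (10): probability that a copy stretched to t/rho sees no fault."""
    return math.exp(-fault_rate(rho, lam0, d, rho_min) * t / rho)
```

With $\lambda_0 = 10^{-6}$ and $d = 3$, `fault_rate` spans $10^{-6}$ to $10^{-3}$ faults/s between $\rho = 1$ and $\rho = \rho_{min}$, which illustrates the trade-off that the reliability constraint has to control: lowering $\rho_i$ reduces the energy in (6) but increases both the fault rate and the exposure time $t_i/\rho_i$.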
Conversely, the probability of failure of the task $T_i$ (i.e., the unreliability of $T_i$) is given by [5], [6]:

$$F_i(\rho_i) = 1 - R_i(\rho_i) = 1 - e^{-\lambda(\rho_i)\, t_i/\rho_i} \tag{11}$$

To calculate the reliability of the proposed two-phase NMR technique we consider two cases: i) the fault-free execution, where all copies of each task are executed successfully in the indispensable phase, and ii) the case where some tasks in the indispensable phase become faulty and we perform the on-demand phase. In NMR, the correct execution of at least $\lceil N/2 \rceil$ copies of each task is required for the system to be functional. In the proposed NMR technique, all the correct executions may be performed in the indispensable phase (when no fault occurs), or some of them are performed in the on-demand phase (when a fault occurs). Therefore, the reliability of the proposed system can be calculated by considering the two cases. The first case gives the reliability of $T_i$ in the fault-free state, and the second case gives the reliability when some faults occur during the execution of $T_i$.

When no fault occurs, $\lceil N/2 \rceil$ copies of each task $T_i$ are executed in the indispensable phase and the on-demand phase is not required. When we use DVS in the indispensable phase, each task $T_i$ is executed at the scaled supply voltage $\rho_i V_{max}$. Therefore, using (10), the reliability of a task $T_i$ in the fault-free case can be calculated as:

$$R_1(T_i) = \left( R_i(\rho_i) \right)^{\lceil N/2 \rceil} \tag{12}$$

where $R_i(\rho_i)$ is given by (10). To calculate the reliability for the case that $k$ ($1 \le k \le \lfloor N/2 \rfloor$) copies of each task become faulty (in NMR up to $\lfloor N/2 \rfloor$ faulty executions can be masked [9], [10]), we consider all the cases in which $j$ ($1 \le j \le k$) copies from the $\lceil N/2 \rceil$ copies of $T_i$ in the indispensable phase and $k - j$ copies from the $\lfloor N/2 \rfloor$ copies of $T_i$ in the on-demand phase become faulty. In these cases, the other $\lceil N/2 \rceil - j$ copies in the indispensable phase and $\lfloor N/2 \rfloor - (k - j)$ copies in the on-demand phase are executed correctly.
Therefore, the probability of the correct execution of a task $T_i$ when up to $\lfloor N/2 \rfloor$ executions of $T_i$ become faulty can be calculated using (10) and (11) as:

$$R_2(T_i) = \sum_{k=1}^{\lfloor N/2 \rfloor} \sum_{j=1}^{k} \underbrace{\binom{\lceil N/2 \rceil}{j} F_i(\rho_i)^{j}\, R_i(\rho_i)^{\lceil N/2 \rceil - j}}_{\text{indispensable phase}} \cdot \underbrace{\binom{\lfloor N/2 \rfloor}{k-j} F_i(\rho_{max})^{k-j}\, R_i(\rho_{max})^{\lfloor N/2 \rfloor - (k-j)}}_{\text{on-demand phase}} \tag{13}$$

where $\rho_i$ determines the scaled voltage which is employed in the indispensable phase, and $\rho_{max} = 1$ is employed in the on-demand phase (as in the on-demand phase no DVS is used and tasks are executed at the maximum supply voltage $V_{max}$). Considering both the fault-free and faulty conditions, the reliability of a task $T_i$ in the presence of up to $\lfloor N/2 \rfloor$ faults, when executed by the proposed NMR technique, can be written as:

$$R(T_i) = R_1(T_i) + R_2(T_i) \tag{14}$$

The reliability of an application execution relies on the correct execution of all its tasks. Therefore, using (14), the reliability of an application with $n$ tasks run by the proposed NMR technique can be calculated as:

$$R_{app} = \prod_{i=1}^{n} R(T_i) \tag{15}$$

4.2 Offline Energy Management

As explained in the previous sections, in the proposed NMR technique, when no fault occurs, we do not execute the on-demand phase (which includes half of the copies for each task, i.e., $\lfloor N/2 \rfloor$), which results in considerable energy saving as compared with conventional NMR. In this section we discuss how the proposed NMR technique exploits static and pseudo-dynamic slack times to achieve even further energy reduction. For this purpose, we develop a specific technique to allocate static and pseudo-dynamic slack times to tasks offline at design time. When we allocate static and pseudo-dynamic slack times, we

assume that no dynamic slack exists, as the availability and the amount of dynamic slack time are not known at design time. Indeed, we first minimize the expected energy consumption of the system by the offline allocation of static and pseudo-dynamic slacks, assuming that no dynamic slack exists. At run-time, however, we also exploit dynamic slacks through our online energy management for further energy saving (Section 4.3).

To develop the offline slack allocation, we formulate the problem as an optimization problem. To do this, we formulate the time constraints as inequalities. For the first task $T_1$, as the task is executed at the scaled supply voltage $\rho_1 V_{max}$, its worst-case execution time increases from $W_1 + c_1$ to $(W_1 + c_1)/\rho_1$. Considering that the only slack time which is available to $T_1$ is the static slack time ($SS$ given by (1)) and no pseudo-dynamic slack is available to it (as pseudo-dynamic slack is obtained only from the previous tasks and $T_1$ has no previous task), $T_1$ cannot exploit more than the static slack $SS$. So we have:

$$\frac{W_1 + c_1}{\rho_1} - (W_1 + c_1) \le SS \tag{16}$$

It should be noted that although the whole of the static slack $SS$ is available to the first task $T_1$, this does not necessarily mean that it exploits all its available slack time. Indeed, each task can exploit only a part of its available slack and set aside the remainder for the subsequent tasks. During the indispensable phase, when a task $T_i$ finishes successfully, the pseudo-dynamic slack $\delta_i$ which is obtained by dropping the task $T_i$ from the on-demand phase is available to its subsequent tasks in the indispensable phase. Consequently, the task $T_2$ can exploit both the part of the static slack $SS$ left over by $T_1$ and the pseudo-dynamic slack $\delta_1$ which is obtained by dropping $T_1$ from the on-demand phase.
Hence, for the task $T_2$, we have:

$$\frac{W_2 + c_2}{\rho_2} - (W_2 + c_2) \le \underbrace{\delta_1}_{\text{obtained from } T_1} + \underbrace{SS - \left( \frac{W_1 + c_1}{\rho_1} - (W_1 + c_1) \right)}_{\text{static slack left over from } T_1} \tag{17}$$

Similarly, for each task $T_i$ ($1 \le i \le n$) we have:

$$\frac{W_i + c_i}{\rho_i} - (W_i + c_i) \le \underbrace{\sum_{T_j \in \Phi_i} \delta_j}_{\text{obtained from the previous tasks}} + \underbrace{SS - \sum_{T_j \in \Phi_i} \left( \frac{W_j + c_j}{\rho_j} - (W_j + c_j) \right)}_{\text{static slack left over from the previous tasks}} \tag{18}$$

where $\Phi_i$ is the set of all tasks that have been executed before starting the task $T_i$. The optimization problem of the offline part of energy management can be written as:

$$\begin{aligned} \text{minimize: } \quad & E_{app} = \sum_{i=1}^{n} E_{NMR}(T_i) \\ \text{subject to: } \quad & c1:\ \frac{W_i + c_i}{\rho_i} - (W_i + c_i) \le \sum_{T_j \in \Phi_i} \delta_j + SS - \sum_{T_j \in \Phi_i} \left( \frac{W_j + c_j}{\rho_j} - (W_j + c_j) \right) \quad \text{for all } T_i\ (1 \le i \le n) \\ & c2:\ R_{app} \ge R_{demand} \end{aligned} \tag{19}$$

where $E_{app}$ is the energy consumption of an application executed using the proposed NMR technique (given by (8)), the constraint c1 (Inequality (18)) captures the time constraints, i.e., how much slack is available to each task (including pseudo-dynamic and static slack), and the constraint c2 guarantees that the system reliability does not fall below a required level $R_{demand}$. The parameters, namely the tasks' worst-case execution times ($W_i$), result comparison times ($c_i$), static slack time ($SS$) and pseudo-dynamic slacks ($\delta_i$), are all known at design time. This implies that this optimization problem can be solved offline at design time to determine the $\rho_i$ values which minimize the system energy consumption.

It should be noted, however, that we cannot assign the obtained $\rho_i$ values to the tasks at design time. Rather, we store the $\rho_i$ values, and during the indispensable phase we assign the supply voltage $\rho_i V_{max}$ to the task $T_i$ whenever all its previous tasks finish successfully. In other words, the $\rho_i$ values that we calculate using the proposed offline technique are only valid for the fault-free execution. If some faults occur during the indispensable phase, the $\rho_i$ values will no longer be valid. This is because when a fault occurs in a task $T_i$ during the indispensable phase, the system cannot drop it from the schedule of the on-demand phase, which means that the pseudo-dynamic slack $\delta_i$ will no longer be available.
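To make the structure of problem (19) concrete, the sketch below solves a tiny instance by exhaustive search over a discrete set of $\rho$ levels. It is only an illustration under several assumptions: the source does not state which solver is used, the exhaustive search is exponential in the number of tasks and is chosen here purely for clarity, the worst-case time $W_i$ stands in for $t_i$ (no dynamic slack at design time), tasks are listed in execution order so that $\Phi_i$ is simply the set of earlier tasks, and the reliability follows the reading of (12) to (15) with $\lceil N/2 \rceil$ indispensable and $\lfloor N/2 \rfloor$ on-demand copies. All names and numeric values are hypothetical.

```python
import itertools
import math
from typing import Optional, Sequence, Tuple


def fault_rate(rho: float, lam0: float, d: float, rho_min: float) -> float:
    """Eq. (9)."""
    return lam0 * 10 ** (d * (1 - rho) / (1 - rho_min))


def copy_reliability(rho: float, t: float, lam0: float, d: float, rho_min: float) -> float:
    """Eq. (10)."""
    return math.exp(-fault_rate(rho, lam0, d, rho_min) * t / rho)


def task_reliability(n: int, rho: float, t: float, lam0: float, d: float,
                     rho_min: float) -> float:
    """Eqs. (12)-(14): fault-free case plus all maskable fault patterns."""
    ind, od = math.ceil(n / 2), n // 2
    r_ind = copy_reliability(rho, t, lam0, d, rho_min)
    r_od = copy_reliability(1.0, t, lam0, d, rho_min)  # on-demand runs at V_max
    f_ind, f_od = 1 - r_ind, 1 - r_od
    r1 = r_ind ** ind
    r2 = 0.0
    for k in range(1, od + 1):
        for j in range(1, k + 1):
            p_ind = math.comb(ind, j) * f_ind ** j * r_ind ** (ind - j)
            p_od = math.comb(od, k - j) * f_od ** (k - j) * r_od ** (od - (k - j))
            r2 += p_ind * p_od
    return r1 + r2


def offline_rho(wcet: Sequence[float], vote: Sequence[float], delta: Sequence[float],
                ss: float, levels: Sequence[float], n: int, p_s: float, p_d: float,
                lam0: float, d: float, rho_min: float,
                r_demand: float) -> Optional[Tuple[float, Tuple[float, ...]]]:
    """Return (energy, rho assignment) minimizing Eq. (8) under c1 and c2, or None."""
    best = None
    for rhos in itertools.product(levels, repeat=len(wcet)):
        slack, feasible = ss, True  # slack available to the current task
        for w, c, dl, rho in zip(wcet, vote, delta, rhos):
            used = (w + c) / rho - (w + c)
            if used > slack + 1e-9:  # constraint c1, Inequality (18)
                feasible = False
                break
            slack += dl - used  # leftover slack plus delta released by this task
        if not feasible:
            continue
        r_app = math.prod(task_reliability(n, rho, w, lam0, d, rho_min)
                          for w, rho in zip(wcet, rhos))
        if r_app < r_demand:  # constraint c2
            continue
        energy = sum(math.ceil(n / 2) * (p_s + rho ** 2 * p_d) * (w + c)
                     for w, c, rho in zip(wcet, vote, rhos))
        if best is None or energy < best[0]:
            best = (energy, rhos)
    return best


# Two-task TMR instance with no static slack: only T2 can be slowed down,
# using the pseudo-dynamic slack released by T1.
relaxed = offline_rho([10, 10], [0, 0], [5, 5], 0.0, [0.5, 0.8, 1.0],
                      3, 0.2, 0.8, 1e-6, 3, 0.5, r_demand=0.0)
strict = offline_rho([10, 10], [0, 0], [5, 5], 0.0, [0.5, 0.8, 1.0],
                     3, 0.2, 0.8, 1e-6, 3, 0.5, r_demand=1 - 1e-9)
```

In this toy instance the relaxed run slows only the second task, while the strict reliability demand forces both tasks back to $\rho = 1$, which shows how c1 and c2 pull in different directions. For $N = 3$ and $\rho = 1$, `task_reliability` reduces to the familiar TMR expression $3R^2 - 2R^3$, a useful sanity check on the reading of (13).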
One possible solution to the invalidation of the offline $\rho_i$ values after a fault is to calculate $\rho_i$ values offline for all possible fault scenarios and, at run-time, to select the proper set of $\rho_i$ values based on how the faults occur. However, we do not use this method, as the fault-free state is the most probable state and hence is the most prominent state from the viewpoint of average energy consumption [6], [8]. Therefore, in the proposed technique we use the $\rho_i$ values that are calculated for the fault-free case. If a fault occurs at run-time, we temporarily stop using the $\rho_i$ values that are calculated offline (as they are no longer valid) and from then on we only use the proposed online management technique (Section 4.3) to allocate pseudo-dynamic slacks. From the beginning of the next frame we again use the $\rho_i$ values that are calculated offline.

4.3 Online Energy Management

Let $x_i$ be the slack (including the pseudo-dynamic and static slacks) that is allocated to a task $T_i$ at design time using the offline part of our energy management (Section 4.2). When DVS is used, the task worst-case execution time increases from $W_i + c_i$ to $(W_i + c_i)/\rho_i$. On the other hand, as we exploit the slack $x_i$ by DVS, we can also say that the task worst-case execution time increases from $W_i + c_i$ to $W_i + c_i + x_i$. This implies that we have:

$$W_i + c_i + x_i = \frac{W_i + c_i}{\rho_i} \tag{20}$$

Indeed, after calculating the $\rho_i$ values by solving the offline optimization problem at design time, we use (20) to obtain the slack $x_i$ (including pseudo-dynamic and static slacks) that we allocate to a task $T_i$. At run-time, for each task $T_i$, the total slack time $SL_i$ which is available to the task can be written as:

$$SL_i = x_i + DS_j \tag{21}$$

where $x_i$ is the slack time which has been calculated offline in Section 4.2 (including both pseudo-dynamic and static slacks), and $DS_j$ is the dynamic slack which has

been left over by the previous task (the task $T_j$) in the indispensable phase due to early completion at run-time. Since $SL_i$ is the whole slack time which is available to the task $T_i$, the scaled supply voltage $\rho_i V_{max}$ which is assigned to the task must not prolong its worst-case execution time beyond the time $W_i + c_i + SL_i$, i.e., we require:

$$\frac{W_i + c_i}{\rho_i} \le W_i + c_i + SL_i \tag{22}$$

Clearly, the proposed online energy management must take into account the time constraint given by (22). Another important constraint that must be taken into account is for guaranteeing reliability. Let $\rho_i^{rel}$ be the minimum value of $\rho_i$ that does not cause the system reliability to fall below the required level. Clearly we require:

$$\rho_i \ge \rho_i^{rel} \tag{23}$$

In the proposed online energy manager, as DVS-enabled processors usually have discrete voltage/frequency levels (Section 5), we always select the smallest value of $\rho_i$ among the set of possible $\rho_i$ values that satisfies both Inequalities (22) and (23). In order to be able to check Inequalities (22) and (23) at run-time, we need to have the $SL_i$ and $\rho_i^{rel}$ values at run-time. To calculate the slack time $SL_i$ (given by (21)) at run-time, note that the $x_i$ values have been calculated offline and stored to be used at run-time. Also, the dynamic slack time $DS_i$ which is obtained from the task $T_i$ can be easily calculated at run-time as follows. When DVS is used for the first task ($T_1$), the actual execution time of the task is $(t_1 + c_1)/\rho_1$. Since all the slack time which is available to $T_1$ is $x_1$, the maximum time which is available for executing $T_1$ is $W_1 + c_1 + x_1$; therefore the dynamic slack which is obtained from $T_1$ is:

$$DS_1 = (W_1 + c_1 + x_1) - \frac{t_1 + c_1}{\rho_1} \tag{24}$$

For the remaining tasks ($T_i$, $2 \le i \le n$), the maximum available time is $W_i + c_i + x_i + DS_j$ (where $DS_j$ is the dynamic slack which has been left over by the task $T_j$, which is the task that finished just before starting the task $T_i$). Therefore, we can write:

$$DS_i = (W_i + c_i + x_i + DS_j) - \frac{t_i + c_i}{\rho_i} \tag{25}$$

At the end of each task $T_i$, we can use (25) (except for the first task, for which we use (24)) to calculate $DS_i$ at run-time.
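The run-time procedure described by (21) to (25) can be sketched as a short loop. This is a minimal illustration, assuming a single chain of tasks executed one after another, a level set that contains $\rho = 1$, and a per-task lower bound `rho_rel` standing for the minimum reliable $\rho_i$ of (23). The function names, the tuple layout, and the numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_rho(levels: Sequence[float], w: float, c: float, sl: float,
               rho_rel: float) -> float:
    """Smallest discrete rho satisfying Inequalities (22) and (23)."""
    candidates = [r for r in levels
                  if r >= rho_rel and (w + c) / r <= w + c + sl + 1e-9]
    if not candidates:
        raise ValueError("no voltage level satisfies both constraints")
    return min(candidates)


def dynamic_slack(w: float, c: float, x: float, ds_prev: float,
                  t: float, rho: float) -> float:
    """Eq. (25); with ds_prev = 0 this is Eq. (24) for the first task."""
    return (w + c + x + ds_prev) - (t + c) / rho


def run_frame(tasks: Sequence[Tuple[float, float, float, float, float]],
              levels: Sequence[float]) -> List[Tuple[float, float]]:
    """Each task is (W, c, x, rho_rel, t); returns (rho, DS) per task."""
    ds = 0.0  # no dynamic slack exists before the first task
    trace = []
    for w, c, x, rho_rel, t in tasks:
        sl = x + ds                                 # Eq. (21)
        rho = select_rho(levels, w, c, sl, rho_rel)
        ds = dynamic_slack(w, c, x, ds, t, rho)     # measured after the task ends
        trace.append((rho, ds))
    return trace


# The first task finishes early (t = 4 instead of W = 10), so the second
# task inherits 6 time units of dynamic slack and can run at rho = 0.75.
trace = run_frame([(10, 0, 0, 0.5, 4), (10, 0, 0, 0.5, 10)], [0.5, 0.75, 1.0])
```

In a real system `t` would not be an input: the term $(t_i + c_i)/\rho_i$ would be measured from the system clock when the task finishes. The sketch passes it in only so that the bookkeeping can be followed end to end.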
It can be seen from (25) that to calculate the dynamic slack $DS_i$, we need to know $W_i + c_i + x_i$, $DS_j$, and $(t_i + c_i)/\rho_i$. The parameter $W_i + c_i + x_i$ is known at design time, and hence it can be calculated offline and stored to be used at run-time. $DS_j$ is the dynamic slack obtained from the task $T_j$ (which is the task that finished just before starting the task $T_i$), and has already been calculated at the end of $T_j$. $(t_i + c_i)/\rho_i$ is the actual execution time of the task $T_i$ (including the result comparison time), and when the task finishes, its execution time can be easily calculated using the internal system clock (as this execution time is the difference between the start time and finish time of the task). In short, at the end of each task $T_i$, the dynamic slack time $DS_i$ can be calculated at run-time with very low overhead, as its online calculation only requires a few subtraction and addition operations.

The minimum possible value of $\rho_i$ that does not cause the system reliability to fall below the required level (denoted $\rho_i^{rel}$) can also be calculated offline at design time. To do this we can solve the optimization problem of (19), but without considering the constraint c1. This is because the constraint c1 is used to capture the time constraints, whereas to calculate the $\rho_i^{rel}$ values we want to know which values of $\rho_i$ can guarantee the required level of reliability regardless of the time constraints.

5 EVALUATION AND DISCUSSIONS

The experiments in this paper were conducted based on the power model of the Intel PXA270 processor [21]. This processor can operate at different voltage levels, and the corresponding frequencies vary from 13 MHz to 624 MHz. The energy consumption for active cores is calculated by (8), where $P_S$ and $P_D$ (which are respectively the static and dynamic power consumption of the system when operating at the maximum voltage and frequency) are 925 mW and 260 mW respectively [21]. Also, the Intel PXA270 processor has a low-power sleep mode with an idle power consumption in the mW range.
We considered that when a core is disabled or is temporarily unused, it enters the sleep mode and only consumes the idle power. We modified the tool MEET [22] to profile execution time and energy consumption while using DVS based on the power model of the Intel PXA270. Like the works [5], [6], [13], [16], we performed system-level reliability simulation where the reliability was calculated by (15) and expressed in terms of the application probability of failure $PoF_{app}$ (i.e., $PoF_{app} = 1 - R_{app}$). The fault rate was modelled using (9) with the parameters $\lambda_0 = 10^{-6}$ faults/s and $d = 3$ [5], [6]. Therefore, the fault rate varies between $10^{-6}$ faults/s and $10^{-3}$ faults/s, corresponding to the maximum and minimum voltage levels.

Previous research works on reliable real-time systems that do not use NMR rely on fault-detection mechanisms [5], [6], [7], [8], [11]. However, they have usually overlooked the overhead and fault coverage of detection mechanisms. Indeed, they usually do not consider any specific detection mechanism and simply assume that a detection mechanism with perfect fault coverage is part of the tasks (e.g., [5], [6], [7]). To provide fair comparisons, we need to include a real fault-detection mechanism in any implementation of previous works which is used in our comparisons. To do this, we considered that the previous works use fault-detection mechanisms included in their tasks (i.e., software fault-detection mechanisms). We conducted a set of experiments to investigate the energy and execution time overheads of the software fault-detection mechanisms that can be used for previous works. To consider the effect of fault-detection mechanisms on energy and reliability, we used two types of software fault-detection mechanisms in the implementations of previous works that were used in our comparisons:

1. Heavy fault-detection mechanisms (called HFD): with high fault-detection overheads but relatively high fault coverage. For this case we assumed that the system uses multiple fault-detection mechanisms based on code and data redundancy, arithmetic

code, consistency check, and control flow checking [9], [10], [23], [24], [25], [26] to achieve high fault coverage for different fault types.
2. Light fault-detection mechanisms (called LFD): with relatively low fault-detection overheads and also low fault coverage. For this case we assumed that the system uses fewer mechanisms to reduce the fault-detection overhead at the cost of decreased detection coverage [26].

TABLE 1. TIME (T) AND ENERGY (E) OVERHEADS OF HEAVY FAULT-DETECTION (HFD) AND LIGHT FAULT-DETECTION (LFD). Columns: Benchmark; T (ms) and E (mJ) with no fault detection, with HFD, and with LFD; overhead (%) in T and E for HFD and for LFD. Rows: QuickSort, BitCounts, BasicMath, SusanSmooth, SusanCorners, SusanEdges.

Table 1 shows the time and energy overheads that the software fault-detection mechanisms impose (assuming that we use the supply voltage 1.55 V). To measure the overheads, the applications were selected from the MiBench [27] benchmarks. It should be noted that while both the time and energy overheads of software fault-detection mechanisms are lower than the overhead of modular redundancy with result comparison (majority voting), the fault coverage of software mechanisms is not sufficiently high, unlike majority voting, which provides high fault masking [9], [10], [23], [24], [26]. Furthermore, these software fault-detection mechanisms are application-specific, so that each task requires its specific detection mechanism [9], [10], [25], [26], while result comparison and majority voting are general and can be used for any type of task without requiring any hardware modification or redesign [9], [10], [25].

To evaluate the effectiveness of the proposed NMR technique (which we call LE-NMR), we compared LE-NMR with a recent work (proposed in [5]).
To provide a fair comparison, for both the implementations of LE-NMR and the system of [5], we assumed that both use the same level of task replication, i.e., when we consider an NMR with N copies for each task, we also consider that the system of [5] has N-1 backups for each task (i.e., again N copies for each task) to achieve fault tolerance. In addition, the system of [5] requires a fault-detection mechanism to determine whether a backup task must be executed or not. Like most of the previous works, [5] has not addressed any fault-detection mechanism, but we considered that the tasks that are scheduled in the system of [5] use task-specific software mechanisms for fault detection. To do this, we considered implementations of [5] where the tasks included heavy fault-detection mechanisms (called [5]-HFD) and light fault-detection mechanisms (called [5]-LFD). We also considered in our experiments an implementation of conventional NMR, called CNMR, where we do not use the two phases, indispensable and on-demand. In conventional NMR, all N copies of each task are executed in parallel (assuming that enough cores are available) and only the static slack time is used for energy reduction.

It should be noted that there are various techniques to achieve low-energy fault tolerance in real-time systems (e.g., [6], [7], [8], [11], [13], [16], [17], [18]) and it is beyond the scope of this paper to compare the proposed technique with all of them. The main reason to choose the technique of [5] for the comparison is that it is a recent work with similar conditions to the proposed technique, e.g., hard real-time constraints, the use of DVS, and the frame-based application model with task precedence constraints (a set of dependent tasks with a global deadline) running on multi-core platforms.
Also, it is noteworthy that for many of the previous works it is not meaningful to compare them with the proposed technique because they considerably differ from ours in the application model (e.g., a chain of dependent tasks in [6], a single-task frame in [13], periodic tasks in [11], [17], and independent tasks in [18]).

To compare LE-NMR with [5] and conventional NMR, we used both synthetic and practical application task graphs. To do this, we used the task graph generator TGFF [28] and the Standard Task Graph set (STG) [29]. The STG benchmark suite contains both synthetic task graphs and practical real-time application task graphs, including robot control, SPEC fpppp and a sparse matrix solver. We also conducted experiments on two other real-world applications: an MPEG4 decoder and an MJPEG encoder (their task graphs can be found in [15]). Fig. 6 and Table 2 show, respectively, the energy consumption and probability of failure for [5]-HFD, [5]-LFD, CTMR, and the proposed LE-TMR when running the practical applications. The following three observations can be made from Fig. 6 and Table 2:

1. LE-TMR not only provides more energy saving (on average 28% and up to 33%) as compared to [5]-HFD, but also has a lower probability of failure, i.e., LE-TMR is more reliable.
2. Although LE-TMR provides relatively less energy saving (on average 12%) as compared to [5]-LFD, LE-TMR has a far lower probability of failure (it provides much higher reliability).
3. LE-TMR provides more energy saving (on average 34%) as compared to CTMR, while providing almost the same level of reliability (Table 2).

Fig. 6. Energy consumption (mJ) of LE-TMR, [5]-HFD, [5]-LFD, and CTMR when running the practical applications (Robot, Sparse, fpppp, MPEG4, MJPEG).

TABLE 2. PROBABILITY OF FAILURE (POF) FOR LE-TMR, [5]-HFD, [5]-LFD, AND CTMR WHEN RUNNING THE PRACTICAL APPLICATIONS. Columns: Application; [5]-HFD; [5]-LFD; CTMR; LE-TMR. Rows: Robot, Sparse, fpppp, MPEG4, MJPEG.

Another set of experiments was conducted in order to analyze how the parallelism degree of task graphs affects the effectiveness of our technique. To do this, synthetic task graphs were generated. It is known that for task graphs with the same number of tasks, the height of the task graph can be used to take the parallelism degree into account [39]. Based on this, three classes of task graphs with different parallelism degrees were considered in the experiment. Let $n$ be the number of nodes (tasks) in a task graph and $h$ be the task graph height. Clearly $h$ can vary between 1 and $n$; therefore the three classes of considered task graphs are: i) task graphs with $1 \le h \le n/3$ (called task graphs with high parallelism degree), ii) task graphs with $n/3 < h \le 2n/3$ (called task graphs with medium parallelism degree), and iii) task graphs with $2n/3 < h \le n$ (called task graphs with low parallelism degree).

The tasks of the synthetic task graphs were randomly selected from the MiBench benchmarks, and the time and energy overheads of the detection mechanisms for these tasks were taken from Table 1. The worst-case and actual execution times ($W_i$ and $t_i$) of the tasks were generated randomly [4], [5], [6]. The worst-case execution times were uniformly distributed between 10 ms and 100 ms. However, as the actual execution times for each task may have different probability distributions, like the works [4], [5], [6], in our experiments we considered the uniform, normal, or exponential distributions for the actual execution time $t_i$, and each task $T_i$ was executed only for the duration of $t_i$. In the experiment, it was assumed that task graphs with 20, 50, 100, 200 and 500 tasks with different parallelism degrees were executed on multi-core systems with 2, 4, 8, 16 and 32 cores.
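The height-based grouping of task graphs into three parallelism classes can be expressed as a small helper. This is a sketch only: the function name is hypothetical, and the handling of the boundary values $h = n/3$ and $h = 2n/3$ (assigned here to the more parallel class) is an assumption, since the source does not make the boundary treatment explicit.

```python
def parallelism_class(n: int, h: int) -> str:
    """Classify a task graph with n tasks and height h by parallelism degree."""
    if not 1 <= h <= n:
        raise ValueError("height must satisfy 1 <= h <= n")
    # Integer comparisons (3h vs n) avoid floating-point issues at the boundaries.
    if 3 * h <= n:
        return "high"
    if 3 * h <= 2 * n:
        return "medium"
    return "low"


# A chain of n tasks has height n (no parallelism); n independent tasks have height 1.
labels = [parallelism_class(30, h) for h in (1, 10, 11, 20, 21, 30)]
```

A small height means that many tasks sit on the same level of the graph and can run concurrently, which is why the shallowest graphs are the ones labelled as having high parallelism.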
Each case (e.g., a task graph with 50 tasks on an 8-core system) was simulated 1500 times with different parameters (i.e., the tasks' worst-case and actual execution times and the application deadline) and the average results are reported in Figs. 7 and 8. These figures show the energy consumption and probability of failure (PoF) for LE-TMR, [5]-HFD, [5]-LFD, and CTMR. The following observations can be made from Figs. 7 and 8:

1. It can be seen from Fig. 7 that, for all four systems, as the parallelism degree of task graphs increases, the energy consumption decreases. However, the energy consumption of LE-TMR is always less than that of the other three systems.
2. While the energy consumption of all four systems decreases with the increase in the task graph parallelism degree, LE-TMR achieves a larger energy reduction than the others. For example, assuming we have 16 cores, as the task graph parallelism degree increases from low (Fig. 7a) to high (Fig. 7c), the energy consumption of LE-TMR reduces from 1698 mJ to 1231 mJ (28% reduction), while the energy consumption of CTMR reduces from 2132 mJ to 1944 mJ (9% reduction).
3. As Fig. 8 shows, LE-TMR has a far lower probability of failure than the implementations of [5], even compared to the implementation of [5] that uses heavy fault-detection mechanisms ([5]-HFD). This is because of the superiority of majority voting (NMR) in covering the faults as compared to fault-detection mechanisms [9], [10], [24], [25], [26].
4. While LE-TMR provides almost the same reliability as CTMR (Fig. 8), LE-TMR consumes much less energy than CTMR (Fig. 7), mainly because of the more sophisticated energy-management technique that LE-TMR uses.

We also compared LE-NMR with N=5 and N=7 (i.e., LE-5MR and LE-7MR respectively) with [5] and the conventional NMR.
The experiments demonstrate that LE-NMR completely outperforms [5] from both the energy-consumption and reliability viewpoints. LE-5MR and LE-7MR provide on average, respectively, 19% (up to 22%), and 17% (up to 21%), and 31% (up to 36%) energy saving as compared to the corresponding implementations of [5] and the conventional NMR. An interesting observation from the experiments is that none of the implementations of [5] can achieve high reliability (the implementations of [5] cannot achieve a probability of failure less than $10^{-3}$), while LE-NMR satisfies the required reliability level of safety-critical applications, which may require the probability of failure to be less than $10^{-9}$ [6], [9], [10]. This is because the implementations of [5] use software fault-detection mechanisms, whose fault coverage is not sufficiently high [9], [10], [25], [26], unlike LE-NMR, which uses majority voting that provides high fault masking [9], [10], [24], [25], [26].

Fig. 7. Energy consumption (mJ) of LE-TMR, [5]-HFD, [5]-LFD, and CTMR versus number of cores when running the synthetic applications: (a) low parallelism degree, (b) medium parallelism degree, (c) high parallelism degree.

Fig. 8. Probability of failure (PoF), in log scale, for LE-TMR, [5]-HFD, [5]-LFD, and CTMR versus number of cores when running the synthetic applications: (a) low parallelism degree, (b) medium parallelism degree, (c) high parallelism degree.

6 CONCLUSION

In this paper, we described how multi-core platforms can be exploited to achieve high reliability with low energy overhead for hard real-time systems. To do this, we proposed a low-energy NMR technique (called LE-NMR). To achieve energy saving in LE-NMR we exploit two main strategies. First, we adopt a two-phase NMR technique where usually (when no fault occurs) only one phase is executed, resulting in a considerable energy saving compared with conventional NMR systems. Second, to achieve further energy saving, we use DVS. In developing the proposed LE-NMR technique, we have considered two new concepts: i) block-partitioned scheduling and ii) pseudo-dynamic slack management. To exploit the available slacks in the system by DVS, we have developed an energy-management technique with offline and online parts. The offline part derives and solves an optimization problem at design time to exploit the slacks that are known at design time (i.e., static and pseudo-dynamic slacks), and the online part is used to assign dynamic slacks to the tasks at run-time.
The experimental results show that LE-NMR provides up to 34% energy saving and is 6 orders of magnitude more reliable as compared to an implementation of a recent previous work.

ACKNOWLEDGMENT

Mohammad Salehi and Alireza Ejlali acknowledge the Research Vice-Presidency of Sharif University of Technology for funding this work under grant no. G . Bashir M. Al-Hashimi acknowledges the EPSRC (UK) for funding this work in part under grant PRiME EP/K034448/1. Experimental data used in this paper can be found at DOI: /SOTON/ (

REFERENCES

[1] J. Henkel, V. Narayanan, S. Parameswaran, and J. Teich, "Run-Time Adaption for Highly-Complex Multi-Core Systems," Proc. Ninth IEEE/ACM/IFIP Int'l Conf. Hardware/Software Codesign and System Synthesis (CODES+ISSS'13), pp. 1-8, Sept Oct , doi: /CODES-ISSS
[2] W.Y. Lee, "Energy-Efficient Scheduling of Periodic Real-Time Tasks on Lightly Loaded Multicore Processors," IEEE Trans. Parall. Distr. Syst., vol. 23, no. 3, pp , March 2012, doi: /TPDS
[3] A. Munir, S. Ranka, and A. Gordon-Ross, "High-Performance Energy-Efficient Multicore Embedded Computing," IEEE Trans. Parall. Distr. Syst., vol. 23, no. 4, pp , April 2012, doi: /TPDS
[4] H. Su, D. Zhu, and D. Mosse, "Scheduling Algorithms for Elastic Mixed-Criticality Tasks in Multicore Systems," Proc. IEEE 19th Int'l Conf. Embed. Real-Time Computing Syst. and Applications (RTCSA'13), pp , Aug. 2013, doi: /RTCSA
[5] Y. Guo, D. Zhu, and H. Aydin, "Reliability-Aware Power Management for Parallel Real-Time Applications with Precedence Constraints," Proc. Int'l Green Computing Conf. and Workshops (IGCC), pp. 1-8, July 2011, doi: /IGCC
[6] A. Ejlali, B.M. Al-Hashimi, and P. Eles, "Low-Energy Standby-Sparing for Hard Real-Time Systems," IEEE Trans. Comput.-Aid. Des. Integr. Circuits Syst., vol. 31, no. 3, pp , March 2012, doi: /TCAD
[7] M.K. Tavana, M. Salehi, and A. Ejlali, "Feedback-Based Energy Management in a Standby-Sparing Scheme for Hard Real-Time Systems," Proc. IEEE 32nd Real-Time Systems Symposium (RTSS'11), pp , Nov Dec. 2011, doi: /RTSS
[8] R. Melhem, D. Mosse, and E. Elnozahy, "The interplay of power management and fault recovery in real-time systems," IEEE Trans. Comput., vol. 53, no. 2, pp , Feb 2004, doi: /TC
[9] D.K. Pradhan, Fault-tolerant Computer System Design. Prentice-Hall, Inc., Upper Saddle River, NJ,
[10] I. Koren and C.M. Krishna, Fault-Tolerant Systems. Morgan Kaufmann, Elsevier, San Francisco, CA,
[11] M.A. Haque, H. Aydin, and D. Zhu, "Energy-Aware Standby-Sparing Technique for Periodic Real-Time Applications," Proc. IEEE 29th Int'l Conf. Comput. Design (ICCD'11), pp , Oct. 2011, doi: /ICCD
[12] T.D. Burd, T.A. Pering, A.J. Stratakos, and R.W. Brodersen, "A dynamic voltage scaled microprocessor system," IEEE J. Solid-State Circuits (JSSC), vol. 35, no. 11, pp , Nov. 2000, doi: /
[13] D. Zhu, R. Melhem, D. Mosse, and E. Elnozahy, "Analysis of an Energy Efficient Optimistic TMR Scheme," Proc. Tenth Int'l Conf. Parall. and Distr. Syst. (ICPADS'04), pp , July 2004, doi: /ICPADS
[14] J. Cong and K. Gururaj, "Energy Efficient Multiprocessor Task Scheduling under Input-dependent Variation," Proc. Design, Automation and Test in Europe Conf. and Exhibition (DATE'09), pp , April 2009, doi: /DATE
[15] X. Qi and D. Zhu, "Energy efficient block-partitioned multicore processors for parallel applications," J. Comput. Science Tech., vol. 26, no. 3, pp , May 2011, doi: /s
[16] X. Qi, D. Zhu, and H. Aydin, "Global scheduling based reliability-aware power management for multiprocessor real-time systems," J. Real-Time Syst., vol. 47, no. 2, pp , March 2011, doi: /s x.
[17] M.A. Haque, H. Aydin, and D. Zhu, "Energy-Aware Task Replication to Manage Reliability for Periodic Real-Time Applications on Multicore Platform," Int'l Green Computing Conf. (IGCC'13), pp. 1-11, June 2013, doi: /IGCC

[18] D. Zhu, R. Melhem, and D. Mosse, "Energy Efficient Redundant Configurations for Real-Time Parallel Reliable Servers," J. Real-Time Syst., vol. 41, no. 3, pp , April 2009, doi: /s
[19] E.G. Coffman and R.L. Graham, "Optimal Scheduling for Two-Processor Systems," Acta Informatica, vol. 1, no. 3, pp , 1972, doi: /BF
[20] M.T. Schmitz, B.M. Al-Hashimi, and P. Eles, System-Level Design Techniques for Energy-Efficient Embedded Systems. Norwell, MA: Kluwer,
[21] Intel Corp., Intel PXA270 Processor, Available:
[22] M. Bazzaz, M. Salehi, and A. Ejlali, "An Accurate Instruction-Level Energy Estimation Model and Tool for Embedded Systems," IEEE Trans. Instrum. Meas., vol. 62, no. 7, pp , July 2013, doi: /TIM
[23] N. Oh, P.P. Shirvani, and E.J. McCluskey, "Control-Flow Checking by Software Signatures," IEEE Trans. Rel., vol. 51, no. 1, pp , Mar 2002, doi: /
[24] J. Aidemark, J. Vinter, P. Folkesson, and J. Karlsson, "Experimental Evaluation of Time-Redundant Execution for a Brake-by-wire Application," Proc. Int'l Conf. Dependable Syst. and Networks (DSN'02), pp , 2002, doi: /DSN
[25] K.S. Yim, V. Sidea, Z. Kalbarczyk, D. Chen, and R.K.A. Iyer, "A Fault-Tolerant Programmable Voter for Software-Based N-Modular Redundancy," Proc. IEEE Aerospace Conf., pp. 1-20, March 2012, doi: /AERO
[26] S. Feng, S. Gupta, A. Ansari, and S. Mahlke, "Shoestring: Probabilistic Soft Error Reliability on the Cheap," Proc. 15th Architectural Support for Programming Languages and Operating Syst. (ASPLOS'10), pp , 2010, doi: /
[27] M.R. Guthaus, J.S. Ringenberg, and D. Ernst, "MiBench: A free, commercially representative embedded benchmark suite," Proc. IEEE Int'l Workshop on Workload Characterization (WWC-4), pp. 3-14, Dec. 2001, doi: /WWC
[28] D. Rhodes and R. Dick, "TGFF: Task Graphs for Free," Proc. 6th Int'l Workshop on Hardware/Software Codesign (CODES/CASHE '98), pp , Mar 1998, doi: /HSC
[29] T. Tobita and H. Kasahara, "A standard task graph set for fair evaluation of multiprocessor scheduling algorithms," J. Scheduling, vol. 5, no. 5, pp , Sep. 2002, doi: /jos.116.
[30] S.-H. Kang, H. Yang, K. Sungchan, I. Bacivarov, S. Ha, and L. Thiele, "Reliability-aware mapping optimization of multi-core systems with mixed-criticality," Proc. Design, Automation and Test in Europe Conf. and Exhibition (DATE'14), pp. 1-4, March 2014, doi: /DATE
[31] J.C. Smolens, B.T. Gold, J. Kim, B. Falsafi, J.C. Hoe, A.G. Nowatzyk, "Fingerprinting: Bounding Soft-Error-Detection Latency and Bandwidth," IEEE Micro, vol. 24, no. 6, pp , Nov./Dec. 2004, doi: /mm
[32] J. Lee, B. Yun, and K.G. Shin, "Reducing Peak Power Consumption in Multi-Core Systems without Violating Real-Time Constraints," IEEE Trans. Parall. Distr. Syst., vol. 25, no. 4, pp , April 2014, doi: /TPDS
[33] S. Saha, J.S. Deogun, Y. Lu, "Adaptive energy-efficient task partitioning for heterogeneous multi-core multiprocessor real-time systems," Int'l Conf. High Performance Computing and Simulation (HPCS), pp , July 2012, doi: /HPCSim
[34] S. Rehman, F. Kriebel, M. Shafique, J. Henkel, "Reliability-Driven Software Transformations for Unreliable Hardware," IEEE Trans. Comput.-Aid. Des. Integr. Circuits Syst., vol. 33, no. 11, pp , Nov. 2014, doi: /TCAD
[35] T. Miller, N. Surapaneni, R. Teodorescu, "Flexible Error Protection for Energy Efficient Reliable Architectures," 22nd Int'l Symp. Comput. Arch. and High Performance Comput. (SBAC-PAD), pp. 1-8, Oct. 2010, doi: /SBAC-PAD
[36] R. Jeyapaul, F. Hong, A. Rhisheekesan, A. Shrivastava, K. Lee, "UnSync-CMP: Multicore CMP Architecture for Energy-Efficient Soft-Error Reliability," IEEE Trans. Parall. Distr. Syst., vol. 25, no. 1, pp , Jan. 2014, doi: /TPDS
[37] R. Vadlamani, J. Zhao, W. Burleson, and R. Tessier, "Multicore soft error rate stabilization using adaptive dual modular redundancy," Proc. Design, Automation and Test in Europe Conf. and Exhibition (DATE'10), pp , March 2010, doi: /DATE
[38] T. Wei, P. Mishra, K. Wu, H. Liang, "Fixed-Priority Allocation and Scheduling for Energy-Efficient Fault Tolerance in Hard Real-Time Multiprocessor Systems," IEEE Trans. Parall. Distr. Syst., vol. 19, no. 11, pp , Nov. 2008, doi: /TPDS
[39] H. Topcuoglu, S. Hariri, M.-Y. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Trans. Parall. Distr. Syst., vol. 13, no. 3, pp , Mar 2002, doi: /

Mohammad Salehi received the M.S. degree in computer engineering from Sharif University of Technology, Tehran, Iran, in 2010, where he is currently working toward the Ph.D. degree in computer engineering. From 2014 to 2015, he was a visiting researcher in the Chair for Embedded Systems (CES), Karlsruhe Institute of Technology (KIT), Germany. His research interests include low-power design of embedded systems, multi-/many-core systems with a focus on dependability/reliability, low power, and the tradeoff between fault tolerance and energy efficiency in real-time systems.

Alireza Ejlali is an Associate Professor of Computer Engineering at Sharif University of Technology, Tehran, Iran. He received a Ph.D. degree in computer engineering from Sharif University of Technology. From 2005 to 2006, he was a visiting researcher in the Electronic Systems Design Group, University of Southampton, UK. In 2006 he joined Sharif University of Technology as a faculty member in the department of computer engineering, and from 2011 to 2015 he was the director of the Computer Architecture Group in this department. His research interests include low power design, real-time embedded systems, and fault-tolerant embedded systems.

Bashir M. Al-Hashimi (M'99, SM'01, F'09) is a Professor of computer engineering, Dean of the Faculty of Sciences and Engineering, and the Director of the Pervasive Systems Center, University of Southampton, U.K. He is ARM Professor of computer engineering and the Co-Director of the ARM-ECS Research Center. His research interests include methods, algorithms, and design automation tools for low-power design and test of embedded systems.

Real-Time Systems. Multiprocessor scheduling. Multiprocessor scheduling. Multiprocessor scheduling

Real-Time Systems. Multiprocessor scheduling. Multiprocessor scheduling. Multiprocessor scheduling Real-Tme Systems Multprocessor schedulng Specfcaton Implementaton Verfcaton Multprocessor schedulng -- -- Global schedulng How are tasks assgned to processors? Statc assgnment The processor(s) used for

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

The Minimum Universal Cost Flow in an Infeasible Flow Network

The Minimum Universal Cost Flow in an Infeasible Flow Network Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran

More information

Embedded Systems. 4. Aperiodic and Periodic Tasks

Embedded Systems. 4. Aperiodic and Periodic Tasks Embedded Systems 4. Aperodc and Perodc Tasks Lothar Thele 4-1 Contents of Course 1. Embedded Systems Introducton 2. Software Introducton 7. System Components 10. Models 3. Real-Tme Models 4. Perodc/Aperodc

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Energy-Efficient Primary/Backup Scheduling Techniques for Heterogeneous Multicore Systems

Energy-Efficient Primary/Backup Scheduling Techniques for Heterogeneous Multicore Systems Energy-Effcent Prmary/Backup Schedulng Technques for Heterogeneous Multcore Systems Abhshek Roy, Hakan Aydn epartment of Computer Scence George Mason Unversty Farfax, Vrgna 22030 aroy6@gmu.edu, aydn@cs.gmu.edu

More information

Department of Electrical & Electronic Engineeing Imperial College London. E4.20 Digital IC Design. Median Filter Project Specification

Department of Electrical & Electronic Engineeing Imperial College London. E4.20 Digital IC Design. Median Filter Project Specification Desgn Project Specfcaton Medan Flter Department of Electrcal & Electronc Engneeng Imperal College London E4.20 Dgtal IC Desgn Medan Flter Project Specfcaton A medan flter s used to remove nose from a sampled

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

A FAST HEURISTIC FOR TASKS ASSIGNMENT IN MANYCORE SYSTEMS WITH VOLTAGE-FREQUENCY ISLANDS

A FAST HEURISTIC FOR TASKS ASSIGNMENT IN MANYCORE SYSTEMS WITH VOLTAGE-FREQUENCY ISLANDS Shervn Haamn A FAST HEURISTIC FOR TASKS ASSIGNMENT IN MANYCORE SYSTEMS WITH VOLTAGE-FREQUENCY ISLANDS INTRODUCTION Increasng computatons n applcatons has led to faster processng. o Use more cores n a chp

More information

Lecture 4: November 17, Part 1 Single Buffer Management

Lecture 4: November 17, Part 1 Single Buffer Management Lecturer: Ad Rosén Algorthms for the anagement of Networs Fall 2003-2004 Lecture 4: November 7, 2003 Scrbe: Guy Grebla Part Sngle Buffer anagement In the prevous lecture we taled about the Combned Input

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016 CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng

More information

Minimizing Energy Consumption of MPI Programs in Realistic Environment

Minimizing Energy Consumption of MPI Programs in Realistic Environment Mnmzng Energy Consumpton of MPI Programs n Realstc Envronment Amna Guermouche, Ncolas Trquenaux, Benoît Pradelle and Wllam Jalby Unversté de Versalles Sant-Quentn-en-Yvelnes arxv:1502.06733v2 [cs.dc] 25

More information

NP-Completeness : Proofs

NP-Completeness : Proofs NP-Completeness : Proofs Proof Methods A method to show a decson problem Π NP-complete s as follows. (1) Show Π NP. (2) Choose an NP-complete problem Π. (3) Show Π Π. A method to show an optmzaton problem

More information

Single-Facility Scheduling over Long Time Horizons by Logic-based Benders Decomposition

Single-Facility Scheduling over Long Time Horizons by Logic-based Benders Decomposition Sngle-Faclty Schedulng over Long Tme Horzons by Logc-based Benders Decomposton Elvn Coban and J. N. Hooker Tepper School of Busness, Carnege Mellon Unversty ecoban@andrew.cmu.edu, john@hooker.tepper.cmu.edu

More information

Appendix B: Resampling Algorithms

Appendix B: Resampling Algorithms 407 Appendx B: Resamplng Algorthms A common problem of all partcle flters s the degeneracy of weghts, whch conssts of the unbounded ncrease of the varance of the mportance weghts ω [ ] of the partcles

More information

ECE559VV Project Report

ECE559VV Project Report ECE559VV Project Report (Supplementary Notes Loc Xuan Bu I. MAX SUM-RATE SCHEDULING: THE UPLINK CASE We have seen (n the presentaton that, for downlnk (broadcast channels, the strategy maxmzng the sum-rate

More information

Calculation of time complexity (3%)

Calculation of time complexity (3%) Problem 1. (30%) Calculaton of tme complexty (3%) Gven n ctes, usng exhaust search to see every result takes O(n!). Calculaton of tme needed to solve the problem (2%) 40 ctes:40! dfferent tours 40 add

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Chapter - 2. Distribution System Power Flow Analysis

Chapter - 2. Distribution System Power Flow Analysis Chapter - 2 Dstrbuton System Power Flow Analyss CHAPTER - 2 Radal Dstrbuton System Load Flow 2.1 Introducton Load flow s an mportant tool [66] for analyzng electrcal power system network performance. Load

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

EEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming

EEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming EEL 6266 Power System Operaton and Control Chapter 3 Economc Dspatch Usng Dynamc Programmng Pecewse Lnear Cost Functons Common practce many utltes prefer to represent ther generator cost functons as sngle-

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

Temperature. Chapter Heat Engine

Temperature. Chapter Heat Engine Chapter 3 Temperature In prevous chapters of these notes we ntroduced the Prncple of Maxmum ntropy as a technque for estmatng probablty dstrbutons consstent wth constrants. In Chapter 9 we dscussed the

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,

More information

Foundations of Arithmetic

Foundations of Arithmetic Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an

More information

Two Methods to Release a New Real-time Task

Two Methods to Release a New Real-time Task Two Methods to Release a New Real-tme Task Abstract Guangmng Qan 1, Xanghua Chen 2 College of Mathematcs and Computer Scence Hunan Normal Unversty Changsha, 410081, Chna qqyy@hunnu.edu.cn Gang Yao 3 Sebel

More information

Global EDF Scheduling for Parallel Real-Time Tasks

Global EDF Scheduling for Parallel Real-Time Tasks Washngton Unversty n St. Lous Washngton Unversty Open Scholarshp Engneerng and Appled Scence Theses & Dssertatons Engneerng and Appled Scence Sprng 5-15-2014 Global EDF Schedulng for Parallel Real-Tme

More information

The optimal delay of the second test is therefore approximately 210 hours earlier than =2.

The optimal delay of the second test is therefore approximately 210 hours earlier than =2. THE IEC 61508 FORMULAS 223 The optmal delay of the second test s therefore approxmately 210 hours earler than =2. 8.4 The IEC 61508 Formulas IEC 61508-6 provdes approxmaton formulas for the PF for smple

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Outline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1]

Outline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1] DYNAMIC SHORTEST PATH SEARCH AND SYNCHRONIZED TASK SWITCHING Jay Wagenpfel, Adran Trachte 2 Outlne Shortest Communcaton Path Searchng Bellmann Ford algorthm Algorthm for dynamc case Modfcatons to our algorthm

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

DUE: WEDS FEB 21ST 2018

DUE: WEDS FEB 21ST 2018 HOMEWORK # 1: FINITE DIFFERENCES IN ONE DIMENSION DUE: WEDS FEB 21ST 2018 1. Theory Beam bendng s a classcal engneerng analyss. The tradtonal soluton technque makes smplfyng assumptons such as a constant

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

HMMT February 2016 February 20, 2016

HMMT February 2016 February 20, 2016 HMMT February 016 February 0, 016 Combnatorcs 1. For postve ntegers n, let S n be the set of ntegers x such that n dstnct lnes, no three concurrent, can dvde a plane nto x regons (for example, S = {3,

More information

CONTRAST ENHANCEMENT FOR MIMIMUM MEAN BRIGHTNESS ERROR FROM HISTOGRAM PARTITIONING INTRODUCTION

CONTRAST ENHANCEMENT FOR MIMIMUM MEAN BRIGHTNESS ERROR FROM HISTOGRAM PARTITIONING INTRODUCTION CONTRAST ENHANCEMENT FOR MIMIMUM MEAN BRIGHTNESS ERROR FROM HISTOGRAM PARTITIONING N. Phanthuna 1,2, F. Cheevasuvt 2 and S. Chtwong 2 1 Department of Electrcal Engneerng, Faculty of Engneerng Rajamangala

More information

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm

Design and Optimization of Fuzzy Controller for Inverse Pendulum System Using Genetic Algorithm Desgn and Optmzaton of Fuzzy Controller for Inverse Pendulum System Usng Genetc Algorthm H. Mehraban A. Ashoor Unversty of Tehran Unversty of Tehran h.mehraban@ece.ut.ac.r a.ashoor@ece.ut.ac.r Abstract:

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Pop-Click Noise Detection Using Inter-Frame Correlation for Improved Portable Auditory Sensing

Pop-Click Noise Detection Using Inter-Frame Correlation for Improved Portable Auditory Sensing Advanced Scence and Technology Letters, pp.164-168 http://dx.do.org/10.14257/astl.2013 Pop-Clc Nose Detecton Usng Inter-Frame Correlaton for Improved Portable Audtory Sensng Dong Yun Lee, Kwang Myung Jeon,

More information

Lab 2e Thermal System Response and Effective Heat Transfer Coefficient

Lab 2e Thermal System Response and Effective Heat Transfer Coefficient 58:080 Expermental Engneerng 1 OBJECTIVE Lab 2e Thermal System Response and Effectve Heat Transfer Coeffcent Warnng: though the experment has educatonal objectves (to learn about bolng heat transfer, etc.),

More information

An Interactive Optimisation Tool for Allocation Problems

An Interactive Optimisation Tool for Allocation Problems An Interactve Optmsaton ool for Allocaton Problems Fredr Bonäs, Joam Westerlund and apo Westerlund Process Desgn Laboratory, Faculty of echnology, Åbo Aadem Unversty, uru 20500, Fnland hs paper presents

More information

Simultaneous Hardware and Time Redundancy with Online Task Scheduling for Low Energy Highly Reliable Standby-Sparing System

Simultaneous Hardware and Time Redundancy with Online Task Scheduling for Low Energy Highly Reliable Standby-Sparing System Smultaneous Hardware and Tme Redundancy wth Onlne Task Schedulng for Low Energy Hghly Relable Standby-Sparng System MOHAMMAD KHAVARI TAVANA, Sharf Unversty of Technology NASIBEH TEIMOURI, Sharf Unversty

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments

More information

This column is a continuation of our previous column

This column is a continuation of our previous column Comparson of Goodness of Ft Statstcs for Lnear Regresson, Part II The authors contnue ther dscusson of the correlaton coeffcent n developng a calbraton for quanttatve analyss. Jerome Workman Jr. and Howard

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Last Time. Priority-based scheduling. Schedulable utilization Rate monotonic rule: Keep utilization below 69% Static priorities Dynamic priorities

Last Time. Priority-based scheduling. Schedulable utilization Rate monotonic rule: Keep utilization below 69% Static priorities Dynamic priorities Last Tme Prorty-based schedulng Statc prortes Dynamc prortes Schedulable utlzaton Rate monotonc rule: Keep utlzaton below 69% Today Response tme analyss Blockng terms Prorty nverson And solutons Release

More information

1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations

1 Derivation of Rate Equations from Single-Cell Conductance (Hodgkin-Huxley-like) Equations Physcs 171/271 -Davd Klenfeld - Fall 2005 (revsed Wnter 2011) 1 Dervaton of Rate Equatons from Sngle-Cell Conductance (Hodgkn-Huxley-lke) Equatons We consder a network of many neurons, each of whch obeys

More information

A 2D Bounded Linear Program (H,c) 2D Linear Programming

A 2D Bounded Linear Program (H,c) 2D Linear Programming A 2D Bounded Lnear Program (H,c) h 3 v h 8 h 5 c h 4 h h 6 h 7 h 2 2D Lnear Programmng C s a polygonal regon, the ntersecton of n halfplanes. (H, c) s nfeasble, as C s empty. Feasble regon C s unbounded

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
