Precise and Efficient FIFO-Replacement Analysis Based on Static Phase Detection

Daniel Grund
Saarland University, Saarbrücken, Germany

Jan Reineke
Saarland University, Saarbrücken, Germany
University of California, Berkeley, USA

Abstract

Schedulability analysis for hard real-time systems requires bounds on the execution times of its tasks. To obtain useful bounds in the presence of caches, static timing analyses must predict cache hits and misses with high precision. For caches with least-recently-used (LRU) replacement policy, precise and efficient cache analyses exist. However, other widely used policies like first-in first-out (FIFO) are inherently harder to analyze. The main contributions of this paper are precise and efficient must- and may-analyses of FIFO based on the novel concept of static phase detection. The analyses statically partition sequences of memory accesses as they will occur during program execution into phases. If subsequent phases contain accesses to the same (similar) set of memory blocks, each phase contributes a bit to the overall goal of predicting hits (misses). The new must-analysis is significantly more precise than prior analyses. Both analyses can be implemented space-efficiently by sharing information using abstract LRU-stacks.

I. Introduction

In hard real-time systems, timeliness of reactions must be guaranteed off-line. Thereby, one fundamental problem is to bound the worst-case execution time (WCET) of programs [1]. To obtain tight and thus useful bounds on the execution times, timing analyses must take into account the cache architecture of the employed processors. However, developing cache analyses (analyses that compute sound approximations to cache contents at program points) is a challenging problem. At its heart, cache analysis is concerned with the analysis of the employed replacement policy. For LRU replacement, precise and efficient analyses have been developed [2], [3], [4], [5]. In practice however, other policies like FIFO or pseudo-LRU (PLRU) are more widely used, e.g. in the Intel XScale, some ARM, and several PowerPC series.
In addition, FIFO is the predominant replacement policy in other transparent buffers, like branch target buffers [6], where the requirement of low latency precludes expensive update computations as required by LRU.

Analyzing FIFO is harder than analyzing LRU since its behavior is more sensitive to its state [7]. As opposed to LRU, accessing a set of memory blocks that would entirely fit into the cache does not imply that all of those blocks are cached afterwards. Analogously, accessing a set of memory blocks that is larger than the cache does not necessarily evict all other previous cache contents. However, if the same fitting set of blocks is accessed repeatedly, then eventually those blocks must be cached. Analogously, repeatedly accessing a non-fitting set will eventually evict all non-accessed blocks.

As the basis of this paper, we introduce two theorems that show exactly how much information about the cache contents is available after how many of such phases. The two theorems respectively are concerned with the two kinds of information that can be naturally distinguished: must-information [2] that allows for predicting hits, and may-information [2] that allows for predicting misses.

Our main contributions are a must- and a may-analysis of FIFO based on the two theorems. In contrast to the theorems, the two analyses operate on arbitrary control flow by joining analysis information. By statically detecting phases, the analyses are able to gradually build up precise must- and may-information. We also show how the analysis information can be encoded space-efficiently by employing the abstract LRU stacks of [2].

Section II introduces the notation used in this paper and presents the theoretical foundations we will use. In Section III, we give motivational examples, introduce the theorems, and present formalizations of the analyses. In Section IV, we cover related work, discuss qualitative differences, and draw connections between the novel phase-detecting analyses and state-of-the-art analyses: All existing FIFO analyses [8], [6] first need to obtain may-information to obtain non-trivial must-information.
This is not necessary for the phase-detecting must-analysis: It can obtain precise must-information even for programs where no static analysis can obtain may-information. The phase-detecting may-analysis is superior to prior analyses, too: It can predict misses on sequences that contain only accesses to $k+1$ pairwise different blocks. This is the theoretical minimum for predicting misses. Prior analyses require nearly twice the number of pairwise different blocks to predict misses.

In Section V, we evaluate the phase-detecting analyses by means of synthetic benchmarks that provide detailed quantitative measures. We compare the phase-detecting analyses with prior analyses and with the collecting semantics of FIFO, which delimits the precision of any static analysis. The must-analysis closes most of the gap between prior analyses and the collecting semantics. In most cases, the may-analysis is also more precise than prior ones.

II. Foundations

A. Static Analysis

Static analysis determines properties of programs without actually executing the programs. Since the properties to determine are commonly incomputable, abstraction has to be employed. The level of abstraction governs the trade-off between analysis precision and analysis complexity. One formal method in static analysis is abstract interpretation [9], which our work is based on. Instead of representing concrete semantic information in a concrete domain $D$, one represents more abstract information in an abstract domain $\widehat{D}$. The relation between concrete and abstract can be given by an abstraction function $\alpha_{\widehat{D}} : \mathcal{P}(D) \to \widehat{D}$ and a concretization function $\gamma_{\widehat{D}} : \widehat{D} \to \mathcal{P}(D)$.

The results of an abstract interpretation are invariants for each program point, which are represented by values of domain $\widehat{D}$. A program is analyzed by performing a fixed-point computation on a set of equations induced by that program. The equations are set up with the help of an abstract transformer, $U_{\widehat{D}} : \widehat{D} \times I \to \widehat{D}$, that describes how abstract values before and after instructions $I$ are correlated. If an instruction has multiple predecessors, a join function $J_{\widehat{D}} : \widehat{D} \times \widehat{D} \to \widehat{D}$ combines all incoming values into a single one.

B. Caches

Caches are fast but small memories that store a subset of the main memory's contents to bridge the latency gap between CPU and main memory. To profit from spatial locality and to reduce management overhead, main memory is logically partitioned into equally-sized memory blocks $\mathcal{B}$. Blocks are cached as a whole in cache lines of equal size. When accessing a memory block, the cache logic has to determine whether the block is stored in the cache ("cache hit") or not ("cache miss"). To enable an efficient look-up, caches are partitioned into equally-sized cache sets $q \in Q_k$ and each block can only be stored in one cache set. The size of a cache set is called the associativity of the cache. A cache with associativity $k$ is often called $k$-way set-associative. It consists of $k$ ways, each of which consists of one cache line in each cache set. In the remainder of the paper, we will continue to use $k$ for the associativity of a cache.

Since the number of memory blocks that map to a set is usually far greater than the associativity of the cache, a so-called replacement policy must decide which memory block to replace upon a cache miss. Replacement policies try to exploit temporal locality and base their decisions on the history of memory accesses. Usually, cache sets are treated independently of each other such that accesses to one set do not influence replacement decisions in other sets. Well-known policies for individual cache sets are least-recently used (LRU), pseudo-LRU (PLRU), a cost-efficient variant of LRU, and first-in first-out (FIFO).
For details on the implementation of caches in hardware refer to Jacob [10].

C. The FIFO Policy

A $k$-way associative FIFO cache set $q$ can therefore be modeled as a $k$-tuple of memory blocks $b_i \in \mathcal{B}$, which are ordered from last-in to first-in from left to right:

$$q = [b_1, \ldots, b_k] \in Q_k := \mathcal{B}^k$$

The update function $U_{Q_k} : Q_k \times \mathcal{B} \to Q_k$ models the effect on a cache set when accessing a memory block $b$:

$$U_{Q_k}([b_1, \ldots, b_k], b) := \begin{cases} [b_1, \ldots, b_k] & : \exists i : b_i = b \\ [b, b_1, \ldots, b_{k-1}] & : \text{otherwise} \end{cases}$$

A cache hit (first case) does not change the cache set. A cache miss (second case) inserts the new block at position 1, shifting the others to the right and evicting the block at the rightmost position. Finally, let $C_n : Q_k \to \mathcal{P}(\mathcal{B})$ be the cache contents:

$$C_n([b_1, \ldots, b_k]) := \begin{cases} \emptyset & : n = 0 \\ \{b_1, \ldots, b_n\} & : n < k \\ \{b_1, \ldots, b_k\} & : n \geq k \end{cases}$$

$C_n$ computes the set of memory blocks contained in the $n$ last-in positions of a cache set. For short, $C(q) := C_k(q)$.

D. Access Sequences

Let $S := \mathcal{B}^*$ be the set of finite access sequences, e.g. $s_1 := \langle a, b, c, a \rangle$, $s_2 := \langle d \rangle \in S$. Let $A : S \to \mathcal{P}(\mathcal{B})$, where $A(s)$ is the set of memory blocks accessed by $s$, e.g. $A(s_1) = \{a, b, c\}$. Furthermore, let $\circ$ denote access sequence concatenation, e.g. $s_1 \circ s_2 = \langle a, b, c, a, d \rangle$. The update function $U_{Q_k}$ can be lifted from individual accesses to access sequences in the expected way.

E. Static Cache Analysis

The aim of static cache analysis is to classify individual memory accesses as hits (H) or misses (M). However, for some accesses an analysis might fail to classify them as hits or misses, i.e. they remain unclassified ($\top$). The classification lattice has $\top$ above the incomparable elements H and M, and $\bot$ below them. Its induced join ($\sqcup$) and meet ($\sqcap$) are defined as:

  ⊔ | H  M  ⊤        ⊓ | H  M  ⊤
  H | H  ⊤  ⊤        H | H  ⊥  H
  M | ⊤  M  ⊤        M | ⊥  M  M
  ⊤ | ⊤  ⊤  ⊤        ⊤ | H  M  ⊤

Cache analysis by abstract interpretation computes must- and may-cache information [2] at program points: Must- and may-cache information are under- and over-approximations, respectively, to the contents of all concrete cache states that will occur whenever program execution reaches a program point. Must-cache information is used to derive information about cache hits. The more cache hits can be predicted, the better the upper bound on the execution times. May-cache information is used to safely predict cache misses. Predicting more cache misses will result in a better lower bound on the execution times.
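The join and meet tables of the classification lattice can be written down directly. The following is a minimal sketch in Python; the names `Cls`, `join` and `meet` are illustrative choices and do not come from the paper.

```python
from enum import Enum


class Cls(Enum):
    """Elements of the classification lattice (names are assumptions)."""
    BOT = "bot"   # bottom element, below H and M
    H = "H"       # guaranteed hit
    M = "M"       # guaranteed miss
    TOP = "top"   # unclassified


def join(x, y):
    """Least upper bound: differing classifications become unclassified."""
    if x == y:
        return x
    if x is Cls.BOT:
        return y
    if y is Cls.BOT:
        return x
    return Cls.TOP


def meet(x, y):
    """Greatest lower bound: TOP is neutral, H and M meet in BOT."""
    if x == y:
        return x
    if x is Cls.TOP:
        return y
    if y is Cls.TOP:
        return x
    return Cls.BOT
```

For instance, `join(Cls.H, Cls.M)` yields `Cls.TOP`, which is what happens when an access hits along one control-flow path and misses along another.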
In general, cache analysis has to consider any possible cache state at program start: No memory block must be cached, any block may be cached. Assuming an empty cache is not safe because, starting from an empty cache, the program might incur fewer cache misses than when starting with a partially filled cache [11]. One way to attenuate this lack of information is to invalidate the cache contents at the start of the program. This way, one can safely assume an empty cache, i.e. at program start one would get complete must- and may-information. However, cache information can be partially lost during the analysis, e.g. due to control-flow joins, and then has to be regained.

As most cache architectures manage their cache sets independently from each other, cache analyses can analyze them independently as well. Thus, we limit ourselves to the analysis of a single cache set.
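The concrete FIFO model of Section II-C is small enough to execute. The sketch below is one possible Python rendering. The function names, the tuple representation (last-in block first) and the use of `None` for an empty line are assumptions made for illustration.

```python
def fifo_update(q, b):
    """Access block b on FIFO cache set q (a tuple, last-in block first)."""
    if b in q:
        return q                 # hit: FIFO state is unchanged
    return (b,) + q[:-1]         # miss: insert in front, evict first-in block


def fifo_run(q, seq):
    """Lift fifo_update from single accesses to access sequences."""
    for b in seq:
        q = fifo_update(q, b)
    return q


def contents(q, n=None):
    """C_n(q): blocks in the n last-in positions; C(q) if n is omitted."""
    k = len(q)
    n = k if n is None else min(n, k)
    return {b for b in q[:n] if b is not None}


# A block that hits in the first-in position can be evicted right afterwards.
print(fifo_run(("x", "y", "z", "a"), "aabc"))   # ('c', 'b', 'x', 'y')
```

In the usage line, both accesses to `a` hit, so the following two misses push `a` out of the set although only three distinct blocks were accessed on a 4-way set.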

$q_1 = [\bot, \bot, \bot, \bot] \xrightarrow{a} [a, \bot, \bot, \bot] \xrightarrow{a} [a, \bot, \bot, \bot] \xrightarrow{b} [b, a, \bot, \bot] \xrightarrow{c} [c, b, a, \bot] = q_1'$
$q_2 = [a, x, b, c] \xrightarrow{a} [a, x, b, c] \xrightarrow{a} [a, x, b, c] \xrightarrow{b} [a, x, b, c] \xrightarrow{c} [a, x, b, c] = q_2'$
$q_3 = [x, y, z, a] \xrightarrow{a} [x, y, z, a] \xrightarrow{a} [x, y, z, a] \xrightarrow{b} [b, x, y, z] \xrightarrow{c} [c, b, x, y] = q_3'$
$q_4 = [x, y, b, z] \xrightarrow{a} [a, x, y, b] \xrightarrow{a} [a, x, y, b] \xrightarrow{b} [a, x, y, b] \xrightarrow{c} [c, a, x, y] = q_4'$

Fig. 1: Dependency of FIFO cache set contents on the initial state.

III. FIFO Analysis by Phase-Detection

In this section we describe our ideas and formalize the analysis. Section III-A motivates why predicting hits for FIFO is difficult and presents phase detection as a solution. Section III-B then shows how these ideas can be translated into an efficient abstract domain for a FIFO must-analysis. Section III-C and Section III-D are structured analogously and treat the prediction of misses by a may-analysis.

A. Predicting Hits: Challenge & Idea

To see the difficulty inherent in FIFO, consider the examples in Figure 1. The access sequence $s = \langle a, a, b, c \rangle$ is carried out on different cache sets $q_i$ of associativity 4. Although only 3 different memory blocks $\{a, b, c\}$ are accessed, some of the resulting cache sets $q_i'$ do not contain all of the accessed blocks. In contrast, a $k$-way cache set with LRU replacement always consists of the $k$ most-recently-used memory blocks, e.g. $\{a, b, c\}$ would be cached after carrying out $s$, independently of the initial state. This makes analysis of FIFO considerably harder than analysis of LRU.

To generalize, consider a FIFO cache set with unknown contents. After observing a memory access to block $a$, trivial must-information is available: One knows that $a$ must be cached but the position of $a$ within the cache set is unknown. As the access to $a$ could not be classified as a miss, another access to a different block $b$ may actually evict $a$. This is the case if the access to $a$ is a hit on the first-in, i.e. right-most, position and the access to $b$ is a miss (as in case of $q_3$ in Figure 1). Thus, without implicitly or explicitly classifying some accesses as misses, it is hard to infer that two or more blocks are cached. The following lemma shows how much information is available after accessing a set of memory blocks once:

Lemma 1. Let $s \in S$, $|A(s)| = n \leq k$, i.e. the access sequence $s$ contains at most $k$ pairwise different blocks.
Then, for all cache sets $q \in Q_k$ and $q' := U_{Q_k}(q, s)$:

$$A(s) \subseteq C(q') \;\lor\; C_1(q') \subseteq A(s)$$

If no miss happened during $s$, all blocks were cached in $q$. Hence, the blocks are still cached after the update ($A(s) \subseteq C(q')$) since FIFO does not change its state upon a hit. Otherwise, at least one miss must have happened. In that case, the last-in position of $q'$ must contain a block that caused a miss ($C_1(q') \subseteq A(s)$).

Lemma 1 is tight in the sense that $A(s) \subseteq C(q') \lor C_i(q') \subseteq A(s)$ does not hold for $i \geq 2$ in general. Hence, accessing a set of memory blocks does not imply that all of the accessed blocks will be cached, as can also be seen in Figure 1. However, if the same set of memory blocks is accessed multiple times subsequently, misses accumulate in the last-in positions. To profit from this, one has to partition access sequences into subsequences that each access the same set of memory blocks. We call such subsequences phases:

Definition 2 (Phase). For a set of memory blocks $B \subseteq \mathcal{B}$, a $B$-phase is an access sequence $s$ such that $A(s) = B$. The phase blocks of a $B$-phase are the memory blocks in $B$. If $|B| = |A(s)| = n$ we say that $s$ is of size $n$.

[Example diagram of phases and their inclusion: access sequences annotated with braces, where phases over three blocks contain shorter phases over two of those blocks.]

The following theorem shows how much information is available after $j$ subsequent $B$-phases:

Theorem 3. Let $s \in S$ be a $B$-phase of size $n \leq k$ that can be partitioned into $j \leq n$ $B$-phases $s_i$, i.e. $s := s_1 \circ \ldots \circ s_j$. Then for each $q \in Q_k$ and $q' := U_{Q_k}(q, s)$:

$$A(s) \subseteq C(q') \;\lor\; C_j(q') \subseteq A(s)$$

To evict a newly inserted block from a FIFO cache set, it takes $k$ misses. Since $s$ contains at most $k$ different blocks ($|A(s)| \leq k$), a block inserted by a miss in $s$ cannot be evicted by $s$. Hence, there can be at most one miss to each block in $B$. Thus, if accessing a block results in a miss, the accessed block must be different from all previously accessed blocks that resulted in misses. After $j$ $B$-phases, either (at least) one miss happened in each phase (which implies $C_j(q') \subseteq A(s)$) or there was a phase with no misses at all. In the latter case, all blocks in $B$ were cached before that phase and remain cached throughout the remainder of $s$ ($A(s) \subseteq C(q')$).
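Theorem 3 can be sanity-checked by exhaustive simulation on small instances. The sketch below is only such a check under stated restrictions, not part of the paper's analysis: initial states are fully filled with pairwise different blocks from a small universe, and every phase is a permutation of the phase blocks (a special case of a $B$-phase). The function names are hypothetical.

```python
from itertools import permutations, product


def fifo_run(q, seq):
    """Apply an access sequence to a FIFO set (tuple, last-in block first)."""
    for b in seq:
        if b not in q:
            q = (b,) + q[:-1]
    return q


def theorem3_holds(k, universe):
    """Check A(s) <= C(q') or C_j(q') <= A(s) on a restricted search space."""
    states = list(permutations(universe, k))
    for n in range(1, k + 1):
        phase_blocks = universe[:n]
        phases = list(permutations(phase_blocks))
        for j in range(1, n + 1):
            for parts in product(phases, repeat=j):
                s = [b for phase in parts for b in phase]
                for q in states:
                    q2 = fifo_run(q, s)
                    all_cached = set(phase_blocks) <= set(q2)
                    covered = set(q2[:j]) <= set(phase_blocks)
                    if not (all_cached or covered):
                        return False
    return True


print(theorem3_holds(3, "abcxy"))
```

The same simulator shows why fewer than $n$ phases are not enough in general: running the two $\{a, b, c\}$-phases `"abc"` and `"acb"` on the initial state `("x", "a", "b")` ends in `("b", "c", "x")`, so `a` is not cached although two phases have been completed.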
As a corollary, after at most $|B|$ $B$-phases all blocks in $B$ must be cached: Since $|C_n(q')| = |A(s)| = n$, $C_n(q') \subseteq A(s)$ implies $C_n(q') = A(s)$. This result is tight: In general, $n - 1$ phases are not sufficient to guarantee hits, i.e. a miss can happen in the last phase, $s_n$.

[Figure 2: abstract LRU-stacks for the initial state and for the state after each access; each stack position $n$ is annotated with $pc(n), pp(n)$, and the accesses classified as hits are marked H.]

Fig. 2: Evolution of must-analysis information when processing $\langle a, a, b, c, b, c, b, a \rangle$. Each state consists of an abstract LRU-stack with annotated phase counters and phase progress ($pc(n), pp(n)$). The second access to $a$ must be a hit since there was one $\{a\}$-phase before. The last access to $b$ must be a hit since there were two $\{b, c\}$-phases before ($pc(2) = 2$, $L_2(\mathit{lru}) = \{b, c\}$). The whole sequence can be partitioned into two $\{a, b, c\}$-phases.

Sketch of the analysis: While the analysis processes memory accesses one-by-one, it virtually partitions the access sequence into phases. To predict hits using Theorem 3, the must-analysis proceeds in three stages: In the first two stages, it accumulates information by virtually partitioning the access sequence into phases. In the first stage, the phase blocks $B$ are determined, i.e. the memory accesses encountered in the program define the first $B$-phase. In the second stage, the analysis tries to detect another $|B| - 1$ $B$-phases. If this succeeds the analysis has detected $|B|$ $B$-phases in total and can proceed with the third stage. In the third stage, the analysis can exploit the accumulated information. It can predict hits for all accesses to blocks in $B$, until the first access to a block that is not contained in $B$.

If, at any time after the first stage, a block is accessed that does not belong to the phase blocks $B$, the analysis has to restart in stage one. This is because an access to such a block may evict any block contained in $B$ from the cache. Even after $|B|$ $B$-phases one only knows that all blocks in $B$ must be cached. There is no information about the positions of the blocks within the cache set. Hence, any block may reside in the first-in position.

To arrive at a viable analysis, one has to overcome a last obstacle: In stage one, it is not apparent how to choose the phase blocks $B$, i.e. when to end the first phase. Note, however, that for a given access sequence, the phase size uniquely determines the phase blocks $B$: For phase size $n$, the phase blocks are the $n$ most-recently-used blocks.
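The observation that the phase size determines the phase blocks can be illustrated with a concrete LRU stack. This is a minimal sketch: `lru_update` and `phase_blocks` are hypothetical helper names, and the stack is left unbounded for simplicity.

```python
def lru_update(stack, b):
    """Move block b to the top of an LRU stack (most recent block first)."""
    return [b] + [x for x in stack if x != b]


def phase_blocks(seq, n):
    """Phase blocks for phase size n: the n most-recently-used blocks."""
    stack = []
    for b in seq:
        stack = lru_update(stack, b)
    return set(stack[:n])


seq = "aabcbcb"
for n in range(1, 4):
    print(n, sorted(phase_blocks(seq, n)))
# 1 ['b']
# 2 ['b', 'c']
# 3 ['a', 'b', 'c']
```

The printed sets are nested, so one stack serves all phase sizes at once.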
If the phase size is chosen too large, completing stage two might take long or may never happen. If it is chosen too small, accesses to blocks not in $B$ might happen frequently, which forces the analysis to restart in stage one. To solve this problem, our must-analysis actually performs $k$ analyses in parallel, one for each phase size from 1 to $k$.

B. Efficient Must-analysis Implementation

Conceptually, each of the $k$ analyses has to maintain the following information:
- phase blocks $B \subseteq \mathcal{B}$, which are defined by stage one.
- phase progress $P \subseteq B \subseteq \mathcal{B}$, which contains the blocks that have already been accessed in the current phase. This is used in stage two to determine when a $B$-phase ends.
- phase counter $pc \in \mathbb{N}$, which counts the number of completed $B$-phases. It is used to decide when stage two ends.

Representing this information naively would result in an inefficient implementation. In this section we show how the $k$ analyses can be implemented space-efficiently by sharing information between them.

Each analysis has to maintain its phase blocks $B$. As soon as a block $b \notin B$ is accessed, the analysis has to start over. As noted above, the phase blocks for phase size $n$, $B_n$, are always the $n$ most-recently-used blocks. This implies that the blocks of a phase with larger size always subsume the blocks of a phase of smaller size, i.e. $B_n \supseteq B_{n-1}$. The data structure perfectly suitable for this kind of suffix-sharing is an LRU-stack: In an LRU-stack the $n$ most-recently-used blocks are contained in the $n$ topmost positions. Thus, a single LRU-stack of size $k$ is sufficient to represent the phase blocks for all phase sizes. Figure 2 shows how an LRU-stack changes upon accesses.

To generalize from access sequences to arbitrary control-flow, one needs an abstraction for LRU-stacks. The abstraction must allow for conservatively concluding when a phase ends, i.e. it must be able to tell when all phase blocks must have been accessed. This condition is fulfilled, for instance, by the LRU must-analysis of Ferdinand [2], which is the first constituent of our domain:

$$\mathit{lru} \in \mathit{Lru}_k := \mathcal{P}(\mathcal{B})^k$$

The LRU must-analysis maintains an abstract LRU must-stack with a set of blocks for each position.
To extract the (approximation of) phase blocks from the must-stack we define the function $L_n : \mathit{Lru}_k \to \mathcal{P}(\mathcal{B})$:

$$L_n([B_1, \ldots, B_k]) := \bigcup_{i=1}^{n} B_i$$

For short, $L(\mathit{lru}) := L_k(\mathit{lru})$. Our analysis builds upon the important invariant of the LRU must-analysis that $L_n(\mathit{lru})$ is an underapproximation of the set of the $n$ most-recently-used blocks. Hence, the phase blocks for phase size $n$ are approximated by $L_n(\mathit{lru})$.

Each of the $k$ analyses needs to count the number of detected phases. We represent this as a mapping from phase size to phase counter:

$$PC_k := \{ f : \{1, \ldots, k\} \to \{0, \ldots, k\} \mid \forall i : f(i) \leq i \}$$

For $pc \in PC_k$, $pc(n)$ is a lower bound on the number of detected $L_n(\mathit{lru})$-phases. The first column next to the stacks in Figure 2 shows the value of the phase counters.

To determine when a phase ends, each analysis has to maintain its phase progress $P$, i.e. the blocks that have already been accessed in the current phase. These blocks are always a subset of the corresponding phase blocks, $P_n \subseteq B_n$. More precisely, if $i$ pairwise different blocks have already been accessed in a phase, these blocks are contained in the $i$ topmost positions of the LRU-stack, i.e. $P_i = L_i(\mathit{lru})$. Hence, the phase progress for a single phase size can be represented by a pointer into the LRU-stack. For the $k$ phase sizes, there are $k$ pointers, which we represent as a mapping from phase size to phase progress:

$$PP_k := \{ f : \{1, \ldots, k\} \to \{0, \ldots, k-1\} \mid \forall i : f(i) < i \}$$

For $pp \in PP_k$, $pp(n)$ indicates the phase progress for phase size $n$. The blocks contained in the $pp(n)$ topmost positions in the LRU-stack, $L_{pp(n)}(\mathit{lru})$, are an underapproximation of the set of blocks that have been accessed in the current phase. The second column next to each stack in Figure 2 shows the value of the phase progress for each phase size.

In summary, our must-analysis consists of $k$ analyses. The phase blocks of all $k$ analyses are managed collectively by a single LRU must-analysis $\mathit{Lru}_k$. The $k$ phase progresses are represented by pointers $PP_k$ into the LRU must-stack, and there are $k$ phase counters $PC_k$. The domain of our must-analysis is:

$$PMust_k := \mathit{Lru}_k \times PP_k \times PC_k$$

Concretization: The set of concrete cache sets represented by our must-information is given by the concretization function. Since the must-analysis comprises $k$ analyses for different sets of phase blocks, the concretization is the intersection of the $k$ respective concretizations:

$$\gamma_{PMust_k} : PMust_k \to \mathcal{P}(Q_k)$$
$$\gamma_{PMust_k}((\mathit{lru}, pp, pc)) := \bigcap_{1 \leq n \leq k} \gamma_n(L_n(\mathit{lru}), L_{pp(n)}(\mathit{lru}), pc(n))$$
$$\gamma_n : \mathcal{P}(\mathcal{B}) \times \mathcal{P}(\mathcal{B}) \times \mathbb{N} \to \mathcal{P}(Q_k)$$
$$\gamma_n(B, P, p) := \{ q \in Q_k \mid B \subseteq C(q) \lor C_{p+m}(q) \subseteq B,\; m := |P \setminus C(q)| \}$$

Due to Theorem 3 either all phase blocks are cached ($B \subseteq C(q)$) or at least $p$ misses must have happened. For each block that was accessed in the current phase but is not cached ($P \setminus C(q)$) an additional miss must have happened. Hence, the $p + m$ last-in positions must contain phase blocks ($C_{p+m}(q) \subseteq B$).

Classification: The function $\mathcal{C} : PMust_k \times \mathcal{B} \to \mathit{Class}$ shows how our must-analysis classifies memory accesses.

$$\mathcal{C}((\mathit{lru}, pp, pc), b) := \begin{cases} H & : \exists i : pc(i) = i,\; b \in L_i(\mathit{lru}) \\ M & : pc(k) = k,\; b \notin L(\mathit{lru}) \\ \top & : \text{otherwise} \end{cases}$$

The analysis can classify a hit for an access to block $b$ if it detected $i$ phases of size $i$ and $b$ belongs to the corresponding phase blocks (case 1). See Figure 2 for examples. If $k$ blocks are cached, no other block may be cached (case 2).
In this special case the analysis can predict misses for all blocks not belonging to the phase blocks for phase size $k$. Otherwise, the analysis cannot classify the access (case 3).

Update: The update function is defined component-wise:

$$U_{PMust_k} : PMust_k \times \mathcal{B} \to PMust_k$$
$$U_{PMust_k}((\mathit{lru}, pp, pc), b) := (\mathit{lru}', pp'', pc'')$$
$$\mathit{lru}' := U_{\mathit{Lru}_k}(\mathit{lru}, b)$$
$$pp' := \lambda n. \begin{cases} pp(n) & : b \in L_{pp(n)}(\mathit{lru}) \\ pp(n) + 1 & : b \in L_n(\mathit{lru}) \setminus L_{pp(n)}(\mathit{lru}) \\ ms(\mathit{lru}', n) & : b \notin L_n(\mathit{lru}) \end{cases}$$
$$pc' := \lambda n. \begin{cases} 0 & : b \notin L_n(\mathit{lru}) \\ pc(n) & : \text{otherwise} \end{cases}$$
$$pc'' := \lambda n. \begin{cases} pc'(n) + 1 & : pp'(n) = n,\; pc'(n) < n \\ pc'(n) & : \text{otherwise} \end{cases}$$
$$pp'' := \lambda n. \begin{cases} 0 & : pp'(n) = n \\ pp'(n) & : \text{otherwise} \end{cases}$$

For the LRU must-stack, the normal update of the LRU-analysis can be reused, which maintains the invariant described above. For an explanation of how the abstract LRU-stack is updated the interested reader is referred to [2].

$pp$ and $pc$ are updated in two steps. The first step ($pp'$ and $pc'$) has three cases:
1) In the first case, $b$ is a phase block but has already been accessed in the current phase ($b \in L_{pp(n)}(\mathit{lru})$); there is no phase progress. See the second access to $a$ in Figure 2.
2) In the second case, $b$ is a phase block and has not been accessed in the current phase; the phase progress can be incremented. E.g. the second access to $b$ in Figure 2 increments $pp(2)$ and $pp(3)$.
3) In the last case, the accessed block is not a phase block ($b \notin L_n(\mathit{lru})$); the phase blocks change and the analysis has to restart in stage one. In Figure 2, the last access to $a$ enforces a restart for phase sizes 2 and 1. Although the phase counter is reset first ($pc'(n) = 0$), the analysis does not have to start from scratch. Instead of starting a new phase with only $b$ as phase progress, the (up to) $n$ most-recently-used blocks become the new phase progress. These might be enough to complete a new first phase.

The analysis completes a phase of size $n$ if $pp(n) = n$. Hence, at that point $|L_n(\mathit{lru})| = n$ must hold. However, after a join it may be that $|L_n(\mathit{lru})| < n$. To prevent the case $|L_{pp(n)}(\mathit{lru})| < pp(n) = n$ we make sure that $pp(n) \leq |L_{pp(n)}(\mathit{lru})|$ holds at any time. To do so, the function $ms : \mathit{Lru}_k \times \mathbb{N} \to \mathbb{N}$ computes a maximal concrete sub-stack.
For a given position $p$ it computes the maximal position $n$ in the stack such that the sub-stack up to position $n$ contains exactly $n$ blocks: $ms(\mathit{lru}, p) := \max\{ n \mid |L_n(\mathit{lru})| = n \leq p \}$.

The second step of the update ($pp''$ and $pc''$) handles phase completion. If a phase is completed ($pp'(n) = n$), the phase counter is incremented and the phase progress is reset to 0.

Join: For the LRU-stack, the join function of the $\mathit{Lru}_k$ domain can be reused.
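For straight-line access sequences, where no joins occur, classification and update can be tried out on a concrete LRU stack. The following Python sketch rests on assumptions: it replaces the abstract must-stack by a concrete list (so $ms(\mathit{lru}', n)$ reduces to `min(n, len(stack))`), it starts from an empty stack with all counters at zero, and all identifiers are chosen for illustration only.

```python
def lru_update(stack, b, k):
    """Concrete LRU stack of at most k blocks, most recent block first."""
    return ([b] + [x for x in stack if x != b])[:k]


def must_analysis(seq, k):
    """Classify each access as 'H', 'M' or 'T' (unclassified).

    Returns the classifications plus the final phase counters pc and
    phase progress pp, both indexed by phase size 1..k.
    """
    stack = []
    pp = {n: 0 for n in range(1, k + 1)}
    pc = {n: 0 for n in range(1, k + 1)}
    result = []
    for b in seq:
        # Classification, using the state before the access.
        if any(pc[i] == i and b in stack[:i] for i in pc):
            result.append("H")
        elif pc[k] == k and b not in stack[:k]:
            result.append("M")
        else:
            result.append("T")
        new_stack = lru_update(stack, b, k)
        for n in range(1, k + 1):
            if b in stack[:pp[n]]:
                pass                              # case 1: no progress
            elif b in stack[:n]:
                pp[n] += 1                        # case 2: progress
            else:
                pc[n] = 0                         # case 3: restart
                pp[n] = min(n, len(new_stack))    # ms on a concrete stack
            if pp[n] == n:                        # second step: completion
                pc[n] = min(pc[n] + 1, n)
                pp[n] = 0
        stack = new_stack
    return result, pc, pp


classes, pc, pp = must_analysis("aabcbcba", 4)
print("".join(classes))   # THTTTTHT
print(pc)                 # {1: 1, 2: 1, 3: 2, 4: 0}
```

On this sequence the sketch predicts hits for the second access to `a` and for the last access to `b`, and it ends with two detected phases of size 3.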

$$J_{PMust_k} : PMust_k \times PMust_k \to PMust_k$$
$$J_{PMust_k}((\mathit{lru}_1, pp_1, pc_1), (\mathit{lru}_2, pp_2, pc_2)) := (\mathit{lru}', pp'', pc')$$
$$\mathit{lru}' := J_{\mathit{Lru}_k}(\mathit{lru}_1, \mathit{lru}_2)$$
$$pc' := \lambda n. \begin{cases} \min\{pc_1(n), pc_2(n)\} & : |L_n(\mathit{lru}')| = n \\ 0 & : \text{otherwise} \end{cases}$$
$$pp' := \lambda n. \begin{cases} \min\{pp_1(n), pp_2(n)\} & : pc_1(n) = pc_2(n) \\ pp_1(n) & : pc_1(n) < pc_2(n) \\ pp_2(n) & : pc_1(n) > pc_2(n) \end{cases}$$
$$pp'' := \lambda n. \begin{cases} ms(\mathit{lru}', pp'(n)) & : |L_n(\mathit{lru}')| = n \\ ms(\mathit{lru}', n) & : \text{otherwise} \end{cases}$$

The updates of $pp$ and $pc$ have two main cases: If the sub-stack still contains $n$ elements after the join ($|L_n(\mathit{lru}')| = n$), the minimum of the respective values is the join result. Otherwise, the analysis has to restart with $pc = 0$, $pp = ms(\mathit{lru}', n)$ as in one case of the update. The three cases in $pp'$ stem from the fact that one actually takes the minimum of the overall progress, $pp$ and $pc$ combined, and not the minimum of the individual values. E.g. if $pc_1(n) > pc_2(n)$ then $pc_1, pp_1$ is the greater overall progress. Hence $pp_2$ is the smaller phase progress, regardless of $pp_1$.

C. Predicting Misses: Challenge & Idea

Predicting misses for FIFO is also more difficult than predicting misses for LRU: Consider the access sequence $s = \langle a, b, c, d, e \rangle$. If $s$ is conducted on a 4-way associative cache set with LRU replacement, the resulting cache set will consist of the 4 most-recently-used memory blocks $\{b, c, d, e\}$ independently of the set's initial state. Subsequent accesses to other memory blocks can be classified as cache misses.

For FIFO, this is not necessarily the case: If $s$ is conducted on the initial state $[f, g, h, i]$ with FIFO replacement, the resulting cache set will be $U_{Q_k}([f, g, h, i], \langle a, b, c, d, e \rangle) = [e, d, c, b]$, which has the same contents as in the LRU case. On the other hand, for another initial state, $U_{Q_k}([f, a, b, c], s) = [e, d, f, a]$. In the latter case, the memory block $f$, which is not contained in the sequence $s$, has survived in the cache set. As any memory block could have survived in place of $f$, no may-information is available, i.e. it is not possible to classify accesses as misses at this point.
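The contrast between the two policies in this example can be replayed with two small simulators. This is an illustrative sketch: the function names are made up, FIFO sets are tuples ordered from last-in to first-in, and LRU sets are tuples ordered from most to least recently used.

```python
def fifo_run(q, seq):
    """FIFO: a hit leaves the state unchanged, a miss inserts in front."""
    for b in seq:
        if b not in q:
            q = (b,) + q[:-1]
    return q


def lru_run(q, seq):
    """LRU: every access moves the block to the front."""
    for b in seq:
        if b in q:
            q = (b,) + tuple(x for x in q if x != b)
        else:
            q = (b,) + q[:-1]
    return q


s = "abcde"
for q0 in [("f", "g", "h", "i"), ("f", "a", "b", "c")]:
    print(q0, "FIFO:", fifo_run(q0, s), "LRU:", lru_run(q0, s))
```

Under LRU both initial states end with the contents $\{b, c, d, e\}$, whereas under FIFO the second initial state keeps `f` in the set.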
However, some knowledge can be inferred about the state of the FIFO cache set after conducting $s$: Since $s$ contains $l = 5 \geq k = 4$ different memory blocks, there must have been at least $l - k = 5 - 4 = 1$ cache miss. So the $l - k$ last-in positions must contain blocks of the set $\{a, b, c, d, e\}$. We say that the positions are covered by the set.

In fact, one does know a bit more about the state of the cache than that. Assume that there were exactly $l - k$ misses on the sequence. Then, there must have also been hits to $k$ different memory blocks. In that case, the contents of the cache set would be completely covered by the contents of the sequence. Otherwise, there must have been at least $l - k + 1$ misses. So, in any case, the $l - k + 1$ last-in positions must be covered by the contents of $s$:

Lemma 4. Let $s \in S$, $|A(s)| = n \geq k$. Then, for each $q \in Q_k$ and $q' := U_{Q_k}(q, s)$:

$$C_{n-k+1}(q') \subseteq A(s)$$

As in the must-analysis, the idea for the may-analysis is to split the access sequences into several phases. Each phase will contribute a bit to the overall goal of predicting misses. Lemma 4 shows the progress that can be achieved in a single phase. In contrast to the must-analysis, different phases can contribute differently. A single, long phase $s$ with $n = |A(s)| = 2k - 1$ blocks alone can provide may-information: $n - k + 1 = k$. Therefore $C_k(q') \subseteq A(s)$. A short phase with $|A(s)| = k$, on the other hand, provides only little progress towards predicting misses. The following theorem shows that the progress of separate phases adds up:

Theorem 5. Let $s$ be an access sequence that can be partitioned into phases $s := s_1 \circ \ldots \circ s_j$ such that each $s_i$ contains accesses to at least $k$ pairwise different blocks: $|A(s_i)| = n_i \geq k$. Then for each $q \in Q_k$ and $q' := U_{Q_k}(q, s)$:

$$C_{\sum_{i=1}^{j} (n_i - k + 1)}(q') \subseteq A(s) = \bigcup_i A(s_i)$$

This is similar to Theorem 3 for the must-analysis. However, there is one important difference: For the may-analysis it is not necessary that all phases access the same blocks, and hence phases may be of different size.

Those relaxed conditions for the may-analysis entail an additional degree of freedom: The may-analysis can finish each phase profitably as soon as it contains at least $k$ blocks ($|A(s_i)| = n_i \geq k$).
Depending on the following accesses, it may be beneficial or detrimental to the precision of the analysis to do so. For an example, consider a 4-way associative cache set $q$ and the sequences $s_1 = \langle a, b, c, d, e, a, b, c, d, e \rangle$ and $s_2 = \langle a, b, c, d, e, f, g, a, b, c \rangle$, which have a common prefix. For $s_1$ it is best to partition it into two phases $\langle a, b, c, d, e \rangle \circ \langle a, b, c, d, e \rangle$ of size 5 each, which results in $C_{2+2}(q') \subseteq \{a, b, c, d, e\}$. For $s_2$ it is better to partition into $\langle a, b, c, d, e, f, g \rangle \circ \langle a, b, c \rangle$. This way, Theorem 5 already guarantees $C_4(q') \subseteq \{a, b, c, d, e, f, g\}$ after the first phase of size 7. Again, we resolve this dependency on future accesses by efficiently performing multiple analyses in parallel.

D. Efficient May-analysis Implementation

As in the must-analysis, one can use an LRU-stack to collectively represent the phase blocks for all phase sizes. However, the abstraction of the stack has to be different for the may-analysis: The respective (sub-)stack contents must be a superset of the accumulated phase blocks of all phases, i.e. of the set $A(s) = \bigcup_i A(s_i)$ in Theorem 5. This way, the analysis can soundly predict misses for all blocks not contained in the substack. Consequently, the first constituent of the may-domain is the domain of the LRU may-analysis of [2]: $$lru \in Lru_k := \mathcal{P}(B)^k$$
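The bound of Theorem 5 for the partitions of $s_1$ and $s_2$ discussed in Section III-C can be checked on concrete FIFO states. The sketch below is an illustration, not part of the analysis: it computes $\sum_i (n_i - k + 1)$ for a given partition and then tests, for randomly drawn initial states, that this many last-in positions only hold accessed blocks. The helper names (`fifo_run`, `covered_ways`, `check_bound`), the filler blocks u to z, and the random sampling are assumptions made here.

```python
import random


def fifo_run(state, seq):
    """Concrete FIFO set; state lists blocks from last-in to first-in."""
    state = list(state)
    for block in seq:
        if block not in state:               # miss: insert last-in, evict first-in
            state = [block] + state[:-1]
    return state


def covered_ways(phases, k):
    """Sum of (n_i - k + 1) over all phases, capped at the associativity k."""
    total = 0
    for phase in phases:
        n_i = len(set(phase))
        assert n_i >= k, "each phase needs at least k distinct blocks"
        total += n_i - k + 1
    return min(total, k)


def check_bound(phases, k, trials=1000, seed=0):
    """Test on random initial states that the covered ways hold accessed blocks."""
    rng = random.Random(seed)
    seq = [block for phase in phases for block in phase]
    accessed = set(seq)
    bound = covered_ways(phases, k)
    universe = sorted(accessed) + ["u", "v", "w", "x", "y", "z"]  # filler blocks
    for _ in range(trials):
        final = fifo_run(rng.sample(universe, k), seq)
        if not set(final[:bound]) <= accessed:
            return False
    return True


two_phases = [list("abcde"), list("abcde")]   # partition used for s_1
one_long_phase = [list("abcdefg")]            # first phase of s_2
print(covered_ways(two_phases, 4), check_bound(two_phases, 4))
print(covered_ways(one_long_phase, 4), check_bound(one_long_phase, 4))
```

Both partitions reach four covered ways for k = 4, so after either sequence every way of the set holds a block of the respective phase blocks. A passing random test is of course no proof; it only exercises the statement of the theorem.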

To extract the (approximation of) accumulated phase blocks from the may-stack we define the function $L_n : Lru_k \to \mathcal{P}(B)$: $$L_n([B_1, \dots, B_k]) := \bigcup_{i=1}^{n} B_i$$ For short, $L(lru) := L_k(lru)$. The important invariant of the LRU may-analysis is that $L_n(lru)$ is an overapproximation of the set of the $n$ most-recently-used blocks. Although Theorem 5 only requires $n_i \geq k$, i.e. the phase size $n_i$ is not bounded, one can limit the stack size. After one phase of size $2k - 1$, Theorem 5 guarantees $C_{2k-1-k+1}(q) = C(q) \subseteq A(s)$, i.e. no blocks other than those of the sequence $s$ may be cached. Hence, it would be redundant to consider phase sizes larger than $2k - 1$. So we will use $Lru_{2k-1}$.

There is no direct analogy to the phase counter of the must-analysis: Since in the may-analysis phases may be of different size, and different sizes induce different progress in analysis information, it would not be useful to simply count the number of phases. Instead, the may-analysis keeps track of its progress by counting the number of covered ways: The term $C_{\sum_{i=1}^{j}(n_i - k + 1)}(q') \subseteq A(s)$ of Theorem 5 reflects that. The first $\sum_{i=1}^{j}(n_i - k + 1)$ ways of the cache set $q'$ are covered by the set $A(s)$. $A(s)$ is in turn over-approximated using $L_n(lru)$, where $n = |A(s)|$. If the number of covered ways reaches $k$, the analysis can predict misses for blocks not contained in the accumulated phase blocks $L_n(lru)$.

Due to the additional degree of freedom described in Section III-C, it is beneficial to allow for multiple phase-progress values for each set of accumulated phase blocks. This way, the analysis can always follow both options: finishing a phase and starting a new one, which is beneficial if a small set of memory blocks is repeatedly accessed as in $s_1 = \langle a, b, c, d, e, a, b, c, d, e \rangle$; and continuing a phase, which is beneficial if a large number of different memory blocks are accessed in a short period as in $s_2 = \langle a, b, c, d, e, f, g, a, b, c \rangle$. Once a phase is finished, the partitioning of the access sequence up to this point becomes irrelevant. What matters is the number of covered ways $\sum_i (n_i - k + 1)$ provided by the partitioning. When different partitions finish their current phase after the same access, we keep only the greatest number of covered ways.
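The idea of following both options and keeping only the greatest number of covered ways can be illustrated on a concrete access sequence with a small dynamic program. This is a sketch under assumptions and not the abstract domain itself: it works on a known sequence rather than on abstract LRU stacks, the function name `best_covered_ways` is hypothetical, and a prefix that cannot be split into complete phases is simply assigned 0 covered ways.

```python
def best_covered_ways(seq, k):
    """For every prefix of seq, the greatest sum of (n_i - k + 1) over all
    partitions of that prefix into phases with at least k distinct blocks.

    Returns a list indexed by prefix length, with values capped at k.
    """
    n = len(seq)
    best = [None] * (n + 1)   # None: prefix has no partition into full phases
    best[0] = 0               # the empty prefix needs no phase
    for end in range(1, n + 1):
        distinct = set()
        # Try every candidate last phase seq[start:end], growing it leftwards.
        for start in range(end - 1, -1, -1):
            distinct.add(seq[start])
            if len(distinct) >= k and best[start] is not None:
                gain = best[start] + len(distinct) - k + 1
                if best[end] is None or gain > best[end]:
                    best[end] = gain   # keep only the greatest value
    return [0 if b is None else min(b, k) for b in best]


w_s1 = best_covered_ways(list("abcdeabcde"), 4)
w_s2 = best_covered_ways(list("abcdefgabc"), 4)
print(w_s1)   # progress along s_1, prefix by prefix
print(w_s2)   # progress along s_2, prefix by prefix
```

For $s_1$ the maximum of 4 is only reached at the end, by finishing a phase after the first five accesses, while for $s_2$ it is already reached after seven accesses by continuing the first phase.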
The information about covered ways can be represented as a mapping from accumulated phase blocks and phase progress to covered ways: $$CW_k := \{1, \dots, 2k - 1\} \times \{0, \dots, 2k - 1\} \to \{0, \dots, k\}$$ For $w \in CW_k$, $w(p, pp)$ is a lower bound on the number of last-in cache set ways that are covered by blocks contained in the LRU substack $L_p(lru)$. For a given phase progress $pp$, $L_{pp}(lru)$ is an overapproximation of the blocks that have been accessed in the current phase. If a block $b \notin L_{pp}(lru)$ is accessed, the phase progress can safely be incremented. Altogether, the domain of the may-analysis is: $$PMay_k := Lru_{2k-1} \times CW_k$$

Example: The analysis maintains the number of covered ways $w(p, pp)$ for each set of accumulated phase blocks, overapproximated by $L_p(lru)$, and phase progress, overapproximated by $L_{pp}(lru)$. The following shows an access sequence with examples of may-analysis information of a 4-way cache set. The example only considers the blocks $\{a, b, c, d, e, f\}$, which are contained in the top $p = 6$ LRU-stack positions, and different partitions of those into phases.

[Figure: an access sequence over the blocks x, y, z and a to f, annotated for p = 6 with pp = 0, w(6, 0) = 4; pp = 2, w(6, 2) = 3; pp = 5, w(6, 5) = 4.]

If the phase progress at the end of the sequence is $pp = 5$, the current phase started with the second access to one of the blocks a, b, c. Up to that point, the best partitioning provided 2 covered ways. The phase progress $pp = 5$ of the current phase provides an additional 2 covered ways. In total, $w(6, 5) = 4$ at the end of the sequence. If the previous phase was finished after the access to $f$, the best $w$-value was 3. In this case, only $e$ and $d$ have been accessed in the current phase, which provides no additional guarantee. Hence, $w(6, 2) = 3$. If a phase was finished after the last access, the best partition up to this point provides $w(6, 0) = 4$. Note that phases can always be finished, i.e. $\forall i : w(p, 0) \geq w(p, i)$.

Concretization: As the may-analysis performs multiple sub-analyses in parallel, an abstract element may also be interpreted as a conjunction of constraints, one constraint per sub-analysis. The information maintained in each sub-analysis is the blocks of the phase, the blocks of the phase progress, the phase progress, and the number of covered ways.
The function $CS$ computes the constraint set of an abstract element: $$CS((lru, w)) := \{(L_p(lru), L_{pp}(lru), pp, w(p, pp)) \mid 1 \leq p < 2k,\ 0 \leq pp \leq p\}$$ In the example above, $(\{a, b, c, d, e, f\}, \{d, e\}, 2, 3)$ is contained for $p = 6$, $pp = 2$. As each constraint in a constraint set holds true, the concretization of an abstract element is the intersection of all concretizations of the individual constraints: $$\gamma_{PMay_k} : PMay_k \to \mathcal{P}(Q_k)$$ $$\gamma_{PMay_k}((lru, w)) := \bigcap_{c \in CS((lru, w))} \gamma_w(c)$$ The concretization of a single constraint is given by: $$\gamma_w : \mathcal{P}(B) \times \mathcal{P}(B) \times \mathbb{N} \times \mathbb{N} \to \mathcal{P}(Q_k)$$ $$\gamma_w(B, P, pp, w) := \{q \in Q_k \mid C_{w+a}(q) \subseteq B\}, \quad a := \max\{0, \min\{pp, k - 1\} - |P \cap C(q)|\}$$ where $a$ are additionally covered ways that are not already accounted for by $w$. $pp$ blocks have been accessed in the current phase, and at most $|P \cap C(q)|$ of them are still contained in the cache. Hence, $pp - |P \cap C(q)|$ is a lower bound on the number of additional misses that have happened. For each miss, a last-in position is covered. However, the value of $w$ already reflects the number of covered ways due to $pp$ being larger than $k - 1$. The min ensures that no miss is counted twice.

Classification: The analysis can classify a miss for $b$ if $b$ is not contained in a set of accumulated phase blocks ($b \notin L_p(lru)$) and those blocks cover all $k$ ways of the cache set ($w(p, 0) = k$). As $w(p, 0) \geq w(p, i)$, it is sufficient to only check this value in the condition of the classification function $\mathcal{C} : PMay_k \times B \to Class$: $$\mathcal{C}((lru, w), b) := \begin{cases} M & : \exists p : b \notin L_p(lru),\ w(p, 0) = k \\ \top & : \text{otherwise} \end{cases}$$

Join: To define the join and update operations it is necessary to define a partial order on the constraints. Let $c_1, c_2$ be constraints of the form $(B, P, pp, w)$. The partial order on these constraints is defined by set inclusion of their respective concretizations: $c_1 \sqsubseteq c_2 :\Leftrightarrow \gamma_w(c_1) \subseteq \gamma_w(c_2)$. With the concretization function one can find that: $$(B, P, pp, w) \sqsubseteq (B', P', pp', w') \Leftarrow B \subseteq B' \wedge (pp' = 0 \vee P \subseteq P') \wedge w' \leq w + \min\{0, \min\{k - 1, pp\} - \min\{k - 1, pp'\}\}$$ A guarantee is weaker if fewer ways ($w' \leq w$) are covered, or if more memory blocks ($B \subseteq B'$) cover the same number of ways, or the same $B$ covers the same number of ways but with higher phase progress ($P \subseteq P'$). As one can always end a phase, the phase progress does not matter for $pp' = 0$. The join function first computes the joined LRU-stack $lru'$. For each pair of sub-stacks induced by $p$ and $pp$, each operand provides a best guarantee on the number of covered ways ($w'_i$). To be sound, the join has to take the minimum of those two best guarantees. $$J_{PMay_k} : PMay_k \times PMay_k \to PMay_k$$ $$J_{PMay_k}((lru_1, w_1), (lru_2, w_2)) := (lru', w')$$ $$lru' := J_{Lru_k}(lru_1, lru_2)$$ $$w' := \lambda p.\lambda pp.\ \min\{w'_1, w'_2\}$$ To be optimal, one has to maximize $w'_i$ over all pairs of substacks an operand provides, which can be further simplified. $$w'_i := \max\{n \mid c \in CS(lru_i, w_i),\ c \sqsubseteq (L_p(lru'), L_{pp}(lru'), pp, n)\}$$

Update: Like the join, which determines the best guarantees provided by the operands, the update can also be explained in terms of guarantees. The procedure is the same: One has to compute guarantees for an updated LRU-stack.
The difference to the join is that old guarantees provided by the operand need to be updated first. As the may-update can profit from it, we distinguish between different classifications $cl$. $$U_{PMay_k} : PMay_k \times B \times Class \to PMay_k$$ $$U_{PMay_k}((lru, w), b, cl) := (lru', w')$$ $$lru' := U_{Lru_k}(lru, b)$$ $$w' := \lambda p.\lambda pp.\ \max\{n \mid c \in CS(lru, w),\ c' \in U_w(c, b, cl),\ c' \sqsubseteq (L_p(lru'), L_{pp}(lru'), pp, n)\}$$ The update of a constraint in case of a miss is: $$U_w((B, P, pp, w), b, M) := \{(\{b\}, \{b\}, 1, 1),\ (B \cup \{b\}, \{b\}, 1, w + 1),\ (B \cup \{b\}, P \cup \{b\}, pp, w + 1)\}$$ The three resulting constraints correspond to a complete restart, starting a new phase, and continuing the phase. In each case, $w$ can be incremented when $b$ is added to $B$. The update of a constraint in case of a hit is: $$U_w((B, P, pp, w), b, H) := \{(\emptyset, \{b\}, 1, 0),\ (B, \{b\}, 1, w)\} \cup \begin{cases} \{(B, P, pp, w)\} & : b \in P \\ \{(B, P \cup \{b\}, pp + 1, w)\} & : b \notin P,\ pp < k - 1 \\ \{(B \cup P \cup \{b\}, P \cup \{b\}, pp + 1, w + 1)\} & : b \notin P,\ pp \geq k - 1 \end{cases}$$ The first two resulting constraints correspond to a complete restart and starting a new phase. The third constraint, continuing the phase, depends on further conditions: If the accessed block has already been accessed ($b \in P$) there is no progress. If $b \notin P$, $b$ is added to the phase and the phase progress is incremented ($pp + 1$). If furthermore $pp \geq k - 1$, one more way must be covered by $B$ together with one of the blocks accessed in this phase ($P \cup \{b\}$).

IV. Related Work and Qualitative Comparison

We limit our general discussion about cache analyses to those directed at WCET analysis, which include [12], [3], [13], [2]. Mueller et al. [12] present static cache simulation for direct-mapped instruction caches. It classifies instructions as always-miss, always-hit, first-miss, or conflict. White et al. [3] extend this work to data caches, where the main challenges lie in the analysis of accessed addresses. Furthermore, an instruction cache analysis for set-associative LRU caches is sketched. Li et al. [13] present a timing analysis based on integer linear programming (ILP) formulations. It can handle set-associative caches by encoding their concrete semantics using linear constraints. However, since this approach integrates pipeline, cache, and path analysis into one ILP, it suffers from complexity problems. In practice it is limited to direct-mapped caches and simple pipelines.
Ferdinand et al. [2] introduce the concepts of may- and must-caches and present a precise, context-sensitive LRU analysis based on abstract interpretation. As explained in Section III, statically analyzing FIFO is inherently more difficult than analyzing LRU. LRU-analysis can be seen as a special case of phase-detecting analyses: For LRU it is sufficient to detect one phase to prove that the associated phase blocks are cached, instead of $n$ phases for phase size $n$.

The concept of relative competitiveness (RC) [8] bounds the performance of one replacement policy relative to the performance of another one. Under certain conditions, this allows for using cache analyses for one policy as cache analyses for other policies. For instance, an LRU may-analysis for a $(2k - 1)$-way associative cache can be reused as a may-analysis for $k$-way FIFO. Likewise, an LRU must-analysis for a 1-way associative cache (direct mapped) can be used as a must-analysis for $k$-way FIFO. Due to the generic nature of this approach, however, the resulting analyses may be rather imprecise.

There is a striking relation between those RC-analyses and the analyses based on phases. PMUST and PMAY consist of a spectrum of sub-analyses for phase sizes $1 \dots k$ and $1 \dots 2k - 1$, respectively. The analyses based on relative competitiveness mark the extremal points of this spectrum. The PMAY sub-analysis for phase size $2k - 1$ coincides with the RC may-analysis and the PMUST sub-analysis for phase size 1 coincides with the RC must-analysis.

The analyses PMUST and PMAY presented in this paper gain information by global observations in the following sense: They observe several accesses (a phase) and deduce that some property holds for a subset of those accesses, e.g. at least one of the accessed blocks must be cached but its position within the cache set is unknown. In contrast, both analyses presented in [6] are aimed at exploiting local miss classifications: If an individual access is predicted as a miss, its position is known (last-in position) and it will take $k$ misses to evict it. In [6], the canonical must-analysis, CM, exploits a miss classification for a block $b$ by predicting hits for $b$ until $k$ further misses might have happened. CM is complemented by the early-miss exploiting may-analysis, EMX, which provides miss classifications. However, before EMX can predict any misses, it needs to observe accesses to $2k$ pairwise different blocks. If no misses are predicted by EMX, CM can only predict trivial hits, i.e. for subsequent accesses to the same block. As a consequence, CM can only predict non-trivial hits if more than $2k$ pairwise different blocks are accessed.
In contrast, PMUST can infer precise must-information for programs where no static analysis could classify an access as a miss. Additionally, PMAY can classify misses without having to observe accesses to $2k$ pairwise different blocks. Accesses to $k + 1$ can be sufficient.

In the related field of memory management, Madison and Batson [14] propose bounded locality intervals to capture the intuitive notion of program phases. To determine the bounded locality intervals, they introduce a dynamic analysis that uses an extended LRU stack. Besides the referenced elements, this stack additionally keeps track of reference times of elements in substacks. Their stack and its update are similar to ours; however, the stacks are annotated differently for different purposes.

V. Quantitative Evaluation

We compare three FIFO analyses with each other and to the collecting semantics:

[Fig. 3: Average hit- and miss-rates guaranteed by the analyses for an 8-way cache set. x-axis: n (1 to 30); y-axis: percentage of accesses (0% to 100%), split into Hits, Unknown, and Misses; curves: RC, EMX+CM, PD+CM, COLLSEM.]

RC: combination of must- and may-analysis, both solely based on relative competitiveness [8] as explained in Section IV.
EMX+CM: the analysis presented in [6], which consists of EMX and CM as explained in Section IV.
PD+CM: the phase-detecting analyses proposed in this paper, PMUST and PMAY, combined with CM.
COLLSEM: the collecting semantics, which determines the set of cache set states that may reach a program point. If a memory access cannot be classified as hit or miss in the collecting semantics, no sound static analysis can do so. We computed this using an expensive analysis based on a powerset domain of concrete cache-set states.

To quantify the precision of the analyses, we analyzed random access sequences; the same as in [6]. For each $1 \leq n \leq 31$, we generated 100 random access sequences that contain 500 accesses to $n$ pairwise different blocks. Hence, the greater $n$, the lower is the locality. Figure 3 shows the results, i.e. hit- and miss-rates guaranteed by the four analyses. The shape of the plot marks identifies the analysis, e.g. circles for RC. The number of different blocks ($n$) in the generated access sequences is plotted against the x-axis.
The percentage of classifications ($H$, $\top$, $M$) is plotted against the y-axis. For each analysis there are two curves, which partition the 100%. The lower curve, with filled plot marks, shows the guaranteed hit-rate. The upper one, with empty plot marks, is plotted top-down (from 100% downwards) and shows the guaranteed miss-rate. The difference between the upper and the lower curve gives the percentage of unclassified accesses. For example, consider the squares at $n = 15$: For 100 access sequences, each 500 accesses long and containing $n = 15$ distinct blocks, the average guaranteed hit-rate obtained by PD+CM was 12%, the average guaranteed miss-rate was 9%, and on average 79% could not be classified. For a discussion of the collecting semantics, we refer the reader to [6]. Using analyses that are only based on relative competitiveness considerably overapproximates the collecting semantics, as the gap between RC and COLLSEM shows.

Both RC and EMX+CM cannot predict any misses with fewer than $2k$ pairwise different blocks. Hence, the curves of RC and EMX+CM coincide up to $n = 15$. For larger $n$, EMX predicts more misses than RC, and due to cooperation with CM more hits can be predicted. This is the main contribution of [6] regarding analysis precision.

PMUST predicts a large fraction of the hits for $n \leq k$ and closes the gap to the collecting semantics. This is particularly important for loops in which different blocks are reused (temporal locality). Consider a loop that iterates $i$ times and accesses $n \leq k$ different blocks, i.e. $s = \langle b_1, \dots, b_n \rangle^i$. An analysis based on PMUST will result in a guaranteed hit-rate of $100(1 - \frac{n}{i})$ percent (e.g. 87.5% for $i = 32$ and $n = 4$). In contrast, EMX+CM cannot predict a single hit. Subsequent accesses to the same block (spatial locality), the trivial hits, can be predicted equally well by CM and PMUST. For $1 \leq n \leq k$, the predicted hits are solely due to PMUST. Starting with $n = k + 1$, both PMUST and CM predict hits.

PMAY is incomparable to EMX, i.e. it is not better for all inputs. However, it can predict misses with fewer pairwise different blocks; the upper curve already starts decreasing at $n = k + 1$. For larger $n$, misses happen relatively often and EMX performs slightly better. However, PMAY can be implemented more efficiently than EMX: While EMX needs to maintain independent sets of memory blocks, PMAY uses sets that can be encoded as substacks of a single LRU stack. This roughly corresponds to a reduction in space from $O(k^2)$ to $O(k)$.

VI. Application in WCET Analysis

As in the case of abstract LRU domains, employing the phase-detecting FIFO domains in WCET analysis requires special attention regarding the analysis of loops. Even for simple loops, like for (1..n) { a b }, the bare analysis as described above will not be able to classify any hits. This is because of the join in the loop header: The number of observed $\{a, b\}$-phases is 0 before the loop and 1 at the end of the loop body. The join results in $pc(2) = 0$: No hits can be classified. However, context-sensitive analyses can solve this problem. For simple loops, virtual loop peeling [15] is sufficient.
A loop containing $n$ accesses to a cache set needs to be virtually peeled $n$ times. This way, the context-sensitive analysis of the above loop mimics a context-insensitive analysis of the program a b a b for (3..n) { a b }. Then, $pc(2) = 2$ before the loop and thus after the join, too: The hits to a and b get classified. Although this topic deserves closer attention, the full particulars for different kinds of loops cannot be discussed here.

VII. Conclusions

With Theorem 3 and 5 we provide tight bounds on the number of phases that need to happen until all blocks accessed in these phases must be cached in a FIFO cache (until non-accessed blocks must be evicted, respectively). Furthermore, we generalized dynamic phase detection to static phase detection by employing abstract LRU stacks. Both together, static phase detection and the theorems, allow for designing precise FIFO analyses and answering questions left open by [6]. Especially PMUST has much higher precision than prior analyses and is the first analysis that can predict a significant amount of hits in simple loops ($n \leq k$). For $k < n < 2k$, PMAY is also always more precise than prior analyses. For $n \geq 2k$, it generally depends on the input, but it is mostly more precise. PMUST and PMAY actually perform a spectrum of sub-analyses in parallel to resolve dependencies on future accesses. Nevertheless, the analysis information that needs to be maintained can be efficiently encoded: We store sets of sets of memory blocks in abstract LRU stacks and annotate these stacks with the remaining analysis information, e.g. the phase progress and phase counters.

VIII. Acknowledgments

We want to thank Sebastian Hack and the anonymous reviewers for their helpful comments on this paper. The research leading to these results has received funding from the European Community's Seventh Framework Programme FP7/2007-2013 under grant agreement n° 216008 (Predator).

References

[1] R. Wilhelm et al., "The worst-case execution-time problem: overview of methods and survey of tools," Transactions on Embedded Computing Systems, vol. 7, no. 3, pp. 1-53, 2008.
[2] C. Ferdinand and R. Wilhelm, "Efficient and precise cache behavior prediction for real-time systems," Real-Time Systems, vol. 17, no. 2-3, pp. 131-181, 1999.
[3] R. T. White, C. A. Healy, D. B. Whalley, F. Mueller, and M. G. Harmon, "Timing analysis for data caches and set-associative caches," in RTAS. IEEE, 1997, p. 192.
[4] S. Ghosh, M. Martonosi, and S. Malik, "Precise miss analysis for program transformations with caches of arbitrary associativity," in ASPLOS. ACM, 1998, pp. 228-239.
[5] S. Chatterjee, E. Parker, P. J. Hanlon, and A. R. Lebeck, "Exact analysis of the cache behavior of nested loops," in PLDI. ACM, 2001, pp. 286-297.
[6] D. Grund and J. Reineke, "Abstract interpretation of FIFO replacement," in SAS. Springer, 2009, pp. 120-136.
[7] J. Reineke, D. Grund, C. Berg, and R. Wilhelm, "Timing predictability of cache replacement policies," Real-Time Systems, vol. 37, no. 2, pp. 99-122, 2007.
[8] J. Reineke and D. Grund, "Relative competitive analysis of cache replacement policies," in LCTES. ACM, 2008, pp. 51-60.
[9] P. Cousot and R. Cousot, Building the Information Society. Kluwer, 2004, ch. Basic Concepts of Abstract Interpretation.
[10] B. Jacob, S. W. Ng, and D. T. Wang, Memory Systems: Cache, DRAM, Disk. Morgan Kaufmann, 2008.
[11] J. Reineke, "Caches in WCET analysis," Ph.D. dissertation, Saarland University, 2008.
[12] F. Mueller, D. B. Whalley, and M. Harmon, "Predicting instruction cache behavior," in LCTRTS. ACM, 1994.
[13] Y.-T. S. Li, S. Malik, and A. Wolfe, "Cache modeling for real-time software: Beyond direct mapped instruction caches," in RTSS. IEEE, 1996, p. 254.
[14] A. W. Madison and A. P. Batson, "Characteristics of program localities," Comm. of the ACM, vol. 19, no. 5, pp. 285-294, 1976.
[15] F. Martin, M. Alt, R. Wilhelm, and C. Ferdinand, "Analysis of loops," in CC. Springer, 1998, pp. 80-94.