IRISA (INSTITUT DE RECHERCHE EN INFORMATIQUE ET SYSTÈMES ALÉATOIRES)
PUBLICATION INTERNE No 1599
THE NOTION OF VETO NUMBER FOR DISTRIBUTED AGREEMENT PROBLEMS
ROY FRIEDMAN, ACHOUR MOSTEFAOUI, MICHEL RAYNAL
ISSN 1166-8687
IRISA, CAMPUS UNIVERSITAIRE DE BEAULIEU - 35042 RENNES CEDEX - FRANCE
http://www.irisa.fr

The Notion of Veto Number for Distributed Agreement Problems
Roy Friedman, Achour Mostefaoui, Michel Raynal

Theme 1: Networks and systems. Project Adept.
Publication interne no. 1599, January 2004, 11 pages

Abstract: This paper introduces the notion of veto number that can be associated with agreement problems. An agreement problem has veto number $l$ when $l$ is the minimal number of processes that control the allowed decision values, i.e., if each of them changes its mind on the value it proposes, then it forces deciding on a different value. The paper presents and investigates this concept.

Key-words: Agreement problem, Asynchronous distributed system, Consensus, Distributed algorithm, One shot problem, Process crash, Failure detector.

email: {rfriedma,mostefaoui,raynal}@irisa.fr

Centre National de la Recherche Scientifique (UMR 6074), Université de Rennes 1, Insa de Rennes, Institut National de Recherche en Informatique et en Automatique (Rennes research unit)
The notion of veto number in agreement problems (translation of the French abstract)

Abstract: This article defines the notion of veto number that can be associated with each agreement problem. In a sense, it represents the difficulty of the problem. An agreement problem has veto number $l$ if $l$ is the minimum number of processes that control every decision value. If one of these processes changes its mind about the value it proposes, it forces the others to do the same. This article investigates this concept.

Key-words: Agreement problem, asynchronous system, consensus, input vector, process failure.
1 Introduction

Agreement problems are central issues when one is interested in designing fault-tolerant applications in asynchronous distributed systems prone to failures. The best known of these problems is consensus: each process proposes a value, and (at least) the non-faulty processes have to decide a value (termination), such that no two different values are decided (uniform agreement), and a decided value has to be a proposed value (validity).

Many results have been produced on this problem. The most famous of them is the so-called FLP impossibility result [4], which states that consensus cannot be solved in asynchronous distributed systems as soon as even a single process can crash. An important concept that has been introduced to prove this result is the notion of valence that can be associated with a global state: a state is $x$-valent if the set of values that can be decided from it includes $x$ different values. This means that a single value can be decided from a 1-valent state, while no definitive choice has yet been made in an $x$-valent state where $x > 1$.

Another notion of number that has been introduced in the context of agreement problems is the notion of consensus number [9]. This notion allows ranking the power of synchronization primitives (or synchronization objects) in asynchronous shared memory systems prone to process crashes. An object has consensus number $k$ if $k$ is the greatest integer such that this object allows solving consensus among $k$ processes in the presence of up to $k-1$ crashes. It is shown in [9] that (among other objects) read/write objects have consensus number 1, while compare&swap objects have consensus number $+\infty$.

This paper presents and investigates the notion of veto number (denoted $l$) that can be associated with agreement problems. This notion captures the minimal number of processes that control the decision in the sense that, if each of these processes changes its mind on the value it proposes, then the decision can no longer be the same.
The veto number notion is useful to understand and solve agreement problems in asynchronous distributed systems where up to $f$ processes can crash. Several results provided by the veto number notion are presented. Moreover, the paper introduces agreement protocols whose design is based on their veto number. Interestingly, this study shows a borderline separating the cases $f < l$ and $f \geq l$ (where $f$ is the maximum number of processes that can crash).

The paper is made up of six sections. Section 2 presents the computation model and defines the agreement problems we are interested in. Section 3 presents the $l$-veto concept. Then, Section 4 presents results obtained thanks to this concept. Section 5 presents protocols solving $l$-veto problems. Finally, Section 6 provides a few concluding remarks.

2 Computation Model and Definitions

2.1 Asynchronous Distributed Systems with Process Crash Failures

We consider a system consisting of a finite set $\Pi$ of $n$ processes, namely, $\Pi = \{p, q, \ldots\}$. A process can fail by crashing, i.e., by prematurely halting. It behaves correctly (i.e., according to its specification) until it (possibly) crashes. By definition, a correct process is a process that does not crash. A faulty process is a process that is not correct; $f$ denotes the maximum number of processes that can crash ($1 \leq f < n$).

Processes communicate and synchronize by sending and receiving messages through channels. Every pair of processes is connected by a channel. Channels are assumed to be reliable: they do not create, alter or lose messages. There is no assumption about the relative speed of processes or message transfer delays. Let $AS_{n,f}(\emptyset)$ denote such an asynchronous distributed system.

2.2 One-shot Agreement Problems

In a one-shot agreement problem, each process $p$ starts with an individual input value $v_p$. The input values are from a particular value set $V_{in}$. Moreover, let $\bot$ denote a default value (such that $\bot \notin V_{in}$), and let $V_{in,\bot}$ denote the set $V_{in} \cup \{\bot\}$. All the correct processes are required to produce outputs from a value set $V_{out}$. We say that a process decides when it produces an output value.

Let $I = [v_1, \ldots, v_p, \ldots, v_n] \in V_{in}^n$ be a vector whose $p$th entry contains the value proposed by process $p$. Such a vector is called an input vector [11].

Let $B_{fail}$ be a subset of processes, and let $F(I, B_{fail})$ be a mapping from $V_{in}^n$ into a non-empty subset of $V_{out}$. The mapping $F(I, B_{fail})$ associates a set of possible output values with each input vector in runs in which the processes of $B_{fail}$ fail. For simplicity, we denote $F(I) = F(I, \emptyset)$, or in other words, $F(I)$ is the set of possible decision values from $I$ when there are no failures. We also assume that for any $B^1_{fail}$ and $B^2_{fail}$, if $B^1_{fail} \subseteq B^2_{fail}$, then for any vector $I$, we have $F(I, B^1_{fail}) \subseteq F(I, B^2_{fail})$. Essentially, this means that having a certain number of failures cannot prevent a decision value that is allowed with fewer (or no) failures.

$F(I)$ is called the decision value set associated with $I$. If it contains $x$ values, the corresponding input vector $I$ is said to be $x$-valent. For $x = 1$, $I$ is said to be univalent.

Definition A one-shot agreement problem is characterized by a set $V_{in}$, a set $V_{out}$, and a particular mapping $F(I, B_{fail})$ with the following properties:

Termination: Each correct process decides.

Agreement: No two processes decide different values (sometimes called Uniform Agreement).
Validity: In runs in which the processes in $B_{fail}$ fail, the value decided on from the input vector $I$ is a value from the set $F(I, B_{fail})$. In particular, in failure-free runs, the value decided on from the input vector $I$ is a value from the set $F(I)$.

Examples We consider here three examples of well-known one-shot agreement problems. Each is defined by specific values of $V_{in}$, $V_{out}$, and a particular function $F()$.

Consensus:
$V_{in} = V_{out}$ = the set of values that can be proposed.
$\forall I$ (an input vector), $\forall B_{fail}$: $F(I, B_{fail}) = \{x \mid x \text{ appears in } I\}$.

Interactive Consistency:
$V_{in}$ is the set of values that can be proposed, $V_{out} = V_{in,\bot}^n$.
$\forall I, \forall B_{fail}$: $F(I, B_{fail})$ is the set of all vectors $J$ that satisfy the following:
$\forall k$: if $k \notin B_{fail}$ then $J[k] = I[k]$,
$\forall k$: if $k \in B_{fail}$ then $J[k] \in \{I[k], \bot\}$.
In particular, this means that $\forall I$: $F(I) = \{I\}$.

Non-Blocking Atomic Commit:
$V_{in} = \{yes, no\}$, $V_{out} = \{commit, abort\}$.
$F([yes, \ldots, yes]) = \{commit\}$.
$\forall B_{fail} \neq \emptyset$: $F([yes, \ldots, yes], B_{fail}) = \{commit, abort\}$.
$\forall B_{fail}$, $\forall I$ such that $I$ includes at least one $no$: $F(I, B_{fail}) = \{abort\}$.

Thus, in the Consensus problem, there is no distinction between the allowed set of decision values in runs with and without failures. On the other hand, Non-Blocking Atomic Commit and Interactive Consistency allow a different output when there are failures.

Let us observe that not all agreement problems are one-shot. As an example, the membership problem [2] is an agreement problem that is not one-shot: its specification is not limited to a single invocation of a membership primitive, but rather involves the entire execution of the application in which it is used.

3 The Concept of Veto Number

3.1 Irreconcilable Input Vectors

Let $\{I_i\}_{1 \leq i \leq k}$ ($k > 1$) be a set of input vectors, and $\{V_i\}_{1 \leq i \leq k}$ the corresponding set of decision value sets, i.e., $V_i = F(I_i)$ for $1 \leq i \leq k$.

Definition 1 The set $\{I_i\}_{1 \leq i \leq k}$ is said to be made up of irreconcilable input vectors if $\bigcap_{1 \leq i \leq k} V_i = \emptyset$.

Let us note that, when the set of decision values $V_{out}$ is binary, only sets of univalent input vectors can be irreconcilable. The following lemma directly follows from the above definition:

Lemma 1 Let $\{I_i\}$ be a minimal set of irreconcilable input vectors, and let $I_1 \in \{I_i\}$. For any decision value $v_1 \in V_1 = F(I_1)$, there is a vector $I_2$ in $\{I_i\}$ such that $v_1 \notin V_2 = F(I_2)$. (We then say that $I_2$ counters $I_1$ on $v_1$.)

3.2 Veto Number

The intuition that underlies the veto number notion is simple. It is defined for failure-free runs, and concerns the minimal number of processes such that the decided value can no longer be the same when each of these processes changes its mind on the value it proposes. So, the veto number $l$ of a one-shot agreement problem is the size of the smallest set of processes that, in worst case scenarios, control the decision value.
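Definition 1 can be checked mechanically on small instances. The following Python fragment is only a minimal sketch, not part of the original report: the helper names `f_consensus`, `f_nbac` and `irreconcilable` are hypothetical, and the decision maps model the failure-free mapping $F(I)$ of Section 2.2 for Consensus and Non-Blocking Atomic Commit.

```python
def f_consensus(vec):
    """Failure-free decision set F(I) for consensus: any proposed value."""
    return set(vec)


def f_nbac(vec):
    """Failure-free decision set F(I) for non-blocking atomic commit."""
    return {"commit"} if all(v == "yes" for v in vec) else {"abort"}


def irreconcilable(vectors, decision_map):
    """True when the decision sets of the given input vectors share no value."""
    decision_sets = [decision_map(v) for v in vectors]
    return not set.intersection(*decision_sets)


all_yes = ("yes", "yes", "yes")
one_no = ("yes", "no", "yes")

# {commit} and {abort} are disjoint, so the two vectors are irreconcilable.
print(irreconcilable([all_yes, one_no], f_nbac))
# For consensus, two distinct unanimous vectors are irreconcilable ...
print(irreconcilable([("a", "a", "a"), ("b", "b", "b")], f_consensus))
# ... but here the value b can be decided from both vectors.
print(irreconcilable([("a", "a", "b"), ("b", "b", "b")], f_consensus))
```

In the atomic commit case the two vectors differ in a single entry, which matches the intuition that one process alone can veto the commit decision.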
For example, in the non-blocking atomic commit problem, as soon as a single process votes no, the decision is abort whatever the votes of the other processes. Hence, $l = 1$ for this problem. Similarly, the veto number of the interactive consistency problem is 1: if a single process changes its initial value, the decided vector changes accordingly. In contrast, the veto number of the binary Consensus problem is $n$, since in failure-free runs, the only input vectors that enforce specific decision values are those in which all processes propose the same input value.

More formally, to have a veto number, a one-shot agreement problem $P$ needs to have at least one set of irreconcilable input vectors. Given $S_x$, a minimal set of irreconcilable input vectors of a problem $P$, let $l(S_x)$ be the number of distinct entries for which at least two vectors of $S_x$ differ (footnote 1), i.e., the number of entries $k$ such that there are two vectors $I_a$ and $I_b$ of $S_x$ with $I_a[k] \neq I_b[k]$. As an example, let $S_x = \{[a,a,a,a,e,b,b], [a,a,a,a,e,c,c], [a,a,a,f,e,b,c]\}$. We have $l(S_x) = 3$.

Footnote 1: Let us notice that the Hamming distance is defined on pairs of vectors: it measures the number of their entries that differ. Here we consider the whole set of vectors defining $S_x$.
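The count $l(S_x)$ used above is easy to compute column by column. The snippet below is an illustrative sketch only; the function name `differing_entries` is an assumption introduced here, and the data is the example set $S_x$ given in the text.

```python
def differing_entries(vectors):
    """Number of entries k on which at least two vectors of the set differ."""
    # zip(*vectors) iterates over the entries (columns) of the input vectors.
    return sum(1 for column in zip(*vectors) if len(set(column)) > 1)


S_x = [
    ("a", "a", "a", "a", "e", "b", "b"),
    ("a", "a", "a", "a", "e", "c", "c"),
    ("a", "a", "a", "f", "e", "b", "c"),
]

# Entries 4, 6 and 7 are the only ones on which two vectors disagree.
print(differing_entries(S_x))  # 3
```

Unlike a Hamming distance, the count is taken over the whole set at once, so an entry is counted a single time even if several pairs of vectors disagree on it.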
Definition 2 Let $P$ be an agreement problem whose minimal sets of irreconcilable input vectors are $S_x$, $1 \leq x \leq m$. The veto number of $P$ is the integer $l = \min(l(S_1), \ldots, l(S_m))$.

When we consider the previous example, this means that there is a set of 3 processes that control the decision value. Intuitively, this means that no decision can be made without first consulting these processes, or knowing definitely that a failure has occurred.

If a one-shot agreement problem has no set of irreconcilable input vectors, we say that its veto number is $+\infty$ (by definition). We also say that a one-shot agreement problem is an $l$-veto problem if its veto number is $l$.

4 Results Based on the Veto Number Concept

Lemma 2 Let $P$ be a one-shot agreement problem for which there is no set of irreconcilable input vectors (hence, its veto number is $+\infty$). Then $P$ can be solved in $AS_{n,f}(\emptyset)$ with $f < n$.

Proof Since there is no set of irreconcilable input vectors, there is at least one value that appears in the decision sets of all possible input vectors. Therefore, it is always possible to deterministically decide on the smallest such value. $\Box$ Lemma 2

4.1 Results on Failure Detectors

Two classes of failure detectors with eventual accuracy Failure detectors have been formally defined by Chandra and Toueg, who have introduced several classes of failure detectors [1]. A failure detector class is formally defined by two abstract properties, namely a Completeness property and an Accuracy property. In this paper, we are interested in the following properties:

Strong Completeness: Eventually, every process that crashes is permanently suspected by every correct process.

Eventual Strong Accuracy: There is a time after which no correct process is suspected.

Eventual Weak Accuracy: There is a time after which some correct process is never suspected.
Combining the completeness property with each accuracy property provides us with the following two classes of failure detectors [1]:

$\diamond\mathcal{P}$: The class of Eventually Perfect failure detectors. This class contains all the failure detectors that satisfy strong completeness and eventual strong accuracy.

$\diamond\mathcal{S}$: The class of Eventually Strong failure detectors. This class contains all the failure detectors that satisfy strong completeness and eventual weak accuracy.

In the following, $AS_{n,f}(X)$ denotes an asynchronous distributed system made up of $n$ processes communicating through reliable links, where up to $f$ processes may crash, and equipped with a failure detector of the class $X$ ($X$ being $\diamond\mathcal{S}$ or $\diamond\mathcal{P}$).
On a limitation of $\diamond\mathcal{P}$ The following theorem is proved in [5] using a proof that is centered around the concept of $l$-veto.

Theorem 1 [5] $\forall f$, there is no one-shot agreement problem that can be solved in $AS_{n,f}(\diamond\mathcal{P})$ and cannot be solved in $AS_{n,f}(\diamond\mathcal{S})$.

The corollary that follows is an immediate consequence of this theorem.

Corollary 1 $\diamond\mathcal{P}$ cannot be the weakest class of failure detectors that allow solving one-shot agreement problems in asynchronous distributed systems prone to process crash failures.

4.2 A Class of Non-Wait-Free Problems

Wait-free implementation The notion of wait-free implementation has been formalized by Herlihy in [9]. A wait-free implementation of an object solving a problem (e.g., a consensus object) is one that guarantees that any process can complete its operations in a finite number of steps, regardless of the execution speed of the other processes. Hence, in a wait-free computation, no process can be prevented from terminating by undetected process crashes or arbitrary variations in their speed [9]. This means that wait-freedom implicitly considers the case $f = n - 1$.

A class of problems that cannot have a wait-free implementation The following theorem characterizes a class of agreement problems that cannot have a wait-free implementation (and consequently cannot have a sequential specification [9, 10]).

Theorem 2 Let $P$ be a one-shot agreement problem with veto number $l < n$. $P$ has no wait-free implementation.

Proof The theorem follows directly from the definition of the veto number. That is, such problems have distinct input vectors such that no decision can be safely taken by a process until its causal history includes at least $n - l + 1$ processes. In particular, no decision can be safely taken when more than $l$ processes crash. $\Box$ Theorem 2

The next corollary follows from the previous theorem and the fact that interactive consistency and non-blocking atomic commit have veto number 1.
Corollary 2 The interactive consistency problem and the non-blocking atomic commit problem have no wait-free implementation. Consequently, they also cannot have sequential specifications.

We would like to point out that $l$-veto problems with $l < n$ have no wait-free implementation in an inherent and profound manner. That is, for many problems, having a wait-free implementation or not depends on the level of abstraction used for solving them. For example, in asynchronous shared memory systems, as mentioned before, consensus can be implemented in a wait-free fashion using compare&swap objects, but not with read/write objects, and definitely not in a pure message passing model. However, unless processes can guess the input values of each other, an $l$-veto problem cannot have a wait-free implementation regardless of the communication abstraction used or failure detection capabilities. This is because wait-freeness means that a process can always terminate even if it is the only one currently participating in the protocol, whether the other processes are faulty or alive (i.e., wait-freeness implies $(n-1)$-fault tolerance, but not vice-versa).
5 Solving Agreement Problems with Veto Number

This section focuses on solving $l$-veto problems when $l < n$ in asynchronous distributed systems equipped with a consensus black box (footnote 2). Two cases are considered according to the value of $l$ with respect to $f$.

Let $V$ be a vector with no entry equal to $\bot$. The notation $V' \leq V$ means $\forall j \in \{1, \ldots, n\}$: $V'[j] \neq \bot \Rightarrow V'[j] = V[j]$.

5.1 Solving $l$-veto Problems when $f < l < n$

When $f < l < n$ it is relatively simple to reduce an $l$-veto problem to consensus. Such a reduction is described in Figure 1. It is made up of three parts.

The first part (lines 1-4) builds $V_i$, the local view $p_i$ has of the actual input vector $I$.

The second part (lines 5-7) is the core of the reduction protocol. Each process $p_i$ first computes the set $\mathcal{V}_i$ including all the input vectors from which its local view $V_i$ can be obtained (line 5). Then, $p_i$ computes the intersection of the values that can be decided from each of these possible input vectors (line 6). Finally, $p_i$ arbitrarily takes one of these values and keeps it in $w_i$ (line 7).

The last part (lines 8-9) is a consensus invocation where $p_i$ proposes the value $w_i$ it has previously computed.

Function Reduction_1($v_i$)
(1) $V_i \leftarrow [\bot, \ldots, \bot]$;
(2) for $1 \leq j \leq n$ do send value($v_i$) to $p_j$ enddo;
(3) wait until (value($\cdot$) has been received from at least $(n - l + 1)$ processes);
(4) for $1 \leq j \leq n$ do if (value($v_j$) received from $p_j$) then $V_i[j] \leftarrow v_j$ endif enddo;
(5) let $\mathcal{V}_i = \{V \mid V \text{ has no entries equal to } \bot \wedge V_i \leq V\}$;
(6) let $X = \bigcap_{V \in \mathcal{V}_i} F(V)$;
(7) $w_i \leftarrow$ any value from $X$;
(8) output$_i$ $\leftarrow$ Consensus($w_i$);
(9) return (output$_i$)

Figure 1: Reducing $l$-veto Problems to Consensus when $f < l < n$

Theorem 3 Let $P$ be an $l$-veto problem ($l < n$). Let us consider an asynchronous message-passing system where consensus can be solved and such that $f < l$. The protocol described in Figure 1 solves $P$.

Proof As $f < l$, we have $n - f \geq n - l + 1$, from which we conclude that no process can block forever at line 3.
The termination property follows directly from this observation and the fact that consensus can be solved in the system. The agreement property follows directly from consensus agreement.

Footnote 2: Such a black box can be built in asynchronous message-passing systems equipped with a failure detector of the class $\diamond\mathcal{S}$ when $f < n/2$ [1]. When $f < n$, it can be built in asynchronous message-passing systems equipped with a failure detector of the class $\mathcal{P}^f + \diamond\mathcal{S}$ [3, 6].
The validity property follows from the very definition of veto number. As the veto number is $l$, it follows from lines 5-6 that the vectors in $\mathcal{V}_i$ are not irreconcilable. Moreover, due to the very construction of $\mathcal{V}_i$, the actual input vector $I$ is a member of $\mathcal{V}_i$. As the vectors in $\mathcal{V}_i$ are not irreconcilable, it follows that the sets of values that can be decided from each vector of $\mathcal{V}_i$ have a non-empty intersection. Consequently, $X$ is not empty and contains values that can be decided from the actual input vector $I$. Finally, due to the consensus validity, the value that is decided is one of these values, and the validity property follows. $\Box$ Theorem 3

5.2 Solving $l$-veto Problems when $n > f \geq l$

We now consider the case of $l$-veto problems in systems where $f \geq l$. When we consider the protocol described in Figure 1, the new constraint $f \geq l$ creates two new problems we have to solve (these problems are implicitly solved when $l > f$). One is to prevent the permanent blocking that could appear at line 3 of Figure 1 (as now $n - f < n - l + 1$), and the second is the fact that some value has to be decided even when $l$ or more processes crash, i.e., when a set of processes that could change the decision value have crashed. We solve the first of these problems by introducing an appropriate class of failure detectors, and the second by restricting the class of $l$-veto problems.

The Failure Detector Class $?\mathcal{P}^l$ This class extends the class of anonymously perfect failure detectors (denoted $?\mathcal{P}$) that has been introduced in [7] to solve the non-blocking atomic commit problem (the class $?\mathcal{P}$ is actually $?\mathcal{P}^1$). Let each process be equipped with a flag initialized to false. Any failure detector belonging to $?\mathcal{P}^l$ satisfies the following properties:

Anonymous completeness: If at least $l$ crashes occur, eventually the flag of every correct process remains permanently equal to true.

Anonymous accuracy: No flag is set to true, unless at least $l$ processes crash.
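The two properties above can be illustrated with a toy oracle. This is only a sketch under strong assumptions, not an implementation of a failure detector: the class name `AnonymousDetector`, the global integer clock and the `delay` parameter (a finite detection latency) are all hypothetical and serve only to show what the flag is allowed to do.

```python
class AnonymousDetector:
    """Toy model of a ?P^l flag, driven by an assumed global clock."""

    def __init__(self, l, delay=0):
        self.l = l                # number of crashes the flag reports
        self.delay = delay        # assumed finite latency before the flag rises
        self.crash_times = []     # times at which (anonymous) crashes occurred

    def crash(self, time):
        """Record that some process crashed at the given time."""
        self.crash_times.append(time)

    def flag(self, now):
        """Value of the flag at time `now`."""
        times = sorted(t for t in self.crash_times if t <= now)
        if len(times) < self.l:
            return False  # anonymous accuracy: fewer than l crashes so far
        # Anonymous completeness: once the l-th crash is `delay` old,
        # the flag is true and stays true forever.
        return now >= times[self.l - 1] + self.delay


detector = AnonymousDetector(l=2, delay=3)
detector.crash(1)
print(detector.flag(3))    # one crash only, the flag stays false
detector.crash(4)
print(detector.flag(5))    # two crashes, but still within the latency
print(detector.flag(7))    # raised, and remains raised afterwards
```

Note that the flag carries no identity: it reveals that at least $l$ processes have crashed, never which ones.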
A sub-class of $l$-veto problems The sub-class of the $l$-veto problems we consider in the following includes the $l$-veto problems for which a predetermined value can be decided in the runs where $l$ or more processes crash. That value is not necessarily related to the input vector. An example of such a problem is non-blocking atomic commit. This is a 1-veto problem where, in the presence of one (or more) crash, the predetermined value abort can be decided even if all processes have proposed yes.

Let predet_val be the set of these predetermined values. In the non-blocking atomic commit problem, this set comprises a single value (namely, abort). In the general case, this set can contain several values.

A $?\mathcal{P}^l$-Based Protocol The protocol described in Figure 2 enriches the protocol of Figure 1 in order to solve the $l$-veto problems of interest. A process $p_i$ first sends its input value $v_i$ to all (line 2) and then waits until it has received values from at least $(n - l + 1)$ processes or is informed by $?\mathcal{P}^l$ (through the boolean flag$_i$) that there are at least $l$ crashes (line 3). Then, there are two cases, according to the number $x$ of processes from which $p_i$ has received values. In each case, $p_i$ sets a local variable $w_i$ to a value that could be decided in this run, and then, as before, participates in a consensus where it proposes $w_i$.
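The local computation that yields $w_i$ in the protocol just described can be sketched as follows. This is an illustrative sketch only, not the authors' code: message exchange, the failure detector and the consensus invocation are left out, the names `compatible_vectors` and `local_proposal` are hypothetical, `None` stands for $\bot$, and "any value" is arbitrarily resolved by taking the minimum.

```python
from itertools import product

BOTTOM = None  # stands for the default value (bottom) of a local view


def f_nbac(vec):
    """Failure-free decision set F(I) for non-blocking atomic commit."""
    return {"commit"} if all(v == "yes" for v in vec) else {"abort"}


def compatible_vectors(view, v_in):
    """All full input vectors V (no BOTTOM entry) such that view <= V."""
    choices = [[x] if x is not BOTTOM else sorted(v_in) for x in view]
    return [tuple(c) for c in product(*choices)]


def local_proposal(view, n, l, v_in, decision_map, predet_val):
    """Value w_i that a process proposes to consensus, given its view."""
    received = sum(1 for x in view if x is not BOTTOM)
    if received >= n - l + 1:
        # Intersect the decision sets of every input vector that is
        # compatible with the view; with veto number l this set is
        # non-empty according to the proof of Theorem 3.
        sets = [decision_map(V) for V in compatible_vectors(view, v_in)]
        return min(set.intersection(*sets))
    # Fewer values received: the flag reported at least l crashes.
    return min(predet_val)


votes = {"yes", "no"}
print(local_proposal(("yes", "yes", "yes"), 3, 1, votes, f_nbac, {"abort"}))
print(local_proposal(("yes", BOTTOM, "yes"), 3, 1, votes, f_nbac, {"abort"}))
```

With $l = 1$ and $n = 3$, a process needs all three votes before it may propose commit; a missing vote leads it to the predetermined value abort. The enumeration of compatible vectors is exponential in the number of missing entries, which is acceptable for an illustration but not for large $n$.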
Function Reduction_2($v_i$)
(1) $V_i \leftarrow [\bot, \ldots, \bot]$;
(2) for $1 \leq j \leq n$ do send value($v_i$) to $p_j$ enddo;
(3) wait until (value($\cdot$) has been received from at least $(n - l + 1)$ processes $\vee$ flag$_i$);
(4) for $1 \leq j \leq n$ do if (value($v_j$) received from $p_j$) then $V_i[j] \leftarrow v_j$ endif enddo;
(5) if (values have been received from at least $(n - l + 1)$ processes)
(6) then let $\mathcal{V}_i = \{V \mid V \text{ has no entries equal to } \bot \wedge V_i \leq V\}$;
(7) let $X = \bigcap_{V \in \mathcal{V}_i} F(V)$;
(8) $w_i \leftarrow$ any value from $X$
(9) else $w_i \leftarrow$ any value taken from predet_val
(10) endif;
(11) output$_i$ $\leftarrow$ Consensus($w_i$);
(12) return (output$_i$)

Figure 2: A $?\mathcal{P}^l$-Based Reduction of $l$-veto Problems to Consensus when $f \geq l$

$x \geq n - l + 1$. This case is the same as the previous one: despite the fact that $f \geq l$, $p_i$ has enough proposed values in its view.

$x \leq n - l$. In that case, $p_i$ sets $w_i$ to a value taken from the set of predetermined values (e.g., the value abort in the case of the non-blocking atomic commit problem).

The proof that this protocol is correct is left to the reader. It is a straightforward extension of the proof of the previous reduction protocol.

Remark The protocol described in Figure 2 can be seen as a generalization of the non-blocking atomic commit protocol described in [7] that reduces atomic commit to consensus with the help of $?\mathcal{P}$ (which corresponds to $?\mathcal{P}^1$). Let us also remark that, unlike the non-blocking atomic commit problem, the interactive consistency problem does not belong to the class of $l$-veto problems that the protocol of Figure 2 can reduce to consensus (footnote 3).

Footnote 3: Let us also observe that, unlike the non-blocking atomic commit problem, the interactive consistency problem is equivalent to the construction of a perfect failure detector [8].

6 Concluding Remark

This paper has presented the notion of veto number that can be associated with agreement problems. An interesting problem that remains open is the following one: Is $?\mathcal{P}^l$ the weakest failure detector to reduce $l$-veto problems to consensus when $f \geq l$? Other interesting questions concern the use of the veto number to rank the difficulty of agreement problems.

References

[1] Chandra T.D. and Toueg S., Unreliable Failure Detectors for Reliable Distributed Systems. Journal of the ACM, 43(2):225-267, 1996.
[2] Chockler G., Keidar I. and Vitenberg R., Group Communication Specifications: a Comprehensive Study. ACM Computing Surveys, 33(4):427-469, 2001.
[3] Delporte-Gallet C., Fauconnier H. and Guerraoui R., Failure Detection Lower Bounds on Registers and Consensus. Proc. 16th Int. Symposium on Distributed Computing (DISC 02), Springer-Verlag LNCS #2508, pp. 237-251, 2002.
[4] Fischer M.J., Lynch N. and Paterson M.S., Impossibility of Distributed Consensus with One Faulty Process. Journal of the ACM, 32(2):374-382, 1985.
[5] Friedman R., Mostéfaoui A. and Raynal M., On the Respective Power of $\diamond\mathcal{P}$ and $\diamond\mathcal{S}$ to Solve One-Shot Agreement Problems. Research Report #1547, IRISA, Université de Rennes 1 (France), 2003, 20 pages. http://www.irisa.fr/bibli/publi/pi/2003/1547/1547.html.
[6] Friedman R., Mostéfaoui A. and Raynal M., A Weakest Failure Detector-Based Asynchronous Consensus Protocol for f < n. Research Report #1557, IRISA, Université de Rennes 1 (France), 2003, 11 pages. http://www.irisa.fr/bibli/publi/pi/2003/1557/1557.html. To appear in Information Processing Letters.
[7] Guerraoui R., Non-Blocking Atomic Commit in Asynchronous Distributed Systems with Failure Detectors. Distributed Computing, 15:17-25, 2002.
[8] Hélary J.-M., Hurfin M., Mostéfaoui A., Raynal M. and Tronel F., Computing Global Functions in Asynchronous Distributed Systems with Process Crashes. IEEE Transactions on Parallel and Distributed Systems, 11(9):897-909, 2000.
[9] Herlihy M.P., Wait-Free Synchronization. ACM Transactions on Programming Languages and Systems, 13(1):124-149, 1991.
[10] Herlihy M.P. and Wing J.M., Linearizability: a Correctness Condition for Concurrent Objects. ACM Transactions on Programming Languages and Systems, 12(3):463-492, 1990.
[11] Mostéfaoui A., Rajsbaum S. and Raynal M., Conditions on Input Vectors for Consensus Solvability in Asynchronous Distributed Systems. Journal of the ACM, 50(6):922-954, 2003.