Ordinal aggregation and ranking

Dieter Denneberg, Universität Bremen
denneberg@math.uni-bremen.de
October 2011

Abstract

In applications the scales of functions are often not naturally embedded in the real numbers. One only has a natural linear ordering on the scale, e.g. for judgements. Imprecise answers like "don't know" should be allowed, so the functions to be integrated are not single valued but interval valued. The purely ordinal model for integration in [3] will be extended to the latter case, but only for finite scales. Some practical applications are discussed.

1 Introduction

Our paper [3] on ordinal integration lacks elaborated examples. Answering questions from practice, we noticed that interval valued functions should be aggregated, too. This generalization of the model in [3] is done here. The essential tools are already available in that paper, and for proofs we often refer to it.

Given a linearly ordered scale $L$, the intervals of $L$ form a lattice $\mathrm{Intv}(L)$, but it is not linearly ordered, so there exist intervals which are not comparable. If one always needs comparability, there is a solution at least if $L$ is finite, which we will suppose for the main part of the paper. In this case $\mathrm{Intv}(L)$ is graded and one can use the ranks of the intervals, which are natural numbers. This new scale of ranks extends the original scale $L$ by inserting one additional grade in between two succeeding grades of $L$. This new view is the content of Section 2.

For easy access the model is not presented in the full generality of [3]. We restrict to common probability measures $\mu$ on a set $\Omega$ and define the quantile correspondence in Section 3. It is one of the building blocks of our ordinal aggregation functionals, defined in Section 4. The other one is the commensurability function, relating the range $[0, 1] \subset \mathbb{R}$ of the measures with the scale $L$.
But be cautious in applications: the cardinal structure of the real unit interval $[0, 1]$ may superpose the purely ordinal view by means of the monotone commensurability function.

Aggregation functionals are needed, for example, in competitions. We address this application in Section 5.

Example 1.1 A jury has to rank competitors. The jury consists of a fixed number of judges $\omega \in \Omega$, each having equal weight $\mu(\omega) = \frac{1}{|\Omega|}$. The judgements are
expressed on a finite linear scale $L$. A competitor is judged by judge $\omega$ with the value $f(\omega) \in L$. So the jury produces for each competitor a function $f : \Omega \to L$. An aggregation functional for these functions $f$ is needed in order to get a ranking for the competitors. [1], [2] discuss desirable properties of the functional. In Section 5 we adapt this application to our model.

Another type of application uses a linear scale with neutral point $O$, with good values above and bad values below. Those scales are called bipolar scales in [3] and are described in Section 6. Aggregation with bipolar scales is the aim of the still incomplete Section 7. As an application the Net Promoter Score¹, a management tool, is explained.

2 The interval extension of an ordinal scale

We suppose that the scale $L$ is a linearly ordered set which is a complete lattice with inf and sup as lattice operations $\wedge$ and $\vee$. $L$ has a minimal and a maximal element, denoted $O$ and $I$, respectively. Denote with $\le$ the ordering and with $\mathrm{Intv}(L)$ the set of closed intervals

$$[a, b] := \{x \in L \mid a \le x \le b\}, \quad a, b \in L, \ a \le b.$$

We need an ordering for intervals; the appropriate one turned out to be that of Topkis [5]. Defining for $[a_1, b_1], [a_2, b_2] \in \mathrm{Intv}(L)$

$$[a_1, b_1] \preceq [a_2, b_2] \quad \text{iff} \quad a_1 \wedge a_2 = a_1 \ \text{and} \ b_1 \vee b_2 = b_2,$$

$\mathrm{Intv}(L)$ becomes a partially ordered set. Notationally, we identify one point intervals $[a, a]$ with that point $a$. Thus we perceive $L$ as a subset of $\mathrm{Intv}(L)$. In fact $(L, \le)$ is a subposet of the poset $(\mathrm{Intv}(L), \preceq)$, i.e. $\le$ and $\preceq$ coincide on $L$. Also $(\mathrm{Intv}(L), \wedge, \vee)$ is a lattice with lattice operations

$$[a_1, b_1] \wedge [a_2, b_2] := [a_1 \wedge a_2, \ b_1 \wedge b_2], \qquad [a_1, b_1] \vee [a_2, b_2] := [a_1 \vee a_2, \ b_1 \vee b_2], \tag{1}$$

which contains $(L, \wedge, \vee)$ as a sublattice. $I$ and $O$ are also top and bottom of the interval lattice $(\mathrm{Intv}(L), \wedge, \vee)$. Be aware that there is also a lattice structure on $\mathrm{Intv}(L) \cup \{\emptyset\}$ induced by set inclusion as ordering. But our ordering is derived naturally from the ordering on $L$.
Regarding an interval $[a, b]$ as a word $ab$, we can introduce the lexicographic order on $\mathrm{Intv}(L)$. It will be denoted $\le_{\mathrm{lex}}$ and extends our order $\preceq$ to a linear order. In this order the bottom of the intervals counts first. With the opposite words $ba$ one gets another lexicographic order $\le^{\mathrm{lex}}$, where the top of the intervals counts first. The equalizer of both lexicographic orders is $\preceq$, i.e. $I_1 \preceq I_2$ iff $I_1 \le_{\mathrm{lex}} I_2$ and $I_1 \le^{\mathrm{lex}} I_2$.

¹ Thomas Wyler, Allianz Zürich, directed my attention to this application of [3].
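The interval order, the lattice operations (1) and the two lexicographic orders can be checked on a small finite chain. The following Python lines are only a sketch under assumptions that are not part of the paper: scale values $x_0 < \dots < x_n$ are encoded by their indices, an interval $[x_i, x_j]$ by the pair `(i, j)`, and all function names are hypothetical.

```python
from itertools import product

def intervals(n):
    """All closed intervals [x_i, x_j], i <= j, of the chain x_0 < ... < x_n (as index pairs)."""
    return [(i, j) for i in range(n + 1) for j in range(i, n + 1)]

def leq(a, b):
    """Interval order: [a1, b1] <= [a2, b2] iff a1 <= a2 and b1 <= b2."""
    return a[0] <= b[0] and a[1] <= b[1]

def meet(a, b):
    """Componentwise meet as in (1)."""
    return (min(a[0], b[0]), min(a[1], b[1]))

def join(a, b):
    """Componentwise join as in (1)."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def leq_bottom_first(a, b):
    """Lexicographic order on the words ab; Python compares tuples lexicographically."""
    return a <= b

def leq_top_first(a, b):
    """Lexicographic order on the opposite words ba."""
    return a[::-1] <= b[::-1]

def equalizer_holds(n):
    """Check that the interval order is the equalizer of both lexicographic orders."""
    ivs = intervals(n)
    return all(
        leq(a, b) == (leq_bottom_first(a, b) and leq_top_first(a, b))
        for a, b in product(ivs, repeat=2)
    )
```

For instance, `(1, 1)` and `(0, 3)` are incomparable under `leq`, while both lexicographic orders do compare them, in opposite directions.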
For the rest of this section we suppose that $L$ is finite, say $L = \{x_0, x_1, \dots, x_n\}$ and $O = x_0 < x_1 < \dots < x_n = I$. Then $\mathrm{Intv}(L)$ is graded of rank $2n$. This is seen inductively. We show the Hasse diagrams for $n = 1, 2, 3$.

[Figure: Hasse diagrams of $\mathrm{Intv}(L)$ for $n = 1, 2, 3$.]

The rank function of $\mathrm{Intv}(L)$ will be denoted $\varrho : \mathrm{Intv}(L) \to \{0, 1, \dots, 2n\} \subset \mathbb{N}_0$. For example $\varrho([x_2, x_2]) = \varrho([x_1, x_3]) = 4$. It is increasing, even more

$$[a, b] \prec [c, d] \ \Rightarrow \ \varrho([a, b]) < \varrho([c, d]).$$

The original scale $L$ is the vertical line on the left hand side of the Hasse diagram. In the grading of $\mathrm{Intv}(L)$ the subset $L$ assumes the even ranks only, $\varrho(x_k) = 2k$. Two distinct intervals of the same rank are incomparable, but there are also incomparable intervals of different ranks, such as $[x_1, x_1]$ and $[x_0, x_3]$. The rank function defines a preorder on $\mathrm{Intv}(L)$, i.e. a reflexive and transitive binary relation. Antisymmetry fails for $n > 1$.

Appealing for some applications is the following fact: by composing an interval valued functional $F$ with the rank function we get $\varrho \circ F$, which is a functional with linearly ordered range. But be cautious in selecting the scale $L$ for an application. If one enlarges $L$ to a linear scale $L'$ by inserting unused scale values, the preordering of intervals by the ranks may change for incomparable intervals.

Example 2.1 Denote with $L_n$ the linear scale of rank $n$ and define $\lambda : L_2 \to L_3$ by $x_0 \mapsto y_0$, $x_1 \mapsto y_2$ and $x_2 \mapsto y_3$. Extend $\lambda$ to $\Lambda : \mathrm{Intv}(L_2) \to \mathrm{Intv}(L_3)$ by $[a, b] \mapsto [\lambda(a), \lambda(b)]$. Now $\varrho([x_1, x_1]) = \varrho([x_0, x_2]) = 2$, and for the images of the intervals we get² $\varrho(\Lambda([x_1, x_1])) = 4$ and $\varrho(\Lambda([x_0, x_2])) = 3$. But defining $\lambda$ by $x_0 \mapsto y_0$, $x_1 \mapsto y_1$ and $x_2 \mapsto y_3$, the rank of $\Lambda([x_1, x_1])$ is below the rank of $\Lambda([x_0, x_2])$.

In contrast to the ranks, the ordering of the intervals remains unchanged under extension of the scale $L$.
² We use the same symbol $\varrho$ for the rank functions of different graded lattices.
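The rank computations of Example 2.1 can be replayed in a few lines of Python. This is a sketch, not part of the paper: it assumes that the rank of $[x_i, x_j]$ equals $i + j$, which agrees with all rank values stated above but is not spelled out in the text, and it encodes scale values by their indices. The names `rank`, `extend`, `Lam_a`, `Lam_b` are hypothetical.

```python
def rank(interval):
    """Rank of [x_i, x_j] in Intv(L); assumed here to be i + j."""
    i, j = interval
    return i + j

def extend(lam):
    """Induced map on intervals, [a, b] -> [lam(a), lam(b)], for lam given as a dict on indices."""
    return lambda iv: (lam[iv[0]], lam[iv[1]])

point, whole = (1, 1), (0, 2)          # [x_1, x_1] and [x_0, x_2] in Intv(L_2)

Lam_a = extend({0: 0, 1: 2, 2: 3})     # first embedding L_2 -> L_3 of Example 2.1
Lam_b = extend({0: 0, 1: 1, 2: 3})     # second embedding of Example 2.1

ranks_before = (rank(point), rank(whole))
ranks_a = (rank(Lam_a(point)), rank(Lam_a(whole)))
ranks_b = (rank(Lam_b(point)), rank(Lam_b(whole)))
```

Under these assumptions the two intervals tie before the embedding, the one point interval ranks higher under the first embedding and lower under the second, which is the effect described in Example 2.1.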
Proposition 2.1 Let $\lambda : L \to L'$ be an increasing mapping between totally ordered scales, then the induced mapping $\Lambda : \mathrm{Intv}(L) \to \mathrm{Intv}(L')$, $[a, b] \mapsto [\lambda(a), \lambda(b)]$ is increasing, too, and a lattice morphism. Suppose furthermore that $\lambda$ is injective, then $\Lambda$ is injective, too, and for $[a, b], [c, d] \in \mathrm{Intv}(L)$

$$[a, b] \preceq [c, d] \ \Leftrightarrow \ \Lambda([a, b]) \preceq \Lambda([c, d]).$$

Proof Since $\lambda$ is an increasing mapping between linearly ordered sets, it is a lattice morphism. Then $\Lambda$ is a lattice morphism by definition (1) of the lattice operations.

Question: extensions of $\preceq$ to a linear ordering of $\mathrm{Intv}(L)$ by an injection $L \to L'$ into a sufficiently large $L'$.

3 The quantile correspondence

For simplicity we suppose that the scale $L$ is finite and that integration or aggregation is done w.r.t. a probability measure $\mu : 2^\Omega \to [0, 1] \subset \mathbb{R}$ on a finite set $\Omega$.

Example 3.1 In the applications $\Omega$ may be the set of judges or voters in a competition or the set of persons being inquired. In these cases the normalized counting measure $\mu(A) := \frac{|A|}{|\Omega|}$, $A \in 2^\Omega$, i.e. the uniform distribution on $\Omega$, will do the job.

Our aim is to define with a given $\mu$ an ordinal mean value or ordinal average for interval valued functions $f : \Omega \to \mathrm{Intv}(L)$ on $\Omega$. Those functionals are often called (ordinal) aggregation functionals.

Example 3.2 In judgement applications the answer "don't know" of an interrogated person $\omega$ should get the value $f(\omega) = L$, the whole scale. Since $L \succeq x$ iff $x = O$, the weight of this answer would be attached to the bottom of the scale, as we will see next.

The distribution function $G_{\mu,f} : L \to [0, 1]$ of $f$ is defined as

$$G_{\mu,f}(x) := \mu(\{\omega \mid f(\omega) \succeq x\}), \quad x \in L.$$

Condition $f(\omega) \succeq x$ amounts to $f(\omega) \subset [x, I]$, whence $G_{\mu,f}$ is a decreasing function. In general it is not surjective, even if the range $[0, 1]$ of $G_{\mu,f}$ is restricted to
$\mathrm{image}(\mu) = \{\mu(A) \mid A \subset \Omega\}$. The inverse correspondence of $G_{\mu,f}$ has as domain only the image of $G_{\mu,f}$. Setting

$$Q_{\mu,f} : [0, 1] \to \mathrm{Intv}(L), \quad p \mapsto \begin{cases} \{x \mid G_{\mu,f}(x) = p\} & \text{if } p \in \mathrm{image}(G_{\mu,f}), \\ \bigvee \{x \mid G_{\mu,f}(x) > p\} & \text{else,} \end{cases}$$

we extend the domain and call $Q_{\mu,f}$ the quantile correspondence of $f$ w.r.t. $\mu$. Then $Q_{\mu,f}(q)$ is the $(1 - q)$-quantile³ of $f$. Especially, $Q_{\mu,f}(\frac{1}{2})$ is the median of $f$. The quantile correspondence is a decreasing mapping like $G_{\mu,f}$. For fixed $p$ the functional $f \mapsto Q_{\mu,f}(p)$ is already an aggregation functional (Example 4.3). A related one is $f \mapsto \mathrm{grade}_k(f)$ in the following example.

Example 3.3 Let $|\Omega| = m$ and $f : \Omega \to L$ a single-valued function, then the values $f(\omega)$, $\omega \in \Omega$, form a chain $x_1 \le x_2 \le \dots \le x_m$ with repetitions. $\mathrm{grade}_k(f) := x_k \in L$ is called the $k$-th order value or $k$-th grade of $f$, $k = 1, \dots, m$. So far the grades are defined without reference to a measure. But selecting the uniform distribution $\mu$ on $\Omega$, it is obvious that the $m$ grades of $f$ determine the distribution function $G_{\mu,f}$ uniquely and vice versa. Furthermore the order values are quantiles w.r.t. $\mu$. More precisely, the $(m + 1 - k)$-th order value is

$$\mathrm{grade}_{m+1-k}(f) = Q_{\mu,f}\left(\frac{k - \frac{1}{2}}{m}\right) \in L, \quad k = 1, \dots, m$$

($\frac{1}{2}$ could be replaced by any real $a$ with $0 < a < 1$). If $m$ is odd, the median of $f$ is the $\frac{m+1}{2}$-th order value $\mathrm{grade}_{\frac{m+1}{2}}(f)$. If $m$ is even, then the median equals $Q_{\mu,f}(\frac{1}{2}) = [\mathrm{grade}_{\frac{m}{2}}(f), \mathrm{grade}_{\frac{m}{2}+1}(f)]$, which may be a singleton or a proper interval.

A similar procedure works for interval valued functions.

Example 3.4 Let $L$ be of rank $n$, $|\Omega| = m$ and $f : \Omega \to \mathrm{Intv}(L)$, then the ranks $\varrho \circ f(\omega)$ form a chain $r_1 \le r_2 \le \dots \le r_m$ with repetitions in $\{0, 1, \dots, 2n\} \subset \mathbb{N}_0$. $\mathrm{rank}_k(f) := r_k$ is called the $k$-th rank of $f$, $k = 1, \dots, m$. If $f$ is single valued then $\mathrm{rank}_k(f) = \varrho(\mathrm{grade}_k(f)) = \mathrm{grade}_k(\varrho \circ f)$. Applying Example 3.3 we see that the $k$-th rank of $f$ is a quantile of $\varrho \circ f$.

4 Aggregation functionals

Select an appropriate increasing function $l : [0, 1] \to L$.
³ If we had applied the increasing distribution function, as is common in probability theory, we would have gotten the $q$-quantile. But if one wants to use non-additive monotone measures, one is forced to apply the decreasing distribution function as we do in [3] and in the present paper.

It relates the scale $[0, 1]$ of the measure $\mu$ to the scale $\mathrm{Intv}(L)$ of the interval valued functions $f : \Omega \to$
$\mathrm{Intv}(L)$, hence we call it the commensurability function. The interval-valued Fan-Sugeno functional (or integral) w.r.t. $l$ is defined as

$$S_{\mu,l}(f) := \bigvee_{p \in [0,1]} l(p) \wedge Q_{\mu,f}(p) \quad \text{for } f : \Omega \to \mathrm{Intv}(L). \tag{2}$$

Example 4.1 Let $L = [0, 1] \subset \mathbb{R}$, $l = \mathrm{id}_{[0,1]}$ and $f : \Omega \to [0, 1]$. Then the top $\bigvee_{x \in [0,1]} x \wedge G_{\mu,f}(x)$ of the interval $S_{\mu,\mathrm{id}}(f) = \bigvee_{x \in [0,1]} x \wedge Q_{\mu,f}(x)$ is the Sugeno integral of $f$ w.r.t. $\mu$ ([4]).

Example 4.2 Let $l \equiv x_0$ be constant, then the Fan-Sugeno functional is essentially constant as well, $S_{\mu,l}(f) = x_0$ except for the $f$ with $G_{\mu,f}(x_0) = 0$ or $1$, where $S_{\mu,l}(f)$ could be a proper interval with top $x_0$.

Example 4.3 Let $l = I_{[q,1]}$ be the indicator function of the interval $[q, 1] \subset [0, 1]$, then the Fan-Sugeno functional is the $(1 - q)$-quantile, $S_{\mu,l}(f) = Q_{\mu,f}(q)$.

Here are the basic properties of Fan-Sugeno functionals.

Proposition 4.1 Let $L$ be a complete linear lattice. Let $\mu, \nu : 2^\Omega \to [0, 1]$ be probability measures on a set $\Omega$ and $k, l : [0, 1] \to L$ commensurability functions. The Fan-Sugeno functional has the following properties, where $f, g : \Omega \to \mathrm{Intv}(L)$, $a \in L$:
(i) $l(\mu(A)) = \bigvee_{x \in S_{\mu,l}(I_A)} x$ for $A \subset \Omega$; in particular, $\mu$ can be reconstructed from $S_{\mu,l}$ and $l$, provided $l$ is injective;
(ii) $f \preceq g$ implies $S_{\mu,l}(f) \preceq S_{\mu,l}(g)$;
(iii) if $l(0) = O$, then $S_{\mu,l}(a \wedge f) = a \wedge S_{\mu,l}(f)$;
(iv) $S_{\mu,l}(f \vee g) \succeq S_{\mu,l}(f) \vee S_{\mu,l}(g)$, and equality holds if $\mu$ is an upper chain measure;
(v) comonotonic maxitivity: if $f, g$ are comonotonic, then $S_{\mu,l}(f \vee g) = S_{\mu,l}(f) \vee S_{\mu,l}(g)$;
(vi) $\nu \le \mu$ and $k \le l$ imply $S_{\nu,k}(f) \preceq S_{\mu,l}(f)$.

Proof [3] Proposition 7.1 supposes single valued functions $f : \Omega \to L$, but it holds for interval valued functions, too.

Property (ii) says that the Fan-Sugeno functional is an order morphism, i.e. an increasing mapping. Here the order on $\mathrm{Intv}(L)^\Omega$ is the pointwise ordering. A functional $S : \mathrm{Intv}(L)^\Omega \to \mathrm{Intv}(L)$ is called strictly increasing if it is increasing and for functions $f, g : \Omega \to \mathrm{Intv}(L)$

$$\mu(\{\omega \mid f(\omega) \prec g(\omega)\}) = 1 \ \Rightarrow \ S(f) \prec S(g).$$
This condition does not hold if $l$ is constant (Example 4.2), but it holds if $l$ is essentially injective.
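For the cardinal special case of Example 4.1, the top of the interval, i.e. the Sugeno integral $\bigvee_x x \wedge G_{\mu,f}(x)$, is easy to compute on a finite $\Omega$. The following Python lines are a minimal sketch under assumptions not made in the paper: $\mu$ is the uniform distribution, $f$ is given as a list of exact fractions in $[0, 1]$, and the supremum is taken over the values of $f$ only (which suffices because $G_{\mu,f}$ is a decreasing step function that jumps only at values of $f$). The function names are hypothetical.

```python
from fractions import Fraction

def decreasing_distribution(f, x):
    """G(x) = mu({omega : f(omega) >= x}) for the uniform distribution mu."""
    return Fraction(sum(1 for v in f if v >= x), len(f))

def sugeno_integral(f):
    """sup_x min(x, G(x)); on a finite Omega the sup is attained at a value of f."""
    return max(min(v, decreasing_distribution(f, v)) for v in f)

# Three judges with values 1/5, 1/2 and 9/10 on the scale [0, 1].
example = [Fraction(1, 5), Fraction(1, 2), Fraction(9, 10)]
result = sugeno_integral(example)
```

Only comparisons and the lattice operations min and max enter the computation, but the identity commensurability function compares scale values directly with probabilities, which is the cardinal influence the Introduction warns about.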
Proposition 4.2 Suppose, in addition to the assumptions of Proposition 4.1, that $|\Omega| \ge 2$, $|L| \ge 3$ and that the Fan-Sugeno functional $S_{\mu,l}$ is strictly increasing. Then there exists $q \in [0, 1]$ such that $S_{\mu,l}(f) = Q_{\mu,f}(q)$.

Proof Let us suppose that $S_{\mu,l}$ is not a quantile, then, according to Example 4.3, there exists a $p_0 \in \{\mu(A) \mid A \subset \Omega\}$, say $p_0 = \mu(A_0)$, ??? such that $x_0 := l(p_0) \ne O, I$. We construct functions $f, g : \Omega \to L$ with $f(\omega) < g(\omega)$ for all $\omega$ and $S_{\mu,l}(f) = S_{\mu,l}(g)$, a contradiction. There exist $x_{-1}, x_1 \in L$ such that $x_1$ is the successor of $x_0$ and $x_0$ the successor of $x_{-1}$. Set

$$f(\omega) := \begin{cases} x_0 & \text{if } \omega \in A_0 \\ x_{-1} & \text{else,} \end{cases} \qquad g(\omega) := \begin{cases} x_1 & \text{if } \omega \in A_0 \\ x_0 & \text{else.} \end{cases}$$

Then $f(\omega) < g(\omega)$ for all $\omega$ and $S_{\mu,l}(f) = x_0 = S_{\mu,l}(g)$.

A useful property of the Fan-Sugeno functional is

Lemma 4.3 Let $\Omega$ be finite, $f : \Omega \to L$ a function and $\omega_1 \in \Omega$ a point with positive weight $\mu(\omega_1) > 0$. Then $f(\omega_1)$ and $S_{\mu,l}(f)$ are comparable in $\mathrm{Intv}(L)$.

Proof Let $S_{\mu,l}(f) = [a, b]$, then $f(\omega_1)$ and $S_{\mu,l}(f)$ are incomparable iff $a < f(\omega_1) < b$. Suppose this holds. There is an open interval $I \subset [0, 1]$ of length $\mu(\omega_1) > 0$ such that $Q_{\mu,f}(p) = f(\omega_1)$ for $p \in I$. With $p_1 \in I$ we distinguish two cases. If $l(p_1) \ge f(\omega_1)$ then (2) implies $S_{\mu,l}(f) \succeq Q_{\mu,f}(p_1) \wedge l(p_1) = f(\omega_1) \wedge l(p_1) = f(\omega_1)$, contradicting incomparability. In the other case $l(p_1) < f(\omega_1)$ we get from monotonicity of $l$ and $Q_{\mu,f}$ that $Q_{\mu,f}(p) \wedge l(p) \preceq f(\omega_1)$ for all $p \in [0, 1]$. Then (2) implies $S_{\mu,l}(f) \preceq f(\omega_1)$, a contradiction.

5 Application to grading competitors

We resume the application of ranking competitors in Example 1.1. So we suppose that $\Omega$ is finite and $\mu$ the uniform distribution on $\Omega$. In this section we restrict to single valued functions $f : \Omega \to L$. The following definitions, adapted from Balinski and Laraki [1]⁴, are desirable properties of aggregation functionals for ranking purposes.

A functional $S : L^\Omega \to \mathrm{Intv}(L)$ is called a social-grading functional if $S$ is
(i) anonymous, i.e. $S(f \circ \pi) = S(f)$ for any permutation $\pi$ of $\Omega$.
(ii) unanimous, i.e.
$S$ applied to a constant function returns this constant, $S(I) = I$ for any $I \in \mathrm{Intv}(L)$.
(iii) strictly increasing.

⁴ There only functionals $S : L^\Omega \to L$ are considered.
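The $k$-th grades of Example 3.3 are the standard examples of such functionals. The Python sketch below computes them for the uniform distribution and compares them with the quantile formula of Example 3.3. It rests on assumptions that are not in the paper: scale values are encoded by integer indices, and the quantile is evaluated only for $p$ outside the image of $G_{\mu,f}$, where it is read as $\bigvee\{x \mid G_{\mu,f}(x) > p\}$. All names are hypothetical.

```python
from fractions import Fraction

def grade(f, k):
    """k-th order value (k = 1..m) of the judgements f, a list of scale indices."""
    return sorted(f)[k - 1]

def G(f, x):
    """Decreasing distribution function of f for the uniform distribution."""
    return Fraction(sum(1 for v in f if v >= x), len(f))

def quantile_off_image(f, p, scale):
    """sup{x : G(x) > p}; only meant for p < 1 outside the image of G."""
    return max(x for x in scale if G(f, x) > p)

def grades_are_quantiles(f, scale):
    """Check grade_{m+1-k}(f) == Q((k - 1/2) / m) for k = 1..m."""
    m = len(f)
    return all(
        grade(f, m + 1 - k) == quantile_off_image(f, Fraction(2 * k - 1, 2 * m), scale)
        for k in range(1, m + 1)
    )

scale = range(5)            # x_0 < ... < x_4, encoded by their indices
judgements = [3, 0, 2, 2]   # four judges
```

Sorting discards the order of the judges, so `grade` is anonymous, and on a constant list it returns that constant, so it is unanimous.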
A functional $S : L^\Omega \to \mathrm{Intv}(L)$ is called strategy-proof-in-grading if for any $f, g : \Omega \to L$ and any $\omega_0 \in \Omega$

$$f(\omega_0) \succ S(f), \quad g(\omega) = f(\omega) \ \forall \omega \ne \omega_0 \ \Rightarrow \ S(f) \succeq S(g), \tag{3}$$

$$f(\omega_0) \prec S(f), \quad g(\omega) = f(\omega) \ \forall \omega \ne \omega_0 \ \Rightarrow \ S(f) \preceq S(g). \tag{4}$$

Notice that $f(\omega_0) \in L$ and $S(f) \in \mathrm{Intv}(L)$ could be incomparable. As we have seen in Lemma 4.3 this does not happen for Fan-Sugeno functionals. Condition (3) (analogously (4)) excludes that, for the competitor with score $f$, the outcome $S(f)$ can be manipulated by judge $\omega_0$ in favor of this competitor: if judge $\omega_0$ values a competitor better than the jury, he could not have increased the jury's valuation $S(f)$ by changing his valuation $f(\omega_0)$.

Proposition 5.1 For finite $\Omega$, Fan-Sugeno functionals are strategy-proof-in-grading on $L^\Omega$, i.e. for single valued functions.

Proof By symmetry it is sufficient to prove one of the implications (3) and (4). Let us take $f$ and $g$ with the assumptions of (4). We distinguish two cases. First, if $g(\omega_0) \ge f(\omega_0)$, then $g \ge f$, so that $S_{\mu,l}(g) \succeq S_{\mu,l}(f)$ by monotonicity of the Fan-Sugeno functional ([3] Proposition 7.1). Second, suppose $g(\omega_0) < f(\omega_0)$. For $x > f(\omega_0)$ we then get $\{\omega \mid g(\omega) \ge x\} = \{\omega \mid f(\omega) \ge x\}$ and similarly with $>$, so that by (??) the distribution functions of $g$ and $f$ coincide for these $x$. Then also the pseudo inverse functions coincide for $p < G_{\mu,f}(f(\omega_0))$,

$$Q_{\mu,g}(p) = Q_{\mu,f}(p) \quad \text{for } p < G_{\mu,f}(f(\omega_0)).$$

Since by assumption $f(\omega_0) \prec S(f)$ we conclude from (2) that $S_{\mu,l}(g) = S_{\mu,l}(f)$.

[1] Theorem 2 states that the class of strategy-proof-in-grading social grading functions consists only of the $k$-th grades (Example 3.3). Proposition 5.1 seems to contradict this result since the class of Fan-Sugeno functionals, which are strictly monotone, is much larger than the class of $k$-th grades (Proposition 4.2). But one has to recall that [1] allows only single valued functionals and, if the median is a proper interval, takes the bottom of this interval. The other intention of [1], to get a ranking for the competitors, i.e.
for functions $f$, seems to be more complicated in our context since our aggregation functionals are interval valued and intervals may be incomparable. But our extended scale $\mathrm{Intv}(L)$ is a graded lattice (see Section 2), which at least allows us to compare ranks, i.e. to use $\varrho \circ S$ as aggregation functional. But then the number of ties could increase. Another possibility could be to use one of the linear orders $\le_{\mathrm{lex}}$ or $\le^{\mathrm{lex}}$, extending the partial order $\preceq$ on $\mathrm{Intv}(L)$ to a linear order (see Section 2). This may be appropriate if the bottom (respectively the top) of the intervals should get more weight for the ranking. Anyway, ties $S(f) = S(g)$ remain a problem for rankings.
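On a small jury and a small scale, conditions (3) and (4) can be checked by brute force for single valued functionals. The Python sketch below is an illustration only and makes assumptions beyond the text: the premises of (3) and (4) are read as strict inequalities, scale values are integer indices, and the functional returns a single scale value. The rounded-down arithmetic mean is included merely as a cardinal contrast; it is not discussed in the paper. All names are hypothetical.

```python
from itertools import product

def grade(f, k):
    """k-th order value (k = 1..m) of a tuple of judgements."""
    return sorted(f)[k - 1]

def floor_mean(f):
    """A cardinal aggregation used only for contrast (not from the paper)."""
    return sum(f) // len(f)

def is_strategy_proof(S, m, scale):
    """Exhaustively test (3) and (4) for all f in scale^m and all single-judge deviations."""
    for f in product(scale, repeat=m):
        for w0 in range(m):
            for v in scale:
                g = f[:w0] + (v,) + f[w0 + 1:]            # judge w0 changes the grade to v
                if f[w0] > S(f) and not S(f) >= S(g):     # condition (3) violated
                    return False
                if f[w0] < S(f) and not S(f) <= S(g):     # condition (4) violated
                    return False
    return True

median_of_three = lambda f: grade(f, 2)
```

With three judges and the scale `range(4)`, the check passes for every `grade(f, k)` and fails for `floor_mean`: in `(0, 0, 2)` the third judge lies above the outcome 0 and can raise it to 1 by reporting 3.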
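6 Bipolar scales

The interval extension $\mathrm{Intv}(L)$ of the linear scale $L = [O, I]$ from Section 2 and the interval scale $\mathrm{Intv}(-L)$ of the opposite linear scale $-L = [-I, -O]$ with inverse ordering are united to the bipolar scale $\mathcal{R}$, where $O$ and $-O$ are identified,

$$\mathcal{R} := \mathrm{Intv}(L) \cup (-\mathrm{Intv}(L)) / (O = -O).$$

The ordering of $\mathcal{R}$ is denoted with $\preceq$ like for $\mathrm{Intv}(L)$ and the lattice operations with $\wedge$ and $\vee$. The latter will be modified below in order to mimic the behavior of addition and multiplication on $\mathbb{R} \cup \{-\infty, \infty\}$ w.r.t. the sign of the reals. The set of one-point intervals in $\mathcal{R}$ is

$$R := L \cup (-L) / (O = -O).$$

Like $L$, also $R$ is a linearly ordered set and a complete lattice. $-I$ is the bottom of $R$, $I$ the top and $O$ is called the neutral point. The bipolar scale $\mathcal{R}$ is a proper sublattice of $\mathrm{Intv}(R)$, $\mathcal{R} \subsetneq \mathrm{Intv}(R)$, for example $R = [-I, I] \notin \mathcal{R}$.

The reflection $\mathrm{Intv}(R) \to \mathrm{Intv}(R)$, $[a, b] \mapsto [-b, -a]$, at $O \in R$ is an antitone bijection. It maps $\mathcal{R}$ to $\mathcal{R}$ and $\mathrm{Intv}(L)$ to $-\mathrm{Intv}(L)$. The absolute value is defined as usual,

$$|\cdot| : \mathcal{R} \to \mathrm{Intv}(L), \quad |X| := \begin{cases} X & \text{if } X \succeq O \\ -X & \text{if } X \preceq O. \end{cases}$$

The sign function is

$$\mathrm{sign} : \mathcal{R} \to R, \quad \mathrm{sign}\, X := \begin{cases} I & \text{for } X \succ O \\ O & \text{for } X = O \\ -I & \text{for } X \prec O. \end{cases}$$

Next we extend meet $\wedge$ and join $\vee$ on $\mathrm{Intv}(L)$ to bipolar meet $\curlywedge$ and bipolar join $\curlyvee$ on $\mathcal{R}$, setting for $X, Y \in \mathcal{R}$

$$X \curlywedge Y := \begin{cases} |X| \wedge |Y| & \text{if } \mathrm{sign}\, X = \mathrm{sign}\, Y \\ -(|X| \wedge |Y|) & \text{else,} \end{cases}$$

$$X \curlyvee Y := \begin{cases} X \vee Y & \text{if } X, Y \succeq O \\ X \wedge Y & \text{if } X, Y \preceq O \\ X & \text{if } \mathrm{sign}\, X \ne \mathrm{sign}\, Y, \ |X| \succ |Y| \\ Y & \text{if } \mathrm{sign}\, X \ne \mathrm{sign}\, Y, \ |X| \prec |Y| \\ O & \text{if } X = -Y \text{ or } |X|, |Y| \text{ incomparable in } \mathrm{Intv}(L). \end{cases}$$

Bipolar meet resembles multiplication of real numbers with the property "the product of negative numbers is positive" etc. Similarly bipolar join resembles addition with the property "the sum of $x$ and $-x$ is zero". For the properties of the bipolar operations see Proposition 2.1 in [3].

Here is an example for a bipolar scale in management science, but with cardinal aggregation.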
Example 6.1 The Net Promoter Score, a management tool, uses as scale the natural numbers from 0 to 10 with the natural ordering. This scale serves only as a frame for the inquiry; the evaluation is done with the bipolar scale $R = \{-I, O, I\}$, where $-I$ represents the values 0 to 6 of the original scale, $O$ the values 7 to 8 and $I$ the values 9 to 10. Here $\Omega$ is the set of inquired persons, each having equal weight, i.e. $\mu$ is the uniform distribution (Example 3.1). The result of an inquiry is a function $f : \Omega \to R$. The Net Promoter Score is defined as

$$\mu(f = I) - \mu(f = -I).$$

That is just the integral $\int f \, d\mu$ if one perceives $f$ as a real valued function with values in $\{-1, 0, 1\}$, replacing $R$. In Example 7.1 we will see how this score can be modified to become a purely ordinal one.

7 Aggregation of bipolar functions

If gains and losses are possible, experimental findings suggest⁵ that the positive and negative part of a function have to be aggregated separately. There result two Fan-Sugeno functionals, a symmetric and an asymmetric one. Which one to apply in practice depends on the application. Examples for different types of applications should be elaborated.

For a function $f : \Omega \to \mathrm{Intv}(R)$ we define the positive and negative parts $f^+, f^- : \Omega \to \mathrm{Intv}(L)$ by

$$f^+(\omega) := f(\omega) \vee O, \qquad f^- := (-f)^+ \quad \text{or} \quad f^-(\omega) = -(f(\omega) \wedge O),$$

and $f$ can be reconstructed from both parts,

$$f(\omega) = \begin{cases} f^+(\omega) & \text{if } f^-(\omega) = O \\ -f^-(\omega) & \text{if } f^+(\omega) = O \\ f^+(\omega) \cup (-f^-(\omega)) & \text{else,} \end{cases} \quad \text{for } \omega \in \Omega.$$

If $f(\omega) \in \mathcal{R}$ for all $\omega$, then $f$ is the bipolar join of $f^+$ and $-f^-$ as defined in the last section,

$$f = f^+ \curlyvee (-f^-) \quad \text{for } f : \Omega \to \mathcal{R}.$$

The symmetric Fan-Sugeno functional or integral of a function $f$ is now defined as the bipolar difference of the functional's values of the positive and negative parts of $f$,

$$SS_{\mu,l}(f) := S_{\mu,l}(f^+) \curlyvee (-S_{\mu,l}(f^-)), \quad f : \Omega \to \mathrm{Intv}(R).$$

As the name suggests, it is symmetric w.r.t. reflection, $SS_{\mu,l}(-f) = -SS_{\mu,l}(f)$.

⁵ See Section 1 of [3] for a discussion of this point.
In general $f$ and $f^+ \curlyvee (-f^-)$ are not equal, but their symmetric Fan-Sugeno integrals coincide.

The asymmetric Fan-Sugeno functional of a function $f$ ...

The next example shows that the asymmetric Fan-Sugeno functional attributes the weight of the answer "don't know" to the neutral point $O$ of the bipolar scale and not to the bottom of the scale as in Example 3.2 with unipolar scale.

Example 7.1 Resuming Example 6.1 we allow for intervals as answers of the inquiry, $f : \Omega \to \mathrm{Intv}(R)$.
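As a point of reference for the ordinal modification, the cardinal Net Promoter Score of Example 6.1 can be sketched in a few lines of Python. The sketch assumes the usual cut points of the score (answers 0 to 6 map to $-I$, 7 to 8 to $O$, 9 to 10 to $I$) and encodes $-I, O, I$ as $-1, 0, 1$; only single valued answers are covered, not the interval valued answers of Example 7.1. The function names are hypothetical.

```python
from fractions import Fraction

def to_bipolar(answer):
    """Map a 0..10 answer to -1 (for -I), 0 (for O) or 1 (for I)."""
    if not 0 <= answer <= 10:
        raise ValueError("answer must lie in 0..10")
    if answer <= 6:
        return -1
    if answer <= 8:
        return 0
    return 1

def net_promoter_score(answers):
    """mu(f = I) - mu(f = -I) for the uniform distribution mu on the respondents."""
    f = [to_bipolar(a) for a in answers]
    m = len(f)
    return Fraction(f.count(1), m) - Fraction(f.count(-1), m)

survey = [10, 9, 8, 7, 3]          # five respondents
score = net_promoter_score(survey)
```

The score equals the arithmetic mean of the encoded values, which is the integral mentioned in Example 6.1 and the reason why this aggregation is cardinal rather than ordinal.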
References

[1] M. Balinski and R. Laraki: Judge: Don't Vote! École Polytechnique CNRS, Cahier n° 2010-27 (2010).
[2] M. Balinski and R. Laraki: Majority judgement: measuring, ranking and electing. MIT Press, Boston 2011.
[3] D. Denneberg and M. Grabisch: Measure and integral with purely ordinal scales. Journal of Mathematical Psychology 48 (2004), pp. 15-27.
[4] M. Sugeno: Theory of fuzzy integrals and its applications. Ph.D. Thesis, Tokyo Institute of Technology (1974).
[5] D.M. Topkis: Supermodularity and complementarity. Princeton University Press, Princeton 1998.