OPHI Working Paper 26

Rank Robustness of Composite Indices*

James E. Foster, Vanderbilt University and University of Oxford
Mark McGillivray, Australian Agency for International Development
Suman Seth, Vanderbilt University

Preliminary draft. Not for quotation or citation without permission.

Abstract

Many common multidimensional indices take the form of a composite index, or a weighted average of several dimension-specific achievements. Rankings arising from such an index are dependent upon an initial weighting vector, and any given judgment could, in principle, be reversed if an alternative weighting vector was employed. This paper examines a variable-weight robustness criterion for composite indicators that views a comparison as robust if the ranking is not reversed at any weight vector within a given set. We characterize the resulting robustness relations for various sets of weighting vectors and illustrate how they moderate the complete ordering generated by the composite indicator. We propose a measure by which the robustness of a given comparison may be gauged and illustrate its usefulness using data from the Human Development Index. In particular, we show how some country rankings are fully robust to changes in weights while others are quite fragile. We investigate the prevalence of the different levels of robustness in theory and practice and offer insight as to why certain datasets tend to have more robust comparisons.

JEL Classifications: I31, O12, O15, C02.

Keywords: Composite indicator, Multidimensional index, Human Development Index, Weighting vector, Robustness, Positive association, Rank correlation, Kendall's tau.

OPHI Working Paper 26, January 2009

* This paper has benefited from discussions with, and comments on earlier drafts from, a number of people. They include Sabina Alkire, Farhad Noorbakhsh, Nora Markova, Tony Shorrocks and participants of the Oxford Poverty and Human Development Initiative (OPHI) Workshop on Weighting held on 27-28 May June 2008 at the University of Oxford and the participants of the American Economic Association Meeting 2009 held at San Francisco. The usual disclaimer applies. Correspondence to James Foster, james.e.foster@vanderbilt.edu.

Department of Economics, Box 1611 Station B, Vanderbilt University, Nashville, TN 37235, USA, +1 615 322 2192, James.E.Foster@vanderbilt.edu, and Oxford Poverty & Human Development Initiative (OPHI), Queen Elizabeth House (QEH), Department of International Development, 3 Mansfield Road, Oxford OX4 1SD, UK, +44 1865 271915.

Chief Economist, Australian Agency for International Development (AusAID), GPO Box 887, Canberra ACT 2601, Australia, +61 2 6206 43, Mark.McGillivray@ausaid.gov.au.

Department of Economics, Vanderbilt University, VU Station B#351819 2301, Vanderbilt Place, Nashville, TN 37235, USA, +1 615 383 6910, Suman.Seth@vanderbilt.edu.

www.ophi.org.uk

1. Introduction

Remarkable attention is given to rankings arising from various indicators. This is especially true of country rankings. People are naturally curious as to how their country compares to others, national pride is often at stake, and national governments are often quick to claim credit for a high or higher than expected ranking if it can be linked, dubiously or otherwise, to public policy. More generally, the media, business groups, civil society, sections of the research community and international organisations regularly monitor and report on country rankings of indices assessing a variety of phenomena such as sustainability, corruption, rule of law, national income, economic policy efficacy, institutional performance, happiness, human well-being, transparency, globalisation, human freedom, peace or vulnerability.

It is widely recognised that many of the preceding phenomena are multidimensional. This, combined with the availability of more and better data, has in recent decades led to the increased use of composite indices. These indices by their very nature combine in various ways indicators of achievement in the dimensions of the phenomenon in question. Many of the rankings that attract greatest interest arise from indices of this type.

The interest of national governments and others in rankings arising from composite indices is, however, blind to long-held concerns regarding their construction. A central concern is the weighting of dimension-specific achievements. In a perfect world the weight vectors would be based on information on a meta production function for the phenomenon in question. An absence of accepted information on these functions has resulted in one of three weighting schemes. The most common is to select weights arbitrarily, typically by taking the simple arithmetic mean of the indicators in question. [1] Using this mean is interpreted as assigning equal weights to each dimension.
The proponents of this equal weight approach acknowledge that it is deficient, as in reality the dimensions will almost certainly have differential importance, but argue that there is no accepted basis or guidance for doing otherwise. In this sense the equal weight approach is seen as the least deficient available weighting scheme, one that is likely to attract the least disagreement. [2] Ambiguity over the numerical values of weights employed by composite indices naturally leads one to question the ranking arising from these measures.

[1] Other approaches are either normative or statistical. The normative approach involves setting weights in accordance with either individual or societal norms, the former often being those of the designers of the index in question. The statistical approach is purely data-driven. Many different such approaches have been proposed, the most popular being principal components analysis, with the first principal component extracted from the dimension achievement indicators serving as the composite index. Both approaches are fundamentally flawed, the former because of a lack of guidance as to whose norms should be used and the latter because of difficulties in interpretation.

[2] For example, the proponents of the Environmental Sustainability Index (ESI) argued for equal weights on the grounds that no objective mechanism exists to determine the relative importance of the different aspects of environmental sustainability. Other composite indices used in environmental, well-being and related fields that employ equal weights include the Child Well-being Index, Commitment to Development Index, Economic Resilience Index, Economic Vulnerability Index, Environmental Performance Index, Environmental Sustainability Index, Gender Empowerment Measure, Gender-related Development Index, Genuine Progress Measure, Global Peace Index, Human Development Index, Human Poverty Index, Index of Economic Freedom and the Physical Quality of Life Index. In most of the above cases the index is formed by taking the simple arithmetic mean of the component indicators.

Specifically, one can ask to what extent these rankings are dependent upon the initial weighting vector, and whether any given judgment could be reversed if an alternative weighting vector was employed. Such is the focus of this paper. Using a dominance-based analytical framework, it examines a variable-weight robustness criterion for composite indicators that views a comparison as robust if rankings are not reversed at any weight vector within a given set. The paper characterizes the resulting robustness relations for various sets of weighting vectors and illustrates how they moderate the complete ordering generated by the composite indicator in question. It proposes a measure by which the robustness of a given comparison may be gauged and illustrates its usefulness using data from the Human Development Index (HDI). The HDI is a very well known and widely used measure of well-being at the level of nations, and the rankings it provides are the subject of intense international interest. [3] The paper shows how some country rankings are fully robust to changes in weights while others are quite fragile, investigates the prevalence of the different levels of robustness in theory and practice, and offers insight as to why certain datasets tend to have more robust comparisons.

It should be emphasised from the outset that the fundamental purpose of the paper is not to discourage the reporting or use of multidimensional indices and the rankings they provide. Rather, it is to facilitate more incisive interpretation of these rankings.

The remainder of the paper is structured as follows. Section II provides a description of the mathematical concepts, notation and definitions used throughout the paper. A formal treatment of the notion of dominance and its relation to rank robustness is provided in Section III. Section III also defines and characterizes a partial ordering, analogous to that of Foster and Shorrocks (1988), that facilitates the construction of a measure of robustness. Section IV constructs a rank robustness measure.
Section V looks at the prevalence of robust comparisons, highlighting how the number of ambiguous comparisons across an entire sample of observations depends on the association between the dimension indicators used in the index in question. The HDI and a number of other indices are used in this section to illustrate key points. The paper concludes in Section VI with some reflections on the future design of composite indices, in particular trade-offs between rank robustness and empirical redundancy.

[3] The annual publication of the Human Development Report is a much awaited international event owing almost entirely to the HDI country ranking it contains. This is evident from a 2006 article in the New York Times, which with a not insignificant dose of fanfare reported that for the sixth year in a row, Norway was ranked first on the United Nations' human development index as the country providing its citizens with the best chance of living a long and prosperous life (New York Times, 2006).

2. Concepts, Definitions and Notation

This section provides the mathematical definitions and notation used in the paper. The number of dimensions to be summarized in a composite index is denoted by an integer $D \geq 2$. An achievement vector $x = (x_1, \ldots, x_D) \in R^D$ indicates the score achieved in each of the $D$ dimensions, while $X \subseteq R^D$ gives the set of all achievement vectors that are possible. A dataset is a finite set $\hat{X}$ of $\hat{n}$ elements drawn from $X$. For any $a, b \in R^D$, let $|a| = \sum_{d=1}^{D} a_d$ denote the sum of $a$'s components and let $a \cdot b = \sum_{d=1}^{D} a_d b_d$ represent the inner product of $a$ and $b$. The expression $a \geq b$ indicates that $a_d \geq b_d$ for $d = 1, \ldots, D$; this is the vector dominance relation. If $a \geq b$ with $a \neq b$, then this situation is denoted by $a > b$; while $a >> b$ indicates that $a_d > b_d$ for $d = 1, \ldots, D$. The least upper bound of $a$ and $b$, denoted by $a \vee b$, is the vector having $\max\{a_i, b_i\}$ as its $i$th coordinate; the greatest lower bound of $a$ and $b$, denoted by $a \wedge b$, is the vector having $\min\{a_i, b_i\}$ as its $i$th coordinate.

*** Figure 1 Here ***

The unit simplex is defined as $S = \{s \in R^D : s \geq 0 \text{ and } |s| = 1\}$ and is depicted in Figure 1 above for $D = 3$. It contains all possible ways of weighting the various achievements. The vertices of the simplex are given by $v_d = e_d$ for $d = 1, \ldots, D$, where $e_d$ is the usual basis element that places full weight on the single achievement $d$. The centroid or central point of the simplex, $v^0 = (1/D, \ldots, 1/D)$, places equal weight on all achievements. Clearly, $v^0$ is the simple average of the vertices of $S$, or $v^0 = \sum_{d=1}^{D} v_d / D$. For example, when $D = 3$, the vertices of the simplex $S$ are $v_1 = (1,0,0)$, $v_2 = (0,1,0)$, and $v_3 = (0,0,1)$, while the centroid is $v^0 = (1/3, 1/3, 1/3)$.

In what follows, we construct smaller versions of $S$ by proportionally contracting the vertices of $S$ toward a given point $w^0$ of $S$. For any $\lambda \in (0,1]$ and $d = 1, 2, \ldots, D$, define $v_d^\lambda = (1-\lambda) w^0 + \lambda v_d$, and let $S^\lambda$ be the regular simplex generated by the vertices $v_1^\lambda, v_2^\lambda, \ldots, v_D^\lambda$; equivalently, let $S^\lambda$ be the convex hull of $\{v_1^\lambda, v_2^\lambda, \ldots, v_D^\lambda\}$. For example, if $\lambda = 0.25$ and $w^0 = v^0$, then $v_1^\lambda = (0.5, 0.25, 0.25)$, $v_2^\lambda = (0.25, 0.5, 0.25)$, and $v_3^\lambda = (0.25, 0.25, 0.5)$, as illustrated in Panel I of Figure 2. The resulting simplex $S^\lambda$ is outlined by an equilateral triangle whose vertices are one-fourth of the way to the vertices of $S$ from the centroid $v^0$. If $w^0 = (0.6, 0.2, 0.2)$, then $v_1^\lambda = (0.7, 0.15, 0.15)$, $v_2^\lambda = (0.45, 0.4, 0.15)$, and $v_3^\lambda = (0.45, 0.15, 0.4)$, and $S^\lambda$ is the regular simplex depicted in Panel II.
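The contraction of the simplex toward an initial weighting vector can be checked numerically. The following is a minimal sketch, not part of the paper: the function name `contracted_vertices` is hypothetical, and it simply evaluates the vertex formula (1 - lambda) * w0 + lambda * e_d for each basis element.

```python
def contracted_vertices(w0, lam):
    """Return the vertices (1 - lam) * w0 + lam * e_d of the contracted simplex.

    w0  : initial weighting vector (non-negative entries summing to 1)
    lam : contraction parameter in (0, 1]
    The function name and signature are illustrative assumptions.
    """
    if not 0 < lam <= 1:
        raise ValueError("lam must lie in (0, 1]")
    D = len(w0)
    vertices = []
    for d in range(D):
        # e_d contributes lam to coordinate d and nothing elsewhere
        vertex = [(1 - lam) * w0[i] + (lam if i == d else 0.0) for i in range(D)]
        vertices.append(vertex)
    return vertices


# The two three-dimensional examples given in the text, with lam = 0.25
centroid = [1 / 3, 1 / 3, 1 / 3]
for v in contracted_vertices(centroid, 0.25):
    print([round(c, 4) for c in v])
for v in contracted_vertices([0.6, 0.2, 0.2], 0.25):
    print([round(c, 4) for c in v])
```

Up to floating-point rounding, the printed vertices should agree with the two worked examples in the text; setting `lam = 1` recovers the vertices of the full simplex.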
Note that as $\lambda$ drops to 0, the simplex $S^\lambda$ shrinks to the point $w^0$, while if $\lambda = 1$, we have $S^\lambda = S$.

*** Figure 2 Here ***

Lemma 1 Let $v^0$ be the centroid of the unit simplex $S$. For any $w^0 \in S$ such that $w^0 \neq v^0$, $S^\lambda(w^0)$ and $S^\lambda(v^0)$ are equal in volume for all $\lambda \in (0,1]$.

Proof The $d$th vertices of the simplexes $S^\lambda(w^0)$ and $S^\lambda(v^0)$ are given by the vectors $v_d^\lambda(w^0) = (1-\lambda) w^0 + \lambda v_d$ and $v_d^\lambda(v^0) = (1-\lambda) v^0 + \lambda v_d$, respectively. The difference between the $d$th vertices of these two simplexes is given by the vector $\delta_d^\lambda = v_d^\lambda(w^0) - v_d^\lambda(v^0) = (1-\lambda)(w^0 - v^0)$. Therefore, $\delta_d^\lambda = (1-\lambda)(w^0 - v^0)$ for all $d$, and $S^\lambda(w^0)$ is obtained from $S^\lambda(v^0)$ by shifting all vertices of the latter by $\delta_d^\lambda$. Hence, $S^\lambda(w^0)$ and $S^\lambda(v^0)$ are equal in volume for all $\lambda \in (0,1]$. [4]

3. Robust Comparisons

While there are many conceivable ways of aggregating achievements, we focus here on a common form of multidimensional index based on a weighted average of individual levels. A composite indicator $C: X \times S \to R$ applies the entries of a weighting vector $w \in S$ to the respective entries in the achievement vector $x \in X$ and sums to obtain the general form $C(x;w) = w \cdot x$. To implement this approach in practice one must select a specific weighting vector. In what follows, it is assumed that an initial weighting vector $w^0 \in S$ has already been chosen; this fixes the specific composite indicator $C_0: X \to R$ defined by $C_0(x) = C(x;w^0)$ for all $x \in X$. The associated strict ordering of achievement vectors will be denoted by $C_0$, so that $x \, C_0 \, y$ if and only if $C_0(x) > C_0(y)$.

Arguably the world's best known composite multidimensional index is the HDI. It first appeared in the United Nations Development Programme (UNDP) Human Development Report 1990 (UNDP, 1990). HDI values have since been published annually for more than 170 countries. The HDI provides information on $D = 3$ achievements, namely, health, education and income. [5] The weighting used in the HDI is $v^0$, or equal weighting, and the resulting composite index has the form $C_0(x) = v^0 \cdot x = (1/3) x_1 + (1/3) x_2 + (1/3) x_3$.

Table 1 provides data for the ten countries with the highest HDI levels in 2004 as given in UNDP (2006). The countries are listed in order of HDI from highest (Norway), to second highest (Iceland), and so forth, until the tenth highest country (Netherlands). The associated information on country comparisons is also represented in matrix form in Table 2. Whenever a column country has a higher HDI value than a row country, this is indicated with $C_0$ in the associated cell. Note that every cell below the diagonal is filled, reflecting the fact that the HDI, like any composite index, generates a complete ordering once a specific weighting vector has been chosen.
*** Table 1 Here ***

*** Table 2 Here ***

However, it should be borne in mind that cross-country HDI comparisons are entirely contingent on the chosen weighting vector $v^0$ and reveal little about the robustness of judgments as the weights are varied. Indeed, look at the twin examples of Australia versus Sweden and Ireland versus Canada. The HDI of Australia exceeds that of Sweden by about 0.006, and Ireland is higher than Canada by the same margin.

[4] Note that if $\lambda = 1$, then $\delta_d^1 = 0$ and $S^1(w^0) = S^1(v^0)$.

[5] The health index is based on life expectancy; the education index is based on enrolment and literacy rates; and the income index is based on per capita Gross Domestic Product. For a detailed derivation of the dimension-specific indices, see the technical note (UNDP, 2006, p. 394).

It is an easy matter to show that Australia has a higher level of the composite indicator than Sweden for all weighting vectors $w \in S$. On the other hand, the pair-wise ranking of Ireland and Canada can be easily reversed: for example, it is reversed at the weighting vector $w'$ obtained from $v^0$ by shifting 0.05 of weight from each of the education and income dimensions to the health dimension. In symbols, let $C$ be the composite indicator over the $D = 3$ dimensions of health, education and income, and let $x$, $y$, $x'$, and $y'$ denote the respective achievement vectors for Australia, Sweden, Ireland and Canada. Then for Australia and Sweden we have $C(x;v^0) > C(y;v^0)$ and this ranking is never reversed at any $w \in S$; while for Ireland and Canada, $C(x';v^0) > C(y';v^0)$ and this ranking is reversed for the alternative weighting vector $w'$ given above. In sum, $C_0$ comparisons that appear to be identical in Tables 1 and 2 can be differentially robust, and this in turn may have some bearing on our interpretation of such comparisons.

To differentiate among $C_0$ comparisons (such as those found in Table 2) we formulate a binary relation over achievement vectors in $X$ that will indicate when an initial comparison is robust. Our construction begins with a set $W \subseteq S$ of reasonable weighting vectors containing the initial vector $w^0$. We say that $x$ robustly dominates $y$ given $W$, written $x \, C_W \, y$, if and only if $C(x;w^0) > C(y;w^0)$ and $C(x;w) \geq C(y;w)$ for all $w \in W$. In other words, the composite indicator is higher for $x$ than $y$ when the weighting vector is $w^0$, and this ranking is never reversed for any other weighting vector in $W$. The relation $C_W$ is clearly transitive, so that if $x \, C_W \, y$ and $y \, C_W \, z$, then $x \, C_W \, z$. However, it will often be the case that $x \, C_W \, y$ does not hold despite $x \, C_0 \, y$ being true, since $C_W$ requires there to be no reversal for any vector in $W$. As $W$ expands, the likelihood of a reversal rises, and fewer robust comparisons can be made; in other words, if $W \subseteq W'$ then $x \, C_{W'} \, y$ implies $x \, C_W \, y$, but not vice versa.
This paper explores the robustness relation $C_W$ for various specifications of the set $W$, provides characterizations of the relations, and offers insight on their applicability.

A. Full Robustness

If $W = S$, the associated relation $C_W$ applies when an initial judgment $x \, C_0 \, y$ is never reversed at any configuration of weights. In this case we say that the comparison is fully robust and shall denote the relation $C_W$ by $C^1$. Requiring unanimity over all of $S$ is quite demanding and consequently $C^1$ is the least applicable among all robustness relations; however, when it applies, the associated ranking of achievement vectors is maximally robust.

The examples of Australia versus Sweden and Ireland versus Canada from Table 1 suggest a simple characterization of $C^1$. Notice that Australia is higher than Sweden in each of the three dimensions, and hence $C^1$ holds in this case. In contrast, Ireland is higher than Canada in two dimensions and lower in one, which is why $C^1$ does not apply (i.e., the ranking is reversed when the weight is high enough on the reversed dimension). We have the following result.

Theorem 1 Let $x \, C_0 \, y$ for $x, y \in X$. Then $x \, C^1 \, y$ if and only if $x \geq y$.

Proof Suppose that $x \, C_0 \, y$ is true. If $x \geq y$ holds, then clearly $C(x;w) = w \cdot x \geq w \cdot y = C(y;w)$ for all $w \in S$, and thus $x \, C^1 \, y$. Conversely, if $x \, C^1 \, y$ holds, then setting $w = v_d$ in $C(x;w) \geq C(y;w)$ yields $x_d \geq y_d$ for all $d$, and hence $x \geq y$.
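Theorem 1 reduces the check for full robustness to a coordinate-wise comparison. The sketch below illustrates this under stated assumptions: the function names `composite` and `fully_robust` are hypothetical, and the achievement vectors are invented for illustration and are not the HDI figures from Table 1.

```python
def composite(x, w):
    """Composite indicator C(x; w) = w . x (weighted sum of achievements)."""
    return sum(wd * xd for wd, xd in zip(w, x))


def fully_robust(x, y, w0):
    """Check full robustness of the ranking of x over y via Theorem 1.

    Returns True when x ranks strictly above y at the initial weights w0
    and x is at least as high as y in every dimension.
    """
    ranked_above = composite(x, w0) > composite(y, w0)
    dominates = all(xd >= yd for xd, yd in zip(x, y))
    return ranked_above and dominates


w0 = [1 / 3, 1 / 3, 1 / 3]      # equal weights
# Hypothetical achievement vectors (health, education, income)
a = [0.95, 0.99, 0.96]
b = [0.93, 0.98, 0.95]          # a is higher in every dimension
c = [0.88, 0.99, 0.99]
d = [0.92, 0.97, 0.96]          # c is higher on average, lower in dimension 1

print(fully_robust(a, b, w0))   # vector dominance holds
print(fully_robust(c, d, w0))   # ranking reverses when all weight is on dimension 1
```

In the second pair the ranking at equal weights favours `c`, yet placing full weight on the first dimension favours `d`, which is the kind of reversal the theorem rules out only under vector dominance.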

In order to check whether a given ranking $x \, C_0 \, y$ is fully robust, one need only verify that the achievement levels in $x$ are at least as high as the respective levels in $y$. [6] The following corollary provides two alternative sets of sufficient conditions for $C^1$.

Corollary Let $x, y \in X$. Then $x \, C^1 \, y$ holds if (i) $x >> y$ or if (ii) $w^0 >> 0$ and $x > y$.

Proof Both sets of conditions entail $x \geq y$, hence by Theorem 1 we need only verify that both imply $x \, C_0 \, y$. If $x >> y$, then for any $w \in S$ we have $w \cdot x > w \cdot y$, hence $x \, C_0 \, y$. If $w^0 >> 0$ and $x > y$, it follows immediately that $w^0 \cdot x > w^0 \cdot y$, and hence $x \, C_0 \, y$.

One interesting implication of Theorem 1 is that the relation $C^1$ is meaningful when variables are ordinal and no basis of comparison between them has been fixed. [7] Suppose that each variable $x_d$ in $x$ is independently altered by its own monotonically increasing transformation $f_d(x_d)$ and let $y = (f_1(x_1), \ldots, f_D(x_D))$ be the resulting transformed achievement vector. [8] We can show that $x \, C^1 \, x'$ implies $y \, C^1 \, y'$ for the respective transformed vectors $y$ and $y'$. Indeed, if $x \, C^1 \, x'$, we know that $x \, C_0 \, x'$ holds by definition, while $x \geq x'$ is true by Theorem 1. Transforming the variables yields $y \geq y'$ and thus (1) $w \cdot y \geq w \cdot y'$ for all $w \in S$ since $w \geq 0$. By $x \, C_0 \, x'$ it follows that $w^0 \cdot x > w^0 \cdot x'$ is true; thus, for some $d$ we have $w_d^0 x_d > w_d^0 x'_d$, and hence $x_d > x'_d$ with $w_d^0 > 0$. Through the transformation we obtain $y_d > y'_d$ and this, when combined with $w_d^0 > 0$ and (1), yields $w^0 \cdot y > w^0 \cdot y'$, or $y \, C_0 \, y'$. By (1), then, we have $y \, C^1 \, y'$. In other words, if $C^1$ holds for any given cardinalization of the ordinal variables, it holds for all cardinalizations. Note that while $C_0$ on its own is not meaningful in this context (since transforming variables can lead to $y' \, C_0 \, y$ even though initially we had $x \, C_0 \, x'$), the fully robust relation $C^1$ has the property that $x \, C^1 \, x'$ if and only if $y \, C^1 \, y'$, and hence is an appropriate technology for ordinal variables.
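The invariance argument can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the paper's procedure: the helper names, the three increasing transformations, and the achievement vectors are all hypothetical. It applies a separate increasing transformation to each dimension and compares the ranking at the initial weights with the fully robust relation before and after.

```python
import math


def composite(x, w):
    """Composite indicator C(x; w) = w . x."""
    return sum(wd * xd for wd, xd in zip(w, x))


def fully_robust(x, y, w0):
    """Theorem 1 check: strict ranking at w0 plus coordinate-wise dominance."""
    return composite(x, w0) > composite(y, w0) and all(
        xd >= yd for xd, yd in zip(x, y)
    )


def transform(x, fs):
    """Apply one monotonically increasing function per dimension."""
    return [f(xd) for f, xd in zip(fs, x)]


w0 = [1 / 3, 1 / 3, 1 / 3]
# Hypothetical increasing transformations, one per dimension
fs = [lambda t: 10 * t, math.exp, lambda t: t ** 3]

a, b = [0.95, 0.99, 0.96], [0.93, 0.98, 0.95]   # a dominates b
c, d = [0.88, 0.99, 0.99], [0.92, 0.97, 0.96]   # no dominance

# The fully robust comparison survives the change of scale
print(fully_robust(a, b, w0), fully_robust(transform(a, fs), transform(b, fs), w0))

# The ranking at the initial weights alone can flip
print(composite(c, w0) > composite(d, w0))
print(composite(transform(c, fs), w0) > composite(transform(d, fs), w0))
```

Here the first transformation stretches the one dimension in which `d` leads, which is enough to overturn the equal-weight ranking of `c` over `d`, while the dominance-based comparison of `a` and `b` is unaffected.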
Now returning to the case of the HDI, we might ask how many of the 45 $C_0$ comparisons given in Table 2 are fully robust. The answer, provided in Table 3, is that just four comparisons exhibit vector dominance of achievement vectors and hence $C^1$. This is perhaps not unexpected due to the narrow differences in HDI values among the highest ranked countries. The picture for the entire dataset reveals greater applicability for the relation $C^1$: for the 177 countries, there are a total of 10,875 fully robust comparisons out of a possible 15,576 comparisons, implying that about 69.8% of all comparisons are fully robust. Why are there so many fully robust HDI comparisons? We return to this point below in Section V.

*** Table 3 Here ***

[6] Note that $x \, C_0 \, y$ precludes the possibility that $x = y$, and hence $x \, C^1 \, y$ actually entails $x > y$.

[7] Roberts () meaningful.

[8] The resulting function $f: X \to R^D$ defined by $f(x) = (f_1(x_1), \ldots, f_D(x_D))$ is called a monotonically increasing transformation below.

B. Limited Robustness

Whereas the previous section took $W$ to be the entire simplex $S$, we now consider using the smaller simplex $S^\lambda$ of weighting vectors. This will lead to a less demanding robustness relation than $C^1$, but one that is more generally applicable. Recall that the simplex $S^\lambda$ is the convex hull of vertices $\{v_1^\lambda, \ldots, v_D^\lambda\}$ located a fixed proportion $\lambda$ of the way from $w^0$ to the vertices of $S$. When $w^0 = v^0$, the resulting simplex $S^\lambda$ is a smaller version of $S$ with $v^0$ at its centre; for general $w^0$, the set $S^\lambda$ is a scaled down version of $S$ that preserves the relative position of $w^0$. In either case, the size of $S^\lambda$ is always the same for fixed $\lambda \in (0,1]$.

Substituting $W = S^\lambda$ in the definition of $C_W$ yields the $\lambda$th order robustness relation, denoted here by $C^\lambda$ and defined as follows: $x \, C^\lambda \, y$ if and only if $x \, C_0 \, y$ and $C(x;w) \geq C(y;w)$ for all $w \in S^\lambda$. This relation retains all the properties of the general robustness relation, and since the sets $S^\lambda$ are nested for a fixed $w^0$, we know that $x \, C^\lambda \, y$ implies $x \, C^{\lambda'} \, y$ whenever $\lambda > \lambda'$.

Now suppose that $x \, C_0 \, y$ holds for the pair $x$ and $y$ of achievement vectors. What additional conditions on $x$ and $y$ are needed to ensure that $x \, C^\lambda \, y$? One easy-to-verify set of necessary conditions is for $C(x;w) \geq C(y;w)$ at each vertex $w = v_d^\lambda$ of $S^\lambda$. Indeed, define $x^\lambda = (x_1^\lambda, \ldots, x_D^\lambda)$ where $x_d^\lambda = C(x; v_d^\lambda) = v_d^\lambda \cdot x$, and let $y^\lambda$ be the analogous vector derived from $y$. Then the necessary condition can be stated as $x^\lambda \geq y^\lambda$. The next theorem shows that, in fact, this is also sufficient.

Theorem 2 Let $x \, C_0 \, y$ for $x, y \in X$. Then $x \, C^\lambda \, y$ if and only if $x^\lambda \geq y^\lambda$.

Proof We need only verify that $x \, C_0 \, y$ and $x^\lambda \geq y^\lambda$ imply $x \, C^\lambda \, y$. Pick any $w \in S^\lambda$, and note that since $S^\lambda$ is the convex hull of its vertices, $w$ can be expressed as a convex combination of $v_1^\lambda, \ldots, v_D^\lambda$, say $w = \alpha_1 v_1^\lambda + \cdots + \alpha_D v_D^\lambda$ where $\alpha_1 + \cdots + \alpha_D = 1$ and $\alpha_d \geq 0$ for $d = 1, \ldots, D$. But then $C(x;w) = w \cdot x = \alpha_1 v_1^\lambda \cdot x + \cdots + \alpha_D v_D^\lambda \cdot x = \alpha_1 x_1^\lambda + \cdots + \alpha_D x_D^\lambda$, and similarly $C(y;w) = \alpha_1 y_1^\lambda + \cdots + \alpha_D y_D^\lambda$; therefore $x^\lambda \geq y^\lambda$ implies $C(x;w) \geq C(y;w)$. Since $w$ was an arbitrary element of $S^\lambda$, it follows that $x \, C^\lambda \, y$.
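Theorem 2 means that limited robustness can be checked at the D vertices of the contracted simplex alone. A minimal sketch follows, with hypothetical function names and invented achievement vectors. It uses the fact that the composite value at a contracted vertex equals (1 - lambda) times the composite value at the initial weights plus lambda times the achievement in that dimension, which follows directly from the definition of the contracted vertices.

```python
def composite(x, w):
    """Composite indicator C(x; w) = w . x."""
    return sum(wd * xd for wd, xd in zip(w, x))


def lambda_vector(x, w0, lam):
    """Composite values of x at the vertices of the contracted simplex.

    Entry d equals (1 - lam) * C(x; w0) + lam * x_d.
    """
    c0 = composite(x, w0)
    return [(1 - lam) * c0 + lam * xd for xd in x]


def lambda_robust(x, y, w0, lam):
    """Theorem 2 check: strict ranking at w0 and no reversal at any vertex."""
    if composite(x, w0) <= composite(y, w0):
        return False
    xl = lambda_vector(x, w0, lam)
    yl = lambda_vector(y, w0, lam)
    return all(a >= b for a, b in zip(xl, yl))


w0 = [1 / 3, 1 / 3, 1 / 3]
# Hypothetical achievement vectors: x is higher on average, lower in dimension 1
x = [0.90, 0.98, 0.95]
y = [0.94, 0.90, 0.93]

print(lambda_robust(x, y, w0, 0.25))   # True: no reversal on the small simplex
print(lambda_robust(x, y, w0, 0.50))   # False: a larger simplex admits a reversal
print(lambda_robust(x, y, w0, 1.00))   # False: not fully robust
```

The three calls show the nesting property in action: a comparison that is robust on a small simplex around the initial weights can lose robustness as the simplex grows toward the full set of weights.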
Theorem 2 shows that to evaluate whether a given comparison $x \, C_0 \, y$ exhibits $\lambda$th order robustness, one need only compare the associated vectors $x^\lambda$ and $y^\lambda$. If each component of $x^\lambda$ is at least as large as the respective component of $y^\lambda$, then the comparison is robust according to $C^\lambda$; if $x^\lambda$ has at least one component lower than the respective entry of $y^\lambda$, then the original ranking is not robust. Note that when $\lambda = 1$, we have $x^\lambda = x$, and Theorem 2 reduces to Theorem 1. [9]

*** Table 4 Here ***

*** Table 5 Here ***

Table 4 illustrates this approach for $\lambda = 0.25$ for the ten highest HDI countries, with the final three columns listing the entries of the associated vector $x^\lambda$. The table reveals a host of comparisons that can be made using $C^\lambda$ for this value of $\lambda$. In particular, the ranking between Iceland and USA, which was determined not to be fully robust, is $\lambda$th order robust, as is apparent from the dominance of the last three entries for Iceland over the respective entries for USA. On the other hand, the comparison between Ireland and Canada, which Table 1 showed was not fully robust, is also not robust in the present case, since the fourth column entry for Canada is higher than the third column entry for Ireland. Table 5 lists all comparisons among the top ten countries that can be made using the robustness relation $C^\lambda$ for $\lambda = 0.25$; fully 46.7% of the possible comparisons can now be made. For the entire list of 177 countries, 91.8% of the comparisons exhibit $\lambda$th order robustness for this level of $\lambda$.

[9] Note that since $v_d^\lambda = (1-\lambda) w^0 + \lambda v_d$, it follows that $x_d^\lambda = v_d^\lambda \cdot x = (1-\lambda) C_0(x) + \lambda x_d$; in other words, the vector $x^\lambda$ is a convex combination of the vector $(C_0(x), \ldots, C_0(x))$ of average achievements and $x$ itself.

C. Graphical Depiction

Figure 3 uses data from Table 1 to provide graphical representations of the conditions associated with the various robustness relations. In each panel, the two-dimensional simplex $S$ is depicted in the horizontal plane at the base of the graph, as are its three vertices $v_d$ and the initial weighting vector $v^0$ at its centre. The smaller simplex $S^\lambda$ and its vertices $v_d^\lambda$ are also represented within $S$ (for the case $\lambda = 0.25$). Now suppose that a given country with achievement vector $x$ has been selected. For any weighting vector $w$ in the simplex, the level of the composite indicator $C(x;w)$ is graphed as the height above the vector $w$. Thus, the heights at $v_1$, $v_2$, $v_3$, and $v^0$ are, respectively, $x_1$, $x_2$, $x_3$, and the HDI. The linearity of $C$ in $w$ ensures that these points and the remaining $C(x;w)$ values form a tilted achievement simplex with vertices as high as the dimensional achievements and a centre as high as the country's HDI level.

*** Figure 3 Here ***

Comparisons for three pairs of countries - Australia and Sweden, Iceland and USA, and Ireland and Canada - are depicted. In Panel 1 the achievement simplex of Australia is completely above the achievement simplex for Sweden, reflecting the vector dominance of the respective achievement vectors in Table 1. Australia has the higher HDI and it is clear from the graph that there is no weighting vector for which it has a lower composite indicator level than Sweden. This is an example where the relation $C^1$, or complete robustness, holds.
The second panel depicts a rather different scenario for Iceland and USA: the achievement simplexes intersect and $C^1$ cannot hold. Iceland performs better than the USA in terms of both health and education, but the USA's achievement is higher for income. More weight on health and education makes Iceland's level of the composite indicator higher than that of the USA, whereas more weight on income makes the USA's level higher. Dominance does hold if we restrict consideration to the smaller simplex $S^\lambda$. When the intersection is projected down to $S$, the resulting (dashed) line does not cross $S^\lambda$ and hence all weights in $S^\lambda$ follow the original HDI ranking in selecting Iceland above USA. Indeed, Table 4 confirms that Iceland has higher levels of the composite indicator than USA at each of the vertices of $S^\lambda$. Consequently, while $C^1$ does not hold, $C^\lambda$ certainly does for $\lambda = 0.25$.

The final panel depicts the case of Ireland and Canada, which has the same HDI difference as Australia and Sweden and intersecting achievement simplexes like Iceland and USA, but has different robustness characteristics than each. While Ireland's education and income variables are higher than Canada's, the health index has the opposite orientation, and $C^1$ cannot hold. If we project the intersection of the respective achievement simplexes on $S$, we obtain a dashed line that cuts $S^\lambda$, implying that $C^\lambda$ does not hold. The Ireland-Canada comparison is not robust even for $\lambda = 0.25$. This is also evident from Table 4 since Ireland has higher levels of the composite indicator at two of the vertices of $S^\lambda$ (namely, $v_2^\lambda$ and $v_3^\lambda$) and a lower level at the remaining one ($v_1^\lambda$).

4. Measuring Robustness

Up to now, our method of evaluating the robustness of a comparison $x \, C_0 \, y$ has been to fix a set $S^\lambda$ of reasonable weighting vectors and then to confirm that the initial ranking is not reversed at any member of $S^\lambda$, in which case the associated $\lambda$th order robustness relation $C^\lambda$ applies. Theorem 2 provides simple conditions for checking whether $x \, C^\lambda \, y$ holds. The present section augments this approach by formulating a robustness measure that associates with any ranking $x \, C_0 \, y$ a number $\lambda^*$ between 0 and 1 to indicate its level of robustness.

We construct $\lambda^*$ using two statistics - one that might be expected to move in line with robustness and another that is likely to work against it. The first of these is $\Delta_0 = C(x;w^0) - C(y;w^0) > 0$, or the difference between the composite value of $x$ and the composite value of $y$ at the initial weighting vector $w^0$. Intuitively, $\Delta_0$ is an indicator of the strength of the dominance of $x$ over $y$ at the initial weighting vector. The second is $\Delta_m = \max_{w \in S} \max\{C(y;w) - C(x;w),\, 0\}$, or the maximal contrary difference between the composite values of $y$ and $x$. Note that when the original comparison is fully robust, then $C(y;w) - C(x;w) \leq 0$ for all $w \in S$ and there is no contrary difference. Consequently $\Delta_m = 0$. On the other hand, when the comparison is not fully robust, then $C(y;w) - C(x;w) > 0$ for some $w \in S$, and hence $\Delta_m = \max_{w \in S} [C(y;w) - C(x;w)] > 0$. The quantity $\Delta_m$ is the worst-case estimate of how far the original difference could be reversed at some other weighting vector. The measure of robustness we propose is given by $\lambda^* = \Delta_0 / (\Delta_0 + \Delta_m)$.
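The measure can be computed directly from two achievement vectors. The sketch below is an illustration with hypothetical names and invented data. It relies on one observation: because the contrary difference C(y;w) - C(x;w) is linear in w, its maximum over the simplex is attained at a vertex, so the worst-case reversal is simply the largest coordinate difference between y and x, floored at zero.

```python
def composite(x, w):
    """Composite indicator C(x; w) = w . x."""
    return sum(wd * xd for wd, xd in zip(w, x))


def robustness_level(x, y, w0):
    """Robustness measure delta_0 / (delta_0 + delta_m) for the ranking of x over y.

    delta_0 : margin of x over y at the initial weights w0 (must be positive)
    delta_m : largest margin of y over x at any vertex of the simplex, or 0
    """
    delta_0 = composite(x, w0) - composite(y, w0)
    if delta_0 <= 0:
        raise ValueError("x must rank strictly above y at the initial weights")
    delta_m = max(0.0, max(yd - xd for xd, yd in zip(x, y)))
    return delta_0 / (delta_0 + delta_m)


w0 = [1 / 3, 1 / 3, 1 / 3]
# Hypothetical achievement vectors, not taken from the HDI tables
x = [0.90, 0.98, 0.95]
y = [0.94, 0.90, 0.93]
a = [0.95, 0.99, 0.96]
b = [0.93, 0.98, 0.95]

print(round(robustness_level(x, y, w0), 3))   # margin 0.02 against worst case 0.04
print(robustness_level(a, b, w0))             # vector dominance, so fully robust
```

For the first pair the initial margin is 0.02 and the worst-case reversal is 0.04, giving a level of about one third; for the second pair there is no contrary difference and the level is 1.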
Notice that when the initial comparison $x \, C_0 \, y$ is fully robust, then $\Delta_m = 0$ and hence $\lambda^* = 1$. Alternatively, suppose that the initial comparison is not fully robust so that $\Delta_m > 0$. Then it is clear that $\lambda^*$ is strictly increasing in the magnitude of the initial comparison $\Delta_0$, and strictly decreasing in the magnitude of the contrary worst-case evaluation $\Delta_m$. In addition, if $\Delta_0$ tends to 0 while $\Delta_m$ remains fixed, the measure of robustness $\lambda^*$ will also tend to 0. These characteristics accord well with an intuitive understanding of how $\Delta_0$ and $\Delta_m$ affect robustness.

Practical applications of $\lambda^*$ may be hampered by the fact that $\Delta_m$ requires a maximization problem to be solved, namely, $\max_{w \in S} [C(y;w) - C(x;w)]$. However, by the linearity of $C(y;w) - C(x;w) = (y - x) \cdot w$ in $w$, the problem has a solution at some vertex $v_d$ of $S$. At the vertex $w = v_d$ the difference $C(y;w) - C(x;w)$ becomes $y_d - x_d$, and so $\Delta_m$ can be calculated as $\Delta_m = \max_d (y_d - x_d)$, or the maximum coordinate difference between $y$ and $x$. The measure $\lambda^*$ can be readily derived using this equivalent definition for $\Delta_m$.

For example, recall the case of Ireland and Canada, whose respective achievement vectors $x'$ and $y'$ are found in Table 1. The initial HDI difference $\Delta_0 = C(x';v^0) - C(y';v^0)$ is $0.956 - 0.950 = 0.006$. The maximal coordinate difference $y'_d - x'_d$ is in dimension $d = 1$ so that $\Delta_m = 0.919 - 0.882 = 0.037$. Aggregation gives us a $\lambda^*$ of $0.006/(0.006 + 0.037) \approx 0.14$. In contrast, the comparison of Australia with Sweden yields the same HDI difference $\Delta_0 = 0.006$, but $\Delta_m = 0$ since all dimensional differences support the initial ranking of Australia over Sweden; hence $\lambda^* = 1$. Finally, the example of Iceland and USA produces $\Delta_0 = 0.012$ and $\Delta_m = 0.031$ and hence a robustness level of about $\lambda^* = 0.28$.

Now what is the relationship between the robustness measure $\lambda^*$ and the relations $C^\lambda$ developed in the previous section? The following theorem provides the answer.

Theorem 3 Suppose that $x \, C_0 \, y$ for $x, y \in X$ and let $\lambda^*$ be the robustness level associated with this comparison. Then the $\lambda$th order robustness relation $x \, C^\lambda \, y$ holds for $0 < \lambda \leq \lambda^*$ and does not for $\lambda^* < \lambda \leq 1$.

Proof Let $x \, C_0 \, y$ and suppose that $0 < \lambda \leq \lambda^*$. By definition of $\lambda^*$, we have $\lambda \leq \Delta_0 / (\Delta_0 + \Delta_m)$ and hence $\lambda \Delta_m \leq (1-\lambda) \Delta_0$. Pick any $d = 1, \ldots, D$. Then using the definitions of $\Delta_0$ and $\Delta_m$ we see that $\lambda (y_d - x_d) \leq (1-\lambda)(w^0 \cdot x - w^0 \cdot y)$ and hence $\lambda v_d \cdot y + (1-\lambda) w^0 \cdot y \leq \lambda v_d \cdot x + (1-\lambda) w^0 \cdot x$. Consequently, $v_d^\lambda \cdot y \leq v_d^\lambda \cdot x$, and since this is true for all $d$, it follows that $x^\lambda \geq y^\lambda$ and hence $x \, C^\lambda \, y$ by Theorem 2. Alternatively, suppose that $x \, C_0 \, y$ and yet $\lambda^* < \lambda \leq 1$. Then $(1-\lambda) \Delta_0 < \lambda \Delta_m$ so that $(1-\lambda)(w^0 \cdot x - w^0 \cdot y) < \lambda (y_d - x_d)$ for some $d$, and hence $v_d^\lambda \cdot y > v_d^\lambda \cdot x$ or $y_d^\lambda > x_d^\lambda$ for this same $d$. It follows, then, that $x^\lambda \geq y^\lambda$ cannot hold, and neither can $x \, C^\lambda \, y$ by Theorem 2.

Theorem 3 shows that the measure of robustness $\lambda^*$ is closely related to the robustness relations $C^\lambda$. Indeed, $\lambda^*$ is the largest $\lambda$ for which $x \, C^\lambda \, y$ holds or, equivalently, for which $S^\lambda$ has no weighting vector that reverses the initial ranking. Panel 1 of Figure 4 depicts the dashed line of intersection where Iceland and USA have the same value of the composite indicator. Also depicted are the simplexes corresponding to three values of $\lambda$. The smallest simplex, whose $\lambda$ lies below $\lambda^*$, contains only weighting vectors that yield a strictly higher value for Iceland.
The largest simplex ($r > r^*$) is cut by the dashed line and hence it contains a region where USA has higher values. The middle simplex ($r = r^*$) contains vectors for which Iceland has the higher value, and a single vector (the vertex that just touches the dashed line) for which the values are the same. A smaller $r$ would leave room for the simplex to expand without reversing the initial comparison; a larger $r$ would lead to a reversal. Consequently, $r = r^*$ is the robustness level of the comparison. Panel 2 illustrates the analogous construction for a case where the initial weights are not equal. Note that the $r^*$ value found here is larger, showing that $r^*$ may well depend upon the initial weighting vector $w^0$.

*** Figure 4 Here ***

*** Table 6 Here ***

To summarize, we have defined a measure of robustness $r^*$, shown how it can be calculated in practice, and provided an alternative interpretation in terms of the $C^r$ relations and their associated simplexes. Table 6 illustrates this methodology for the ten-country HDI example via a robustness profile that lists the robustness value $r^*$ (in percentage terms) for each of the 45 possible comparisons. This includes the information given in Table 3 (which highlights the four entries with $r^* = 100\%$) and Table 5 (which depicts the 21 comparisons with $r^* \ge 25\%$), and can easily identify the $C^r$ comparisons for any given $r$. Note that the average value of $r^*$ in Table 3 is only 35%, which reflects the fact that the HDI levels are quite similar for the high HDI countries and so the $\Delta_0$ values are smaller. The next section will apply the methods to datasets associated with the HDI and other well-known composite indicators to evaluate the applicability of the $C^r$ relations in practice.

5. The Prevalence of Robust Comparisons

The focus now shifts from individual comparisons to the entire collection of comparisons associated with a given dataset $\hat{X}$ and an initial weighting vector $w^0$. The first question is how to judge the overall robustness of the dataset. One option would be to use an aggregate measure (such as the mean) that is strictly increasing in each comparison's robustness level. However, rather than settling on a specific measure we use a prevalence function based on the entire cumulative distribution of robustness levels, and employ a criterion analogous to first order stochastic dominance to indicate greater robustness. We then apply the methodology to several datasets and investigate how various changes affect the prevalence function.

We begin with an initial weighting vector $w^0$ and a dataset $\hat{X}$ containing $\hat{n}$ observations. Without loss of generality, we enumerate the elements of $\hat{X}$ as $x^1, x^2, \ldots, x^{\hat{n}}$ where $C_0(x^1) \ge C_0(x^2) \ge \cdots \ge C_0(x^{\hat{n}})$. The analysis can be simplified by assuming that no two observations in $\hat{X}$ have the same composite value, so that $C_0(x^1) > C_0(x^2) > \cdots > C_0(x^{\hat{n}})$.¹⁰ There are $\hat{k} = \hat{n}(\hat{n}-1)/2$ ordered pairs of observations $x^i$ and $x^j$ with $i < j$, and each comparison $x^i\,C_0\,x^j$ has an associated robustness level $r^*_{ij}$. Let $P = [r^*_{ij}]$ represent the robustness profile of $\hat{X}$ (given $w^0$), which lists the level of robustness $r^*_{ij}$ for every ordered pair in a manner similar to Table 6.
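The robustness profile just defined can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the vertex shortcut (the maximum coordinate difference) for the worst-case gap, truncates that gap at zero so that fully robust comparisons return a level of 1, and assumes no ties in composite values. The function names, the zero-based pair indices and the three-observation dataset are hypothetical and are not taken from the paper's tables.

```python
from itertools import combinations


def composite(x, w):
    """Composite index C(x; w) = w . x."""
    return sum(wd * xd for wd, xd in zip(w, x))


def robustness_from_gaps(delta_0, delta_m):
    """Robustness level Delta_0 / (Delta_0 + Delta_m), for Delta_0 > 0, Delta_m >= 0."""
    return delta_0 / (delta_0 + delta_m)


def robustness_level(x, y, w0):
    """Robustness level of the comparison 'x ranked above y' at initial weights w0."""
    delta_0 = composite(x, w0) - composite(y, w0)
    if delta_0 <= 0:
        raise ValueError("x must have the strictly higher composite value")
    # Vertex shortcut: the worst case over the simplex is the largest
    # coordinate difference y_d - x_d (assumption: truncated at zero).
    delta_m = max(0.0, max(yd - xd for xd, yd in zip(x, y)))
    return robustness_from_gaps(delta_0, delta_m)


def robustness_profile(data, w0):
    """Map each pair (i, j), with i ranked above j, to its robustness level."""
    ranked = sorted(data, key=lambda x: composite(x, w0), reverse=True)
    return {
        (i, j): robustness_level(ranked[i], ranked[j], w0)
        for i, j in combinations(range(len(ranked)), 2)
    }


# Hypothetical three-dimensional data with equal initial weights.
data = [(0.9, 0.8, 0.7), (0.6, 0.9, 0.6), (0.5, 0.5, 0.5)]
w0 = (1 / 3, 1 / 3, 1 / 3)
profile = robustness_profile(data, w0)

# The gaps reported in the text for Ireland and Canada give about 0.14.
ireland_canada = robustness_from_gaps(0.006, 0.037)
```

In this hypothetical dataset the second observation beats the first in one dimension, so that pair has a level of about one half, while the other two pairs satisfy vector dominance and are fully robust.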
The mean robustness level in profile $P$ is given by $\sum_i \sum_{j>i} r^*_{ij} / \hat{k}$; it is the average level of robustness of the $\hat{k}$ many comparisons. Of course, a higher mean level does not necessarily ensure that the prevalence of $C^r$ is higher for any given $r$. An alternate approach is to summarize robustness levels in a way that reflects the entire distribution, and not just the mean. For any given dataset $\hat{X}$ and initial weighting vector $w^0$, define the prevalence function $p:[0,1] \to [0,1]$ to be the function which associates with each $r \in [0,1]$ the share $p(r) \in [0,1]$ of the $\hat{k}$ comparisons whose robustness levels are at least $r$. In other words, $p(r)$ is the proportion of comparisons for which the $C^r$ relation applies.¹¹

Suppose that $p$ and $q$ are the prevalence functions for $\hat{X}$ (given $w^0$) and $\hat{Y}$ (given $u^0$), respectively. We say that $\hat{X}$ has greater robustness than $\hat{Y}$ if $p(r) \ge q(r)$ for all $r \in [0,1]$, with $p(r) > q(r)$ for some $r \in [0,1]$. In words, no matter the level of robustness, the share of all comparisons that exhibit $r$th order robustness is no lower for $\hat{X}$ than $\hat{Y}$, and for some $r$ it is higher.

¹⁰ This is true for each of the examples presented below.
¹¹ At $r = 0$ the complete relation $C_0$ is used and hence $p(0) = 1$.
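The prevalence function and the dominance criterion above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names and the two lists of robustness levels are hypothetical. The check relies on the fact that a prevalence function is a step function that changes only at observed robustness levels, so comparing the two functions at those levels (plus the level 1) covers the whole interval.

```python
def prevalence(levels, r):
    """Share p(r) of comparisons whose robustness level is at least r."""
    return sum(level >= r for level in levels) / len(levels)


def has_greater_robustness(levels_x, levels_y):
    """True if p(r) >= q(r) for all r in [0, 1], with p(r) > q(r) for some r."""
    # Both step functions are constant between consecutive observed levels,
    # so evaluating at the observed levels and at r = 1 is exhaustive.
    points = sorted(set(levels_x) | set(levels_y) | {1.0})
    p = [prevalence(levels_x, r) for r in points]
    q = [prevalence(levels_y, r) for r in points]
    never_lower = all(pr >= qr for pr, qr in zip(p, q))
    somewhere_higher = any(pr > qr for pr, qr in zip(p, q))
    return never_lower and somewhere_higher


# Hypothetical robustness levels for the comparisons in two datasets.
levels_x = [1.0, 1.0, 0.5]
levels_y = [1.0, 0.4, 0.2]
x_dominates = has_greater_robustness(levels_x, levels_y)
```

With these hypothetical levels the first dataset has two fully robust comparisons out of three and the second has one, and the first is never behind at any lower level, so it has greater robustness in the sense defined above.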

The two are said to have the same robustness if their prevalence functions are the same.

A. Some Examples

Figure 5 depicts the prevalence functions obtained from several datasets associated with well-known composite indicators. The first two are from the 1998 and 2004 Human Development Index,¹² which uses equal weights across three dimensions (health, education and income) to rank 177 countries. Next is the 2008 Index of Economic Freedom (IEF) created by the Wall Street Journal and the Heritage Foundation to measure the degree of economic freedom across countries.¹³ There are 10 dimensions, spanning the spectrum from business freedom to labor freedom, all weighted equally in $w^0$. The IEF dataset covers 157 countries for the year 2007.

*** Figure 5 Here ***

The final indicator is the 2008 Environmental Performance Index (EPI), developed by Yale University, Columbia University, the World Economic Forum and the Joint Research Centre of the European Commission to complement the environmental targets of the UN Millennium Development Goals. The EPI is a composite indicator whose variables may be viewed at various levels of aggregation.¹⁴ For purposes of illustration, we consider three versions here. EPI 10 has ten variables and the initial weights are not all equal. EPI 6 combines variables (and initial weights) to obtain a six variable version with unequal weights. EPI 2 aggregates further to obtain two variables with equal weight on each. As the initial weighting vectors are consistent, each version of the EPI produces identical values at its respective initial weighting vector; but due to the different numbers of variables, each has distinct robustness characteristics.¹⁵ The EPI dataset covers 149 countries during the year 2007.

Several initial observations can be made from the prevalence functions given in Figure 5. Each graph is downward sloping, reflecting the fact that as $r$ rises, the set $S^r$ expands, and hence the number of comparisons that can be made by $C^r$ is lower (or no higher).
As $r$ falls to 0, each function rises to 100% comparability for $C^r$; in the other direction, the value of $p(r)$ at $r = 1$ is the percentage of the comparisons that are fully robust. There is a wide variation in $p(1)$ across datasets. It is clearly highest for the HDI examples, with $p(1)$ being about 69.8% in 2004 and 73.2% in 1998; it is 47.4% for the two variable EPI; and it is much lower for the remaining indicators¹⁶ (4.2% and 1.5% in the case of EPI 6 and EPI 10, respectively, and 6.5% for the IEF). For $r$ between 0 and 1, the HDI comparisons are also more robust than the comparisons of the EPI and the IEF, and of the two HDI datasets, 1998 exhibits greater robustness than 2004. For the EPI examples, a higher level of aggregation, and hence a lower number of variables, leads to greater robustness. However, EPI 2 is still less robust than either of the HDIs, which have three variables. The shapes of the $p(r)$ functions are different, with some being essentially linear, and others exhibiting pronounced curves.

¹² Note that the Human Development Indices for the years 1998 and 2004 are obtained from UNDP (2000 and 2006), respectively.
¹³ The ten dimensions of economic freedom are Business Freedom, Trade Freedom, Fiscal Freedom, Government Size, Monetary Freedom, Investment Freedom, Financial Freedom, Property Rights, Freedom from Corruption, and Labor Freedom. For each dimension, the score is normalized between zero and one hundred. The final score is obtained as the simple average of these scores.
¹⁴ The EPI is composed of twenty-five dimensions of environmental performance. However, at the objective level all dimensions are summarized in two categories with equal weights: environmental health and ecosystem vitality. At the policy level, all twenty-five dimensions are summarized into six dimensions, and the weight vector used is (0.5, 0.025, 0.075, 0.075, 0.075, 0.25). Further, these six dimensions are sub-categorized into ten dimensions with the weight vector (0.25, 0.125, 0.125, 0.025, 0.075, 0.075, 0.025, 0.025, 0.025, 0.25). For detailed information on the indicators, see http://epi.yale.edu/methodology.
¹⁵ For consistent initial weighting vectors, the weight on an aggregated variable is the sum of the weights on the variables that were aggregated. See the next subsection.

Drawing on these examples, we now examine the prevalence of robustness from a more theoretical perspective. What transformations allow the resulting datasets to be compared in terms of robustness?

B. Fixed Robustness and Transformations

We begin with transformations of the data that leave $p(r)$ unchanged and hence yield pairs of datasets with the same robustness properties. A monotonically increasing transformation of $X$ is a function $f: X \to R^D$ that can be written as $f(x) = (f_1(x_1), \ldots, f_D(x_D))$ where each function $f_d(x_d)$ is monotonically increasing; a common-slope affine transformation of $X$ has the additional property that each function $f_d(x_d)$ can be written as $f_d(x_d) = \alpha x_d + \beta_d$ for some $\alpha > 0$ and $\beta_d$ in $R$. We say that $\hat{Y}$ is obtained from $\hat{X}$ by a common-slope affine transformation (respectively, by a monotonically increasing transformation) if $\hat{Y} = \{f(x): x \in \hat{X}\}$ for some transformation $f$ having the appropriate property. Applying a monotonically increasing transformation to a dataset preserves the orderings of achievements within each dimension, but can disrupt the weighted averages across dimensions.
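A small numerical sketch shows how a monotonically increasing transformation can keep every within-dimension ordering intact and still change a weighted average comparison. The two observations, the equal weights and the choice of a square root on the first dimension are hypothetical, picked only to make the effect visible; they are not drawn from the HDI data.

```python
import math


def composite(x, w):
    """Composite index C(x; w) = w . x."""
    return sum(wd * xd for wd, xd in zip(w, x))


def transform(x, fs):
    """Apply one increasing function per dimension: (f_1(x_1), ..., f_D(x_D))."""
    return tuple(f(xd) for f, xd in zip(fs, x))


w0 = (0.5, 0.5)        # hypothetical equal initial weights
x_prime = (0.9, 0.2)   # hypothetical observation x'
x = (0.5, 0.5)         # hypothetical observation x

# Square root on dimension 1 (increasing and concave), identity on dimension 2.
fs = (math.sqrt, lambda t: t)
y_prime, y = transform(x_prime, fs), transform(x, fs)

before = composite(x_prime, w0) - composite(x, w0)  # positive: x' ranked above x
after = composite(y_prime, w0) - composite(y, w0)   # negative: ranking reversed
```

The concave transformation compresses the lead that the first observation holds in dimension 1, so its deficit in dimension 2 now dominates the average even though neither dimension's ordering has changed.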
In particular, it is possible that $C(x';w^0) > C(x;w^0)$ and $C(y';w^0) < C(y;w^0)$ where $y'$ and $y$ are transformations of $x'$ and $x$, respectively, which implies that the robustness profiles of $\hat{Y}$ and $\hat{X}$ can be rather different for the same $w^0$. On the other hand, if we restrict consideration to common-slope affine transformations, we see that $C(y;w) = w \cdot y = \alpha\,w \cdot x + w \cdot \beta$ where $\beta = (\beta_1, \ldots, \beta_D)$, and hence $C(x';w) \ge C(x;w)$ if and only if $C(y';w) \ge C(y;w)$, where $y'$ and $y$ are the respective transformations of $x'$ and $x$. In this case $\hat{Y}$ and $\hat{X}$ have the same robustness profile and hence the same prevalence function $p(r)$ given $w^0$. So, for example, if every dimension is scaled up or down in the same proportion, this will leave $p(r)$ unchanged, as will simply adding a different constant to each dimension. On the other hand, multiplying each dimension by a different positive constant alters the implicit weighting across dimensions, potentially changing the rankings of transformed observations. Using an arbitrary monotonic increasing transformation likewise can alter rankings and lead to different prevalence functions for the transformed dataset. Note, though, that fully robust comparisons are preserved under a monotonic transformation (as noted in the discussion following Theorem 1), and hence the prevalence $p(1)$ of full robustness does not change. These results are summarized in the following theorem.¹⁷

¹⁶ For EPI 6 and EPI 10, we also calculate the prevalence function using initial equal weights. We find that the dominance relation holds between EPI 6 and EPI 10: EPI 6 is more robust than EPI 10 for all $r$. The relationship is explained in Figure 6.

Theorem 4 Suppose that the initial weighting vector is fixed. If $\hat{Y}$ is obtained from $\hat{X}$ by a monotonically increasing transformation, then $\hat{Y}$ and $\hat{X}$ share the same prevalence value $p(1)$. If $\hat{Y}$ is obtained from $\hat{X}$ by a common-slope affine transformation, then they share the same prevalence function $p(r)$.¹⁸

In the example of the HDI, the normalized income, education and health variables used to construct index values are actually monotonic transformations of underlying variables involving a nonlinear function in the case of income, and affine transformations with different slopes across the three variables. Consequently, the specific shapes of the transformations can influence HDI comparisons as well as their measured robustness levels. However, as indicated in Theorem 4, these transformations do not influence fully robust comparisons and $p(1)$. If one restricted consideration to $C^1$ comparisons, there would be no need to select the right transformations or even to transform variables at all: one could use the original income, education and health variables directly.

A second form of transformation replaces each variable in the achievement vector with one or more copies of that variable. A replicating transformation of $X$ is a function $f: X \to R^{D'}$ for some $D' > D$ such that $f(x) = (f_1(x_1), \ldots, f_D(x_D))$, where each $f_d(x_d)$ is the $k_d$-fold replication $(x_d, x_d, \ldots, x_d)$ for some integer $k_d \ge 1$. We say that $\hat{Y}$ is obtained from $\hat{X}$ by a replicating transformation if $\hat{Y} = \{f(x): x \in \hat{X}\}$ for some transformation $f$ of this type. Transformed achievement vectors have higher dimension $D'$ and, consequently, the associated weighting vectors must be adjusted to account for this. Now, which initial weighting vector $u^0$ for $\hat{Y}$ would correspond to the original $w^0$ for $\hat{X}$?
One option is to divide the weight $w^0_d$ equally among the associated dimensions in $u^0$; however, it turns out that any allocation of the weight $w^0_d$ across its associated dimensions will do. We say $u^0$ is consistent with $w^0$ if, for each $d = 1, \ldots, D$, the weight $w^0_d$ on $x_d$ is equal to the sum of the $k_d$ entries in $u^0$ associated with $f_d(x_d) = (x_d, x_d, \ldots, x_d)$. So for example, if $D = 2$ and $f$ replicates each entry two times, then $u^0 = (½, 0, ¼, ¼)$ is consistent with $w^0 = (½, ½)$. We have the following result.

Theorem 5 If $\hat{Y}$ is obtained from $\hat{X}$ by a replicating transformation, and $u^0$ is consistent with $w^0$, then $\hat{Y}$ and $\hat{X}$ have the same prevalence function $p(r)$.

Proof Suppose that $y$ is a replicated achievement vector associated with $x$, so that $y = f(x)$ for a replicating transformation $f$. Given the initial weighting vector $w^0$ and a consistent weighting vector $u^0$, it is clear that $C(y;u^0) = u^0 \cdot f(x) = w^0 \cdot x = C(x;w^0)$. Now let $r \in (0,1]$ and select any $d = 1, \ldots, D$ along with an index value $d'$ of one of its copies. Let $v^r_d$ denote the dimension $d$ vertex of the simplex $S^r$ in $R^D$ and let $v^r_{d'}$ denote the dimension $d'$ vertex of the simplex $S^r$ in $R^{D'}$. It is clear that $C(x; v^r_d) = v^r_d \cdot x = (1-r)C(x;w^0) + r x_d = (1-r)C(y;u^0) + r y_{d'} = v^r_{d'} \cdot y = C(y; v^r_{d'})$. Hence, where $y'$ and $y$ are the respective transformations of $x'$ and $x$, we have (i) $C(x';w^0) \ge C(x;w^0)$ if and only if $C(y';u^0) \ge C(y;u^0)$, and (ii) $C(x'; v^r_d) \ge C(x; v^r_d)$ if and only if $C(y'; v^r_{d'}) \ge C(y; v^r_{d'})$. Since (ii) holds for each $d$ and every associated $d'$, it follows from Theorem 2 that $x'\,C^r\,x$ if and only if $y'\,C^r\,y$, and $p(r)$ is the same for both.

¹⁷ The result on monotonic transformations would be true even if the initial weighting vectors are different, so long as both are strictly positive. The role played by common-slope affine transformations is similar to assumptions used in social choice theory: see Blackorby, Donaldson, and Weymark (1984).
¹⁸ The first part of Theorem 4 will generate the same prevalence value $p(1)$ even if we use the dominance criterion proposed by Cherchye, Ooghe, and Puyenbroeck (2005).

In other words, appending copies of one or more existing variables leaves the comparisons and the robustness properties of a dataset unaffected, as long as the effective weight on each variable is unchanged. As an example, consider what would happen if the education variable in an HDI dataset were replicated to obtain a four variable dataset. Using equal weights of ¼ for the four dimensional dataset would likely alter rankings since this would, in effect, increase the aggregate weight on education. However, if the total weight on the two education variables is maintained at ⅓, say where each variable receives a weight of 1/6, then all comparisons and robustness levels would be the same as before.

One implication of this is that the number of variables per se does not have an independent impact on a dataset's robustness. In contrast, the empirical evidence provided by Figure 5 does seem to suggest that a greater number of variables is associated with lower robustness. The evidence is particularly striking for the three EPI examples, where the aggregation of variables, and hence the decrease in the number of variables, clearly leads to increased robustness even though they use the same underlying data. Is this due to the decreased number of variables? Let us examine how EPI 6 is constructed from EPI 10. The first and fifth variables in EPI 6 are each obtained by combining three distinct variables in EPI 10 (namely, variables 1-3 and variables 7-9), while the remaining variables are unchanged.
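As a numerical check on Theorem 5 and the education example above, the following sketch replicates the middle variable of a hypothetical three-variable comparison and evaluates the robustness level under a consistent weighting vector and under equal weights of ¼. The achievement vectors and function names are hypothetical, and the robustness level is computed with the vertex shortcut for the worst-case gap, truncated at zero and assuming a strictly positive initial gap.

```python
def composite(x, w):
    """Composite index C(x; w) = w . x."""
    return sum(wd * xd for wd, xd in zip(w, x))


def robustness_level(x, y, w0):
    """Initial gap divided by (initial gap + worst-case gap); assumes x ranks above y."""
    delta_0 = composite(x, w0) - composite(y, w0)
    delta_m = max(0.0, max(yd - xd for xd, yd in zip(x, y)))
    return delta_0 / (delta_0 + delta_m)


def replicate(x, counts):
    """k_d-fold replication of each entry x_d, in order."""
    return tuple(xd for xd, k in zip(x, counts) for _ in range(k))


# Hypothetical (health, education, income) achievements for two observations.
x = (0.9, 0.8, 0.7)
y = (0.6, 0.9, 0.6)
w0 = (1 / 3, 1 / 3, 1 / 3)

counts = (1, 2, 1)  # replicate the education variable once
x4, y4 = replicate(x, counts), replicate(y, counts)

u0_consistent = (1 / 3, 1 / 6, 1 / 6, 1 / 3)  # education copies sum to 1/3
u0_equal = (0.25, 0.25, 0.25, 0.25)           # education copies sum to 1/2

r_original = robustness_level(x, y, w0)
r_consistent = robustness_level(x4, y4, u0_consistent)
r_equal = robustness_level(x4, y4, u0_equal)
```

Under the consistent weights the robustness level is unchanged at about one half, whereas equal weights of ¼ shift weight toward education, the one dimension that favors the lower-ranked observation, and the level falls to about one third.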
Weights from the initial weighting vector $u^0$ for EPI 10 are used to construct each new variable in EPI 6 as a weighted average of the source variables from EPI 10, and the weight on the new variable is the sum of the corresponding weights in $u^0$. The new $w^0$ is thus consistent with $u^0$. Now consider a ten variable replication of EPI 6 that repeats variable 1 three times and variable 5 three times and let the initial weighting vector be $u^0$. By Theorem 5, this intermediate dataset has precisely the same robustness profile and prevalence function as EPI 6. It is not the number of variables that is driving the observed decrease in robustness. Instead, its source is found in the transformation from the intermediate dataset to EPI 10, by which the perfectly correlated triplets are converted to variables that are less positively associated. The fall in robustness is due to disagreements among the new variables, rather than the higher number of variables per se. Association among variables is likely the key driver of robustness, and this is explored further in the next section.

C. Robustness and Positive Association

What factors generally lead to greater robustness? At an intuitive level, the possibility of fully robust comparisons is related to the degree of correlation or positive association among the dimensional variables. For example, if two of the achievements are perfectly negatively correlated, so that when one rises, the second falls, then it is impossible for vector dominance and hence $C^1$ to hold. On the other