Centre de Referència en Economia Analítica

Barcelona Economics Working Paper Series
Working Paper nº 69
Regions of Rationality: Maps for Bounded Agents
Robin M. Hogarth and Natalia Karelaia
October, 2005

Regions of rationality: Maps for bounded agents

Robin M. Hogarth & Natalia Karelaia
ICREA & Universitat Pompeu Fabra, Barcelona
H. E. C., Lausanne, Switzerland
October 2005

The authors are grateful for feedback received at workshops at the Center for Decision Research at the University of Chicago, Universitat Pompeu Fabra, Carnegie-Mellon University, the Haas School of Business at the University of California, Berkeley, and Toulouse University. This research was financed partially by a grant from the Spanish Ministerio de Educación y Ciencia. For correspondence, please contact Robin M. Hogarth at Universitat Pompeu Fabra, Department of Economics and Business, Ramon Trias Fargas 25-27, 08005, Barcelona, Spain. (Tel: +34 93 54 56, Fax + 34 93 54 746). Email: robin.hogarth@upf.edu, natalia.karelaia@unil.ch

Abstract

An important problem in descriptive and prescriptive research in decision making is to identify regions of rationality, i.e., the areas for which simple, heuristic models are and are not effective. To map the contours of such regions, we derive probabilities that models identify the best of m alternatives (m > 1) characterized by k attributes (k > 1). The models include single variable (lexicographic), variations of elimination-by-aspects, equal weighting, hybrids of the preceding, and models exploiting dominance. We compare all with multiple regression. We illustrate the theory with twenty simulated and four empirical datasets. Fits between predictions and realizations are excellent. However, the terrain mapped by our work is complex and no single model is best. We further provide an overview by regressing the performance of the different models on factors characterizing environments. We conclude by outlining how our work can be extended to exploring the effects of different loss functions as well as suggesting further topics for future research.

Keywords: Decision making, Bounded rationality, Lexicographic rules, Choice theory.

JEL classification: D81, M10.

In his autobiography, Herbert Simon (1991) used the metaphor of a maze to characterize a person's life.[1] In this metaphor, people are continually faced by choices involving two or more alternatives, the outcomes of which cannot be perfectly predicted from the information available prior to choosing. Extending this metaphor, the maze of choices a person faces can be thought of as a journey that crosses different regions varying in the types of questions posed. If endowed with unbounded rationality, one could simply calculate the optimal responses for all decisions. However, following Simon's insights, the bounded nature of human cognitive capacities necessarily leads to following satisficing mechanisms. Fortunately, satisficing does not imply unsatisfactory outcomes if the type of response used is appropriate to the region in which choice is exercised. But it also raises the issue of facing the consequences of inappropriate choices.

[1] As Simon (1991) points out, this metaphor also underlies his classic (1956) paper on what an organism needs to be able to choose effectively in given environments.

In this paper, we characterize the maze of choices that people face as involving different regions of rationality where success depends on identifying decision rules that are appropriate to each region. In some regions, for example, the simplest random choice rule might be sufficient (e.g., when choosing a lottery ticket). In other regions, returns to computationally demanding algorithms are potentially important (e.g., planning production in an oil refinery). What people need therefore is knowledge or maps that indicate the demand for rationality in different regions. In particular, since attention is the scarce resource (Simon, 1978), it is critical to know what and how much information should be sought to make decisions in different regions.

The purpose of this paper is to contribute to defining maps that characterize regions of rationality for common decision problems. This topic is important for both descriptive and prescriptive reasons. For the former, there is a great need to

understand the conditions under which simple, boundedly rational decision rules are and are not effective (see below). At the same time, this knowledge is critical for prescribing when people should use such rules, i.e., as decision aids.

Specifically, we consider decisions between two or more alternatives based on information that is probabilistically related to the criterion of choice. The structure of these tasks can be conceptualized as involving either multiple-cue prediction or multi-attribute choice and, as such, is common. In all cases, we construct theoretical models that predict the effectiveness in different regions of several, simple choice rules or heuristics (see below), thereby mapping the contours in which the various models are more or less successful.

Following Simon's initial insights, the interest in describing the implications of simple models of decision making has grown exponentially over the last five decades (see, e.g., Conlisk, 1996; Goldstein & Hogarth, 1997; Kahneman, 2003; Koehler & Harvey, 2004). An important source of controversy in this research has centered on the extent to which the simple rules or heuristics that people use for making decisions are effective. In particular, great interest was stimulated by research on so-called heuristics and biases (Kahneman, Slovic, & Tversky, 1982) that demonstrated how simple (or less than fully rational) processes produce outcomes that deviate from normative prescriptions. Similarly, much work demonstrated that simple, statistical decision rules have superior predictive performance relative to unaided human judgment in a wide range of tasks (see, e.g., Dawes, Faust, & Meehl, 1989; Kleinmuntz, 1990).

An alternative view is that people possess a repertoire of boundedly rational decision rules that they apply in specific circumstances (Gigerenzer & Selten, 2001). Thus, heuristics can also produce appropriate responses. Specifically, Gigerenzer and

his colleagues have demonstrated how what they call "fast and frugal" rules can rival the predictive ability of complex algorithms (Gigerenzer, Todd, & the ABC Research Group, 1999). In their terms, bounded rationality can produce ecologically rational behavior, i.e., behavior that is appropriate in its niche but does not assume an underlying optimization model. What is unclear from this work, however, is where these niches are located in the regions of rationality.

Reviewing the empirical evidence, it is clear that there are occasions when heuristic rules violate normative prescriptions as well as occasions when simple rules lead to surprisingly successful outcomes. The role of theory, therefore, is to specify the circumstances in which both kinds of results occur or, to use the metaphor of this paper, to map the regions of rationality.

Our goal is to illuminate this issue and our approach is theoretical. It involves specifying analytical models for simple processes that can be used for either multi-attribute choice or multiple-cue prediction. Specifically, we derive probabilities that these models will correctly select the best of m alternatives (m > 1) based on k attributes or cues (k > 1). We also compare their effectiveness with optimizing and naïve benchmarks. The theoretical development enables the assessment of important environmental factors such as differential cue validities, inter-correlations of attributes, whether attributes/cues are measured by continuous or binary variables, levels of error in data, and the interactions between these factors.

This paper is organized as follows. In section I, we briefly review relevant literature. Next, in section II we specify the models we examine. In section III, we consider models based on continuous variables and derive conditions for choosing the best of three alternatives using a single variable prior to generalizing the number of alternatives and the use of different models. In section IV, we derive analogous

conditions for models based on binary attributes or cues. In section V, we test our theories on twenty simulated and four empirical datasets and find excellent fits between predictions and realizations. We also provide an overview by regressing the performance of the different models on factors characterizing environments. Our results emphasize that relative model performance is a complex function of several factors and that theoretical models are needed to understand this complexity, i.e., to map the regions of rationality. For example, we identify regions where different models do and do not exhibit similar performance. At the same time, our results are consistent with some general trends that have been demonstrated previously in simulations (e.g., effects of inter-correlation among predictor variables). We also identify new regions where "less is more" (i.e., predictions are improved if less information is used). Finally, in section VI we provide concluding comments as well as suggestions for further research.

I. Evidence on the predictive effectiveness of simple models

Interest in the efficacy of simple models for decision making has existed for some time with, in particular, numerous empirical demonstrations of how models based on simple equal (or unit) weighting schemes predict as well as or more accurately than more complex algorithms such as multiple regression (see, e.g., Dawes & Corrigan, 1974; Dawes, 1979). Gigerenzer and Goldstein (1996) have further shown how a simple, non-compensatory lexicographic model that uses binary cues ("take the best" or TTB) is surprisingly accurate in predicting the better of two alternatives across several empirical datasets and outperforms the compensatory, equal weighting (EW) model (Gigerenzer et al., 1999).

Other studies have used simulation. Payne, Bettman and Johnson (1993), for example, explored tradeoffs between effort and accuracy. Using continuous variables and a weighted additive model as the criterion, they demonstrated the effects on simple model performance of two important environmental variables, dispersion in the weighting of variables and the extent to which choices involved dominance. (See also Thorngate, 1980). Based on conceptual considerations, Shanteau and Thomas (2000) defined environments as "friendly" or "unfriendly" to different models and also demonstrated these effects through simulations.

More recently, Fasolo, McClelland, and Todd (in press) examined multi-attribute choice in a simulation using continuous variables (involving options characterized by six attributes). Their goal was to assess how well choices by models with differing numbers of attributes could match total utility and, in doing so, they varied levels of average inter-correlations among the attributes and types of weighting functions. Results showed important effects for both. With differential weighting, one attribute was sufficient to capture at least 90% of total utility. With positive inter-correlation among attributes, there was little difference between equal and differential weighting. With negative inter-correlation, however, equal weighting was sensitive to the number of attributes used (the more, the better).

Despite these empirical demonstrations involving simulated and real data, research to date has generally lacked theoretical models for understanding how characteristics of models interact with those of environments. Some work has, however, considered specific cases. Einhorn and Hogarth (1975), for example, provided a theoretical rationale for the effectiveness of equal weighting relative to multiple regression. Martignon and Hoffrage (1999; 2002) and Katsikopoulos and Martignon (2003) explored the conditions under which TTB or equal weighting

should be preferred in binary choice. Hogarth and Karelaia (2004; in press) and Baucells, Carrasco, and Hogarth (2005) have examined why TTB and other simple models perform well with binary attributes in error-free environments. And, Hogarth and Karelaia (2005) provided a theoretical analysis for the special case of binary choice with continuous attributes.

II. Models considered

Whereas the essence of fully rational models involves choosing or predicting by optimally combining all relevant evidence, heuristic models are characterized by the use of limited subsets of the same information and/or simplifying combination rules (e.g., equal weighting of variables). The heuristic models we examine (see Table 1) reflect these considerations and can be classified into three categories: (A) models based on single variables or subsets of the available information; (B) equal weighting models; and (C) hybrid models that combine characteristics of the two preceding categories. In addition, we consider lower and upper benchmark models: (D) simple models that exploit dominance (see comments below); and (E) multiple regression (see also comments below).

We further examine how the type of data affects model performance by including, where possible, versions of the models based on both continuous and binary attributes/cues.[2]

[2] In our simulations and empirical work, we generate binary variables by median splits of the continuous variables and in this manner make direct comparisons between results based on binary and continuous variables.

Generally speaking, we would expect models based on continuous variables to outperform their binary counterparts. However, what is not clear a priori is the size of such differences and how these might vary under different

conditions. We indicate the use of the two kinds of data for the same models by suffixes: c for continuous, and b for binary.

Since most of the models we consider have been considered in the literature (see previous section), we limit discussion here to making a few links. First, the DEBA model (number 3) is a deterministic version of Tversky's (1972) elimination-by-aspects (EBA) model. For binary choice, this model is identical to the TTB model of Gigerenzer and Goldstein (1996). Variables used as attributes/cues for this model are binary in nature and, although the amount of information consulted by this model for each choice varies according to the characteristics of the alternatives, many decisions are based on a single attribute. In the continuous case, this is best matched by the single variable model (SVc, number 1), which is equivalent to the lexicographic model investigated by Payne et al. (1993).

Second, with binary variables as cues/attributes, the EW model predicts frequent ties between alternatives. However, rather than resolving such choices at random, we use hybrid models that exploit partial knowledge. Specifically, EW/DEBA and EW/SVb are models that, first, attempt to choose according to EW. If this results in a tie, DEBA or SVb is used as a tie-breaker (see also Hogarth & Karelaia, in press).

Third, it is illuminating to compare the performance of simple heuristics with benchmarks. For lower or naïve benchmarks, we include two models that simply exploit dominance, Domran (DR), numbers 8 and 9. (Simply stated, choose an alternative if it dominates the other(s). If not, choose at random.) As an upper or normative/sophisticated benchmark, we use multiple regression (models 10 and 11).[3]

[3] We are fully aware that multiple regression is not necessarily the optimal model for all tasks.
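The binary-cue choice rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`deba`, `ew_candidates`, `ew_deba`, `dominates`, `domran`) are hypothetical, each alternative is a tuple of 0/1 cue values ordered from most to least important cue, and all cues are assumed to be positively related to the criterion.

```python
import random

def deba(alternatives, rng=random):
    """Deterministic elimination-by-aspects on binary cue profiles.
    Cues are assumed to be ordered from most to least important."""
    remaining = list(range(len(alternatives)))
    for j in range(len(alternatives[0])):
        keep = [i for i in remaining if alternatives[i][j] == 1]
        if keep:                      # if all remaining alternatives show 0, nobody is eliminated
            remaining = keep
        if len(remaining) == 1:       # a single survivor is the choice
            return remaining[0]
    return rng.choice(remaining)      # cues exhausted: pick at random among survivors

def ew_candidates(alternatives):
    """Equal weighting: indices of the alternatives with the largest cue sum."""
    totals = [sum(a) for a in alternatives]
    best = max(totals)
    return [i for i, t in enumerate(totals) if t == best]

def ew_deba(alternatives, rng=random):
    """Hybrid: choose by EW; break any tie with DEBA among the tied alternatives."""
    tied = ew_candidates(alternatives)
    if len(tied) == 1:
        return tied[0]
    return tied[deba([alternatives[i] for i in tied], rng)]

def dominates(a, b):
    """a dominates b if it is at least as good on every cue and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def domran(alternatives, rng=random):
    """Naive benchmark: pick an alternative that dominates all others, else choose at random."""
    for i, a in enumerate(alternatives):
        if all(dominates(a, b) for j, b in enumerate(alternatives) if j != i):
            return i
    return rng.choice(range(len(alternatives)))

profiles = [(1, 0, 1), (1, 1, 0), (0, 1, 1)]   # three alternatives, three binary cues
print(deba(profiles), ew_candidates(profiles), ew_deba(profiles))
```

In this example all three alternatives tie under EW (each cue sum is 2), so the hybrid falls back on DEBA, which eliminates the third alternative on the first cue and the first alternative on the second cue.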

It is important to emphasize that the models differ in the demands they make on cognitive resources, specifically on prior knowledge and the amount of information to be processed. We therefore indicate, on the right of Table 1, differential requirements in terms of prior information, information to consult, calculations, and numbers of comparisons to be made (minimum to maximum). For example, Table 1 shows that the EW and DR models require no prior information other than the signs of the zero-order correlations between the cues and the criterion (this is a minimum requirement). On the other hand, the lexicographic, DEBA, and hybrid models need to know which cue(s) is (are) most important. Against this, the lexicographic and DEBA models do not necessarily use all cues and require no calculations. The cost of DR models lies mainly in the number of comparisons that have to be made.

------------------------------------------------
Insert Table 1 about here
------------------------------------------------

In this paper, we concentrate on accuracy or the probabilities that models make appropriate choices/predictions. However, and as demonstrated by Payne et al. (1993), it is important to bear in mind that heuristic models differ in their information processing costs.

Our goal is to develop theoretical models that predict model performance across different environments. However, based on the characteristics of the models, two hypotheses can be suggested. First (as noted above), we would expect models based on continuous variables to outperform their binary counterparts. Second, models that resolve ties of other models would be expected to be more accurate than the latter. Hence DEBA should be more accurate than SVb, and EW/DEBA and

EW/SVb more accurate than EWb. However, whether DEBA is more accurate than SVc will depend on environmental characteristics.

A priori, three types of environmental variables can be expected to affect absolute and relative model performance. These are, first, the distribution of true cue validities[4] (i.e., how the environment weights different variables, cf., Payne et al., 1993); second, the level of redundancy or inter-correlation among the cues; and third, the level of noise in the environment (i.e., its inherent predictability). Of these factors, increasing noise will undoubtedly decrease performance of all models and, by extension, differences between the models. However, apart from this main effect, it is difficult to intuit how all other factors will combine to determine absolute and relative model performances. To achieve this, we need to develop appropriate theory for each of our models.

[4] The cue validity for a particular cue/attribute is defined by its correlation with the criterion.

III. Models with continuous variables

Choosing the best using a single variable (SVc). For expository reasons, we consider first the case of selecting the best of three alternatives using a single variable (SVc). Specifically, imagine choosing from a distribution characterized by two correlated random variables, one of which is a criterion, $Y$, and the other an attribute, $X$. Furthermore, assume that alternative A is preferred over alternatives B and C if $y_a > y_b$ and $y_a > y_c$.[5]

[5] We denote random variables by upper case letters, e.g., $Y$ and $X$, and specific values or realizations by lower case letters, e.g., $y$ and $x$. As an exception, we use lower case Greek letters to denote random error variables, e.g., $\varepsilon$.

Now, imagine that the only information about A, B, and C are the values that they exhibit on the attribute, $X$. Denote these specific values by $x_a$, $x_b$, and $x_c$, respectively. Without loss of generality, assume that $x_a > x_b$ and $x_a > x_c$ and that the decision rule is to choose the alternative with the largest value of $X$, i.e., in

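this case A. The probability that A is in fact the correct choice can therefore be characterized by the joint probability that $Y_a > Y_b$ given that $x_a > x_b$ and $Y_a > Y_c$ conditioned on $x_a > x_c$, in other words,

$$P\{(Y_a > Y_b \mid X_a = x_a > X_b = x_b) \cap (Y_a > Y_c \mid X_a = x_a > X_c = x_c)\}.$$

To determine this probability, assume that $Y$ and $X$ are both standardized normal variables, i.e., both are $N(0,1)$. Moreover, the two variables are positively correlated (if they are negatively correlated, simply multiply one by $-1$). Denote the correlation by the parameter $\rho_{yx}$ ($\rho_{yx} > 0$). Given these facts, it is possible to represent $Y_a$, $Y_b$, and $Y_c$ by the equations:

$$Y_a = \rho_{yx} X_a + \varepsilon_a \qquad (1)$$

$$Y_b = \rho_{yx} X_b + \varepsilon_b \qquad (2)$$

and

$$Y_c = \rho_{yx} X_c + \varepsilon_c \qquad (3)$$

where $\varepsilon_a$, $\varepsilon_b$, and $\varepsilon_c$ are normally distributed error terms, each with mean of 0 and variance of $(1 - \rho_{yx}^2)$, independent of each other and of $X_a$, $X_b$, and $X_c$.

Using equations (1), (2), and (3), the differences between $Y_a$ and $Y_b$, on the one hand, and $Y_a$ and $Y_c$, on the other, can be written as

$$Y_a - Y_b = \rho_{yx}(X_a - X_b) + (\varepsilon_a - \varepsilon_b) \qquad (4)$$

and

$$Y_a - Y_c = \rho_{yx}(X_a - X_c) + (\varepsilon_a - \varepsilon_c) \qquad (5)$$

Thus, $Y_a > Y_b$ and $Y_a > Y_c$ if

$$\rho_{yx}(X_a - X_b) > \varepsilon_b - \varepsilon_a \qquad (6)$$

and

$$\rho_{yx}(X_a - X_c) > \varepsilon_c - \varepsilon_a \qquad (7)$$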

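$P\{(Y_a > Y_b \mid X_a = x_a > X_b = x_b) \cap (Y_a > Y_c \mid X_a = x_a > X_c = x_c)\}$ can now be reframed as the probability that both the right hand side of (6) is smaller than $\rho_{yx}(X_a - X_b)$ and the right hand side of (7) is smaller than $\rho_{yx}(X_a - X_c)$. As can be seen, these latter terms are the products of $\rho_{yx}$, the correlation between $Y$ and $X$, and the differences between $X_a$ and $X_b$, and $X_a$ and $X_c$. In other words, the larger the correlation between $Y$ and $X$, and the larger the differences between $X_a$ and $X_b$, and $X_a$ and $X_c$, the greater

$$P\{(Y_a > Y_b \mid X_a = x_a > X_b = x_b) \cap (Y_a > Y_c \mid X_a = x_a > X_c = x_c)\} = P\{(\varepsilon_b - \varepsilon_a < \rho_{yx}(x_a - x_b)) \cap (\varepsilon_c - \varepsilon_a < \rho_{yx}(x_a - x_c))\} \qquad (8)$$

To determine this probability, we make use of the facts that the differences between the error terms, $(\varepsilon_b - \varepsilon_a)$ and $(\varepsilon_c - \varepsilon_a)$, are both normally distributed with means of 0 and variances of $2(1 - \rho_{yx}^2)$. Standardizing $(\varepsilon_b - \varepsilon_a)$ and $(\varepsilon_c - \varepsilon_a)$, we can re-express equation (8) as

$$P\{(\varepsilon_b - \varepsilon_a < \rho_{yx}(x_a - x_b)) \cap (\varepsilon_c - \varepsilon_a < \rho_{yx}(x_a - x_c))\} = P\left\{\left(z_1 < \frac{\rho_{yx}(x_a - x_b)}{\sqrt{2(1 - \rho_{yx}^2)}}\right) \cap \left(z_2 < \frac{\rho_{yx}(x_a - x_c)}{\sqrt{2(1 - \rho_{yx}^2)}}\right)\right\} \qquad (9)$$

where $z_1$ and $z_2$ are standardized normal variables with means of 0 and variances of 1. Moreover, $z_1$ and $z_2$ jointly follow a bivariate normal distribution. Therefore, the target probability (9) can be written as

$$\int_{-\infty}^{l_1} \int_{-\infty}^{l_2} \frac{1}{2\pi \sigma_{z_1} \sigma_{z_2} \sqrt{1 - \rho_{z_1,z_2}^2}} \exp\left(-\frac{z}{2(1 - \rho_{z_1,z_2}^2)}\right) dz_1 \, dz_2 \qquad (10)$$

where

$$z = \frac{z_1^2}{\sigma_{z_1}^2} - \frac{2\rho_{z_1,z_2}\, z_1 z_2}{\sigma_{z_1}\sigma_{z_2}} + \frac{z_2^2}{\sigma_{z_2}^2}; \quad l_1 = \frac{\rho_{yx}(x_a - x_b)}{\sqrt{2(1 - \rho_{yx}^2)}}; \quad l_2 = \frac{\rho_{yx}(x_a - x_c)}{\sqrt{2(1 - \rho_{yx}^2)}}; \quad \sigma_{z_1} = \sigma_{z_2} = 1; \quad \text{and} \quad \rho_{z_1,z_2} = \sigma_{z_1,z_2}.$$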
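As a numerical illustration of the probability in (9) and (10), the following Python sketch evaluates the bivariate normal integral with Simpson's rule and cross-checks it by simulating equations (1) to (3) directly. It is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, and the correlation between the two standardized error differences is taken to be 1/2, which follows because $(\varepsilon_b - \varepsilon_a)$ and $(\varepsilon_c - \varepsilon_a)$ share the term $\varepsilon_a$, so their covariance is $(1 - \rho_{yx}^2)$ against a variance of $2(1 - \rho_{yx}^2)$.

```python
import math
import random

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def bvn_cdf(l1, l2, r=0.5, n=4000, lower=-8.0):
    """P(z1 < l1, z2 < l2) for a standard bivariate normal with correlation r.
    Integrates phi(t) * P(z2 < l2 | z1 = t) over t with Simpson's rule (n even)."""
    if l1 <= lower:
        return 0.0
    s = math.sqrt(1.0 - r * r)
    f = lambda t: phi(t) * Phi((l2 - r * t) / s)
    h = (l1 - lower) / n
    total = f(lower) + f(l1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lower + i * h)
    return total * h / 3.0

def sv_prob_correct(rho, xa, xb, xc):
    """Probability that A (largest x) also has the largest y, per equations (9)-(10)."""
    scale = rho / math.sqrt(2.0 * (1.0 - rho ** 2))
    return bvn_cdf(scale * (xa - xb), scale * (xa - xc))

def sv_prob_mc(rho, xa, xb, xc, trials=100000, seed=0):
    """Monte Carlo check: draw the errors of equations (1)-(3) and count successes."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho ** 2)
    hits = 0
    for _ in range(trials):
        ya = rho * xa + rng.gauss(0.0, sd)
        yb = rho * xb + rng.gauss(0.0, sd)
        yc = rho * xc + rng.gauss(0.0, sd)
        hits += (ya > yb) and (ya > yc)
    return hits / trials

print(sv_prob_correct(0.5, 1.0, 0.0, 0.0), sv_prob_mc(0.5, 1.0, 0.0, 0.0))
```

A useful sanity check is that when all three attribute values coincide the function returns 1/3, the chance level for three alternatives.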

In Appendix A, we show that $\rho_{z_1,z_2} = \tfrac{1}{2}$. Thus, we can write

$$P\{(Y_a > Y_b \mid X_a = x_a > X_b = x_b) \cap (Y_a > Y_c \mid X_a = x_a > X_c = x_c)\} = \int_{-\infty}^{l_1} \int_{-\infty}^{l_2} \frac{1}{\sqrt{3}\,\pi} \, e^{-\frac{2}{3}(z_1^2 - z_1 z_2 + z_2^2)} \, dz_1 \, dz_2 \qquad (11)$$

Figure 1 illustrates the probabilities of SVc correctly choosing the best of three alternatives for different values of $(X_a - X_b)$ and $(X_a - X_c)$. In the panel on the left, (a), $(x_a - x_b)$ is held constant at a low value of 0.3; on the right, (b), it is held constant at a high value, .0. The lines in the figures reflect the effects of combining these fixed levels with different values of $(x_a - x_c)$, from 0. to 3.0. As can be observed, if one of the two differences, $(x_a - x_b)$ or $(x_a - x_c)$, is small, the probability of correct choice varies between 0.4 and 0.5. However, as both grow larger, so do the corresponding probabilities.

----------------------------------------------
Insert Figure 1 about here
----------------------------------------------

To generalize the above, assume that there are m (m > 3) alternatives from which to choose and that each has a specific $X$ value, $x_l$, $l = 1, \ldots, m$. Without loss of generality, assume that $x_1$ has the largest value and we wish to know the probability that the corresponding alternative has the largest value on the criterion. Generalizing from the above, this probability can be calculated using properties of the multivariate normal distribution and, in this case, can be written,

$$\int_{-\infty}^{d_1^*} \cdots \int_{-\infty}^{d_{m-1}^*} \varphi(\mathbf{z} \mid \boldsymbol{\mu}_z, V_z) \, dz_1 \cdots dz_{m-1} = \int_{-\infty}^{d_1^*} \cdots \int_{-\infty}^{d_{m-1}^*} \frac{1}{(2\pi)^{(m-1)/2} \, |V_z|^{1/2}} \, e^{-\frac{1}{2} \mathbf{z}' V_z^{-1} \mathbf{z}} \, dz_1 \cdots dz_{m-1} \qquad (12)$$

where $d_i^* = \dfrac{\rho_{yx}\, d_i}{\sqrt{2(1 - \rho_{yx}^2)}}$ for $i = 1, \ldots, m-1$, the elements of $\mathbf{z}' = (z_1, z_2, \ldots, z_{m-1})$ are jointly distributed normal variables, with means of zero and variances of one, and $V_z^{-1}$ is the inverse of the $(m-1) \times (m-1)$ variance-covariance matrix where each diagonal element is equal to 1 and all off-diagonal elements equal ½ (see Appendix A). In Appendix B we derive the analytical expression for the probability of selecting the optimal choice among four alternatives by using just one variable.

For binary choice, that is, when m = 2, analogous derivations lead to similar expressions to those shown above (see Hogarth & Karelaia, 2005).

Overall probabilities. The probabilities given above are those associated with particular observations, i.e., that A is larger than B and C given that a specific value, $x_a$, exceeds specific values $x_b$ and $x_c$. However, it is also instructive to consider the overall expected accuracy of SVc, i.e., the overall probability that SVc makes the correct choice when sampling at random from the population of alternatives. Overall, SVc can make successful choices in three ways: selecting A when $x_a$ is bigger than $x_b$ and $x_c$; selecting B when $x_b$ is bigger than $x_a$ and $x_c$; and selecting C when $x_c$ is bigger than $x_a$ and $x_b$. The probabilities of these events are,

$$P\{((X_a > X_b) \cap (X_a > X_c)) \cap ((Y_a > Y_b) \cap (Y_a > Y_c))\},$$

$$P\{((X_b > X_a) \cap (X_b > X_c)) \cap ((Y_b > Y_a) \cap (Y_b > Y_c))\},$$

and

$$P\{((X_c > X_a) \cap (X_c > X_b)) \cap ((Y_c > Y_a) \cap (Y_c > Y_b))\}$$

respectively, and the overall probability is the sum of the three terms. However, since each of the terms is equal to the others, the sum can be re-expressed as

$$3\,P\{((X_a > X_b) \cap (X_a > X_c)) \cap ((Y_a > Y_b) \cap (Y_a > Y_c))\}.$$
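The overall expected accuracy of SV can also be approximated by simulation, which gives a quick check on the symmetry argument above. The sketch below is illustrative only: the function name is hypothetical, alternatives are assumed to be drawn independently from the standardized bivariate normal population of equations (1) to (3), and the closed form used for the two-alternative check, $\tfrac{1}{2} + \arcsin(\rho_{yx})/\pi$, is a standard result for bivariate normal pairs rather than a formula taken from this paper.

```python
import math
import random

def overall_accuracy_sv(rho, m=3, trials=100000, seed=1):
    """Share of random draws of m alternatives in which the alternative
    with the largest x also has the largest y."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho ** 2)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(m)]
        ys = [rho * x + rng.gauss(0.0, sd) for x in xs]
        chosen = max(range(m), key=xs.__getitem__)   # the SV choice
        best = max(range(m), key=ys.__getitem__)     # the truly best alternative
        hits += chosen == best
    return hits / trials

acc2 = overall_accuracy_sv(0.5, m=2)
acc3 = overall_accuracy_sv(0.5, m=3)
print(acc2, 0.5 + math.asin(0.5) / math.pi)  # simulated vs. closed form for m = 2
print(acc3)                                   # best of three is harder than best of two
```

Accuracy falls as m grows because the chosen alternative must beat more competitors on the criterion, but it stays above the chance level of 1/m whenever the correlation is positive.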

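To derive analytically the overall probability of correct choice by SVc when sampling at random from the underlying population of alternatives, the latter expression should be integrated across all possible values that can be taken by $D_1 = X_a - X_b > 0$ and $D_2 = X_a - X_c > 0$. That is

$$3 \int_0^{\infty} \int_0^{\infty} \varphi(\mathbf{d} \mid \boldsymbol{\mu}_d, V_d) \left[ \int_{-\infty}^{d_1^*} \int_{-\infty}^{d_2^*} \varphi(\mathbf{z} \mid \boldsymbol{\mu}_z, V_z) \, dz_1 \, dz_2 \right] dd_1 \, dd_2 \qquad (13)$$

where $\mathbf{z}' = (z_1, z_2)$, $V_z = \begin{pmatrix} 1 & 1/2 \\ 1/2 & 1 \end{pmatrix}$, $\mathbf{d}' = (d_1, d_2)$, $V_d = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, $d_1^* = \dfrac{\rho_{yx}\, d_1}{\sqrt{2(1 - \rho_{yx}^2)}}$, and $d_2^* = \dfrac{\rho_{yx}\, d_2}{\sqrt{2(1 - \rho_{yx}^2)}}$. (In Table 3, discussed below, we generalize these formulas for choosing one of m alternatives.)

Equal weighting (EW) and multiple regression (MR). What are the predictive accuracies of models that make use of several, k, cues or variables, k > 1? We consider two models that have often been used in the literature. One is equal weighting (EW; see Dawes & Corrigan, 1974; Einhorn & Hogarth, 1975). The other is multiple regression (MR). To analyze these models, assume that the criterion variable, $Y$, can be expressed as a function

$$Y = f(X_1, X_2, \ldots, X_k) \qquad (14)$$

where the k predictor variables are multivariate normal, each with mean of 0 and standard deviation of 1. For EW, the predicted $Y$ value associated with any vector of observed $x$'s is equal to $\frac{1}{k}\sum_{j=1}^{k} x_j$ or $\bar{x}$. Similarly, the analogous prediction in MR is given by $\sum_{j=1}^{k} b_j x_j$ or $\hat{y}$, where the $b_j$'s are estimated regression coefficients. In using these models, therefore, the decision rules are to choose according to the largest $\bar{x}$ for EW and the largest $\hat{y}$ value for MR.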
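To make the two decision rules concrete, the short Python sketch below scores three alternatives by the unweighted mean (EW) and by a weighted sum (MR) and picks the largest in each case. The cue values and the regression coefficients are made up for illustration and the function names are hypothetical; in practice the coefficients would be estimated from a fitting sample.

```python
def ew_score(x):
    """Equal weighting: the mean of the cue values."""
    return sum(x) / len(x)

def mr_score(x, b):
    """Multiple regression prediction: the sum of b_j * x_j."""
    return sum(bj * xj for bj, xj in zip(b, x))

def choose(alternatives, score):
    """Index of the alternative with the largest score."""
    return max(range(len(alternatives)), key=lambda i: score(alternatives[i]))

# Hypothetical standardized cue values for alternatives A, B, and C.
alternatives = [(2.0, -1.0, 0.0), (0.5, 0.5, 0.5), (-0.5, 1.0, 0.2)]
# Hypothetical regression coefficients (assumed, not estimated here).
b = (0.6, 0.1, 0.1)

ew_choice = choose(alternatives, ew_score)
mr_choice = choose(alternatives, lambda x: mr_score(x, b))
print(ew_choice, mr_choice)
```

With these numbers the two rules disagree: EW favors the second alternative, whose cues are uniformly moderate, whereas MR favors the first because it weights the first cue heavily.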

How likely are EW and MR to make the correct choice? Following the same rationale as the single variable (SV) case, we show in Table 2 the formulas used in deriving the analogous probabilities for EW and MR (as well as SV) when choosing the best of three alternatives using the properties of the bivariate normal distribution. These are the initial equations (corresponding to equations 1, 2, and 3), the accuracy conditions (corresponding to equations 6 and 7), the relevant error variances, and finally the upper limits of integration, i.e. $l_1$ and $l_2$, used to calculate probabilities when applying equation (11).

----------------------------------------------
Insert Tables 2 & 3 about here
----------------------------------------------

Similarly, when calculating the probabilities of choosing correctly between four or more alternatives for EW and MR, we can apply the multivariate normal distribution in analogous fashion to that of SV (cf. equation 12 above and Appendix B).

In Table 3, we present the formulas for the overall expected accuracy of EW and MR in a given environment or population, analogous to those for SV, i.e. to equation (13), for choosing one of m alternatives. In particular, we present the elements that are specific for different models, such as the variance-covariance matrix, $V_d$, and the upper limits of integration, $d_i^*$.

IV. Models with binary variables

To discuss expected predictive performance of models based on binary variables, we first assume that the dependent variable, $Y$, can be thought of as being generated by a linear model of the form

$$Y = a + \sum_{j=1}^{k} \gamma_j W_j + \varsigma \qquad (15)$$

where $a$ is the intercept, $W_j = 0, 1$ are the binary variables ($j = 1, \ldots, k$), the $\gamma_j$ are weighting parameters, and $\varsigma$ is a normally distributed error term (see also below).

To derive theoretical predictions for models using binary variables, we adopt a similar approach to that used with continuous variables. We therefore focus on issues that differ between the continuous and binary cases.

Choosing the best using a single binary variable (SVb). Assuming that $w_a > w_b$ and $w_a > w_c$, the probability that SVb chooses correctly between three alternatives, A, B, and C is

$$P\{(Y_a > Y_b \mid W_a = w_a > W_b = w_b) \cap (Y_a > Y_c \mid W_a = w_a > W_c = w_c)\}.$$

To determine this probability, recall that $Y$ is a standardized normal variable $N(0,1)$. The binary variable, $W$, however, only takes values of 0 and 1 and thus has a mean of 0.5 and a standard deviation, $\sigma_w$, of 0.5.[6] Denoting the correlation between $Y$ and $W$ by $\rho_{yw}$ ($\rho_{yw} > 0$), we can express $Y$ by

$$Y = a_{SV} + \frac{\rho_{yw}}{\sigma_w} W + \varsigma$$

or, simply,

$$Y = a_{SV} + 2\rho_{yw} W + \varsigma \qquad (16)$$

where $\varsigma$ is a normally distributed error term $N(0, 1 - \rho_{yw}^2)$.[7]

[6] Recall that binary variables are created by median splits of continuous variables.

[7] Since $E(Y) = 0$, it follows that the intercept $a_{SV} = -\rho_{yw}$.

Proceeding in similar fashion to the continuous case, we obtain the expression for the probability of SVb predicting correctly:

$$\int_{-\infty}^{h_1} \int_{-\infty}^{h_2} \frac{1}{\sqrt{3}\,\pi} \, e^{-\frac{2}{3}(z_1^2 - z_1 z_2 + z_2^2)} \, dz_1 \, dz_2 \qquad (17)$$

where

$$h_1 = \frac{2\rho_{yw}(w_a - w_b)}{\sqrt{2(1 - \rho_{yw}^2)}}; \quad h_2 = \frac{2\rho_{yw}(w_a - w_c)}{\sqrt{2(1 - \rho_{yw}^2)}}.$$

Since both $(w_a - w_b)$ and $(w_a - w_c)$ are equal to one, the two upper integration limits are the same:

$$h_1 = h_2 = \frac{2\rho_{yw}}{\sqrt{2(1 - \rho_{yw}^2)}}.$$

As can be seen, the only difference between the theoretical expressions for the continuous and binary cases lies in the formulas for the upper limits of integration. Therefore, generalizing the above for choices among m (m > 3) alternatives is analogous to that for the continuous case.

Following the same rationale, we can derive the formulas for the probabilities for EW and MR when choosing the best of three alternatives using binary variables. In Table 4, we present the initial equations for these models (corresponding to equation 16), and the upper limits of integration, i.e. $h_1$ and $h_2$, used to calculate probabilities when applying equation (17).

-------------------------------------------------
Insert Table 4 about here
-------------------------------------------------

Choosing the best using DEBA with binary cues. Recall that this multi-stage model works in the following way. At the first stage, alternatives with values of 0 for the most important cue are eliminated unless all alternatives exhibit 0. If only one alternative has a value of 1, it is selected and the process terminates. If, however, more than one alternative remains, the same procedure takes place with the remaining alternatives except that the second most important cue is used. The process continues in the same manner through subsequent stages, if necessary. It stops when either only one alternative remains (i.e., the chosen alternative) or, if there is more than one

alternative but no more cues, choice is determined at random among the remaining alternatives.

The probability that a given alternative was chosen correctly by DEBA is the probability that the sequence of decisions (or eliminations) made by the model at each stage is correct. Thus, since at each stage of the model decisions are made conditional on the preceding stages, the key parameters in estimating these probabilities are the partial correlations between $Y$ and $W_j$, $j = 1, \ldots, k$ (i.e., controlling for previous stages). For the first stage, this is $\rho_{yw_1}$, for the second $\rho_{yw_2.w_1}$, for the third, $\rho_{yw_3.w_1w_2}$, and so on.[8]

[8] For example, $\rho_{yw_2.w_1} = \dfrac{\rho_{yw_2} - \rho_{yw_1}\rho_{w_1w_2}}{\sqrt{(1 - \rho_{yw_1}^2)(1 - \rho_{w_1w_2}^2)}}$.

For example, assume that there are three alternatives A, B, and C and that A has been chosen by a process whereby C was eliminated at the first stage and B at the third stage. Starting backwards, consider the decisions the model makes at each stage. That is, the probability that DEBA correctly selected A over B at the third stage, controlling for the elimination of C at the first stage, is

$$P\{(Y_a > Y_b \mid W_{3a} = w_{3a} > W_{3b} = w_{3b}) \cap (Y_a > Y_c \mid W_{1a} = w_{1a} > W_{1c} = w_{1c})\}.$$

This probability can be calculated by making use of the appropriate partial correlations (in this case, $\rho_{yw_1}$ and $\rho_{yw_3.w_1w_2}$) and adapting the single variable equations (e.g., the general equation 10[9]). At the second stage, the model makes no decision. At the first stage, it eliminates C so we need to calculate additionally the probability that A could have been correctly selected only with information available at this stage:

$$P\{(Y_a > Y_b \mid W_{1a} = w_{1a} > W_{1b} = w_{1b}) \cap (Y_a > Y_c \mid W_{1a} = w_{1a} > W_{1c} = w_{1c})\}.$$

This can also be found through an adapted expression (10), using $\rho_{yw_1}$.

[9] The terms that need to be adapted in the expression (10) are the upper limits of integration and $\sigma_{z_1,z_2}$.

Importantly, the events

represented by the probability expressions for the first and third stages are disjunctive. Therefore, the probability that DEBA makes the correct decision in this case is equal to the sum of the two expressions.

Consider another example involving three alternatives A, B, and C. Assume that DEBA eliminates C at the first stage and at the third stage picks either A or B at random (this will happen if A and B are identical). Thus, the 0.5 probability that DEBA makes the correct decision at the third stage should be discounted by the probability that C, eliminated at the first stage, is not better than A and B. That is

$$\left( P\{(Y_a > Y_c \mid W_{1a} = w_{1a} > W_{1c} = w_{1c}) \cap (Y_b > Y_c \mid W_{1b} = w_{1b} > W_{1c} = w_{1c})\} \right) 0.5$$

More generally, the probability of DEBA making the correct choice has to be calculated on a case-by-case basis taking into account, at each stage, the probability that the selected alternative should be chosen over the alternative(s) eliminated at that stage using the partial correlation of the cue appropriate to the stage. Moreover, the probability for each case includes the probabilities of successful decisions at each stage. If at the final stage, there are two or more alternatives, the appropriate random probability is adjusted by the probability that correct decisions were taken at previous stages (see, e.g., the example above).[10]

[10] In this section, we have only indicated the general strategy for calculating relevant probabilities for DEBA. The details involve repeated applications of the same probability theory principles applied in many different situations (Karelaia & Hogarth, in preparation).

Choosing the best using the EW-SV model with binary cues (EW-SVb). The first stage of this model uses EW. If a single alternative is chosen, the probability of it being correct is found by applying the formula for EW. If two or more alternatives are tied, a second stage consists of selecting the alternative favored by the first cue. To calculate the probability that this is correct, one needs to calculate the joint probability that the selected alternative (a) is larger than the alternatives eliminated at

the first stage, and (b) larger than the other alternatives considered at the second stage. (To calculate these probabilities, use is made of the appropriate analogs to equation 17). Any ties remaining after stage two are resolved at random with a corresponding adjustment being made to the probability calculations.

Choosing the best using the EW-DEBA model with binary cues (EW-DEBA). This model starts as EW. If EW chooses one alternative, the probability of correct choice of the model coincides with that for EW. If two or more alternatives are tied, the DEBA model is used to choose between the remaining alternatives and probabilities are calculated accordingly (see above).

V. Empirical evidence

Our equations provide exact theoretical probabilities for assessing performance of the different models in specified conditions, i.e., to map the contours of the regions of rationality. However, several factors affect absolute and relative performance levels of the models (e.g., cue validities, inter-correlation among variables, continuous vs. binary variables, error), and it is difficult to assess their importance simply by inspecting the formulas. We therefore use both simulated and empirical data to illuminate model performance under different conditions. Real data have the advantage of testing the theory in specific, albeit limited environments. Simulated data, on the other hand, facilitate testing model predictions over a wide range of environments. We first consider the simulated data.

Simulation design and method. The simulation design used for choosing the best from two, three and four alternatives is presented in Table 5. Overall, we specified 20 different populations that are subdivided into four sets or cases A, B, C,

nd D eh of whih ontins five su-ses (leled,, 3, 4, nd 5). Sine, priori, severl ftors might e thought importnt, we would hve liked to vry these orthogonlly. However, orreltions etween ftors restrit implementing fully systemti design. We therefore vried some ftors t the level of the ses (A, B, C, nd D) nd others ross su-ses (i.e., within A, B, C, nd D). At the level of ses, A nd B involved three ues or ttriutes wheres ses C nd D involved five. Cses A nd C hd little or no inter-ue orreltion; ses B nd D hd moderte to high interorreltion. --------------------------------------------- Insert Tle 5 out here --------------------------------------------- Aross su-ses (i.e., from through 5 within eh of A, B, C, nd D), we vried: () the vriility of ue vlidities (mximum less minimum); () the vlidity of the first (i.e., most importnt) ue; (3) verge vlidity; nd (4) the orreltion etween y nd x. For ll, vlues inrese from the su-ses through 5. As onsequene, the R on initil fit for MR lso inreses ross su-ses. This implies tht the su-ses involve high levels of error wheres the su-ses 5 re, in priniple, quite preditle environments. Su-ses, 3, nd 4 fll etween these extremes. To ondut the simultion, we defined 0 sets of stndrdized multivrite norml distriutions with the prmeters speified in Tle 5 nd generted smples of size 40 from eh of these popultions. The oservtions in eh smple were split t rndom on 50/50 sis into fitting nd predition su-smples nd model prmeters were estimted on the fitting su-smple. Two, three or four lterntives (s pproprite) were then drwn t rndom from this su-smple nd, using the estimted model prmeters, proilities of orretly seleting the est of these 3

speifi lterntives were lulted. This ws then ompred to wht tully hppened, tht is, on fitting sis. Next, lterntives were drwn t rndom from the predition su-smple, relevnt proilities lulted using the prmeters from the fitting su-smple, nd preditions ompred to reliztions. This exerise ws repeted 5,000 times (for eh of the hoies involving two, three, nd four lterntives). The ove desries the proedure used for ontinuous dt. For models using inry dt, we followed extly the sme proedures exept tht preditor vriles only took vlues of 0 or. Speifilly, sine we were smpling ontinuous normlized vriles, we reted inry vriles y medin splits (i.e., inry vriles were set to 0 for negtive vlues of ontinuous vriles nd for nonnegtive vlues). Thus, if one estimtes the prmeters of the 0 popultions for the inry dt, the estimtes differ systemtilly from their ontinuous ounterprts shown in Tle 5 (they re smller). However, we do not disply the inry prmeter estimtes sine the prmeters in Tle 5 represent the proess tht generted the dt. Simultion results. Tles 6, 7, nd 8 present the results of the simultions for the hoie of est of two, three, nd four lterntives, respetively. Figure presents some seleted outomes from the se involving est of three (Tle 7). Results reported here re limited to preditions nd reliztions for the holdout smples (i.e., tests of ross-vlidtion). A possile ritiism of our preditive tests of the single vrile models (SV, SV, nd DEBA) is tht we did not use the smpling proess to determine the most importnt vrile (for SV nd SV) nor the rnk orders of the ue vlidities (for DEBA). Insted, we endowed the models with the pproprite knowledge. However, in susequent simultions we hve found tht with smple sizes of 0 (s here) the net effet of filing to identify the most importnt vrile is quite smll. Similrly, s long s DEBA orretly identifies the most importnt vrile, net differenes re lso smll (Hogrth & Kreli, in press). 
As might be expected by examining the success of the models on cross-validation, the fitting exercise produced almost perfect matches between samples and models.
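The simulation procedure described above (sampling from a standardized multivariate normal population, a 50/50 split into fitting and prediction sub-samples, and binary cues created by splitting at zero) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the correlation matrix is a hypothetical stand-in for one row of Table 5, a fresh sample is drawn on every repetition (the text does not say whether samples were redrawn), and only realized hit rates on the holdout half are computed, not the theoretical predicted probabilities. The function names `pick_max`, `deba`, and `simulate` are invented for this sketch; "SV" and "EW" operate on continuous cues, while "SV_binary" and "DEBA" operate on the binary versions.

```python
import numpy as np


def pick_max(scores, rng):
    """Return the index of the largest score; ties are resolved at random."""
    winners = np.flatnonzero(scores == scores.max())
    return rng.choice(winners)


def deba(xb, rng):
    """Deterministic elimination by aspects on binary cues ordered by validity.

    At each stage, alternatives lacking the current cue are eliminated if at
    least one remaining alternative has it; any final tie is broken at random.
    """
    remaining = np.arange(xb.shape[0])
    for j in range(xb.shape[1]):
        col = xb[remaining, j]
        remaining = remaining[col == col.max()]
        if remaining.size == 1:
            break
    return rng.choice(remaining)


def simulate(corr, m=3, n=40, reps=2000, seed=0):
    """Estimate holdout hit rates for choosing the best of m alternatives.

    corr: correlation matrix with the criterion y first, followed by k cues
    ordered from most to least valid (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    k = corr.shape[0] - 1
    hits = dict.fromkeys(["SV", "SV_binary", "EW", "DEBA", "MR"], 0)
    for _ in range(reps):
        sample = rng.multivariate_normal(np.zeros(k + 1), corr, size=n)
        order = rng.permutation(n)
        fit, hold = sample[order[: n // 2]], sample[order[n // 2:]]
        # Regression weights are estimated on the fitting half only.
        design = np.column_stack([np.ones(len(fit)), fit[:, 1:]])
        beta, *_ = np.linalg.lstsq(design, fit[:, 0], rcond=None)
        # Draw m alternatives from the prediction (holdout) half.
        alts = hold[rng.choice(len(hold), size=m, replace=False)]
        y, x = alts[:, 0], alts[:, 1:]
        xb = (x >= 0).astype(int)  # 0 for negative, 1 for nonnegative values
        choice = {
            "SV": pick_max(x[:, 0], rng),
            "SV_binary": pick_max(xb[:, 0], rng),
            "EW": pick_max(x.sum(axis=1), rng),
            "DEBA": deba(xb, rng),
            "MR": pick_max(beta[0] + x @ beta[1:], rng),
        }
        best = np.argmax(y)
        for name, idx in choice.items():
            hits[name] += int(idx == best)
    return {name: h / reps for name, h in hits.items()}


# Hypothetical population: three cues with validities .5, .4, .3 and
# inter-cue correlations of .1 (illustrative values, not taken from Table 5).
corr = np.array([
    [1.0, 0.5, 0.4, 0.3],
    [0.5, 1.0, 0.1, 0.1],
    [0.4, 0.1, 1.0, 0.1],
    [0.3, 0.1, 0.1, 1.0],
])
rates = simulate(corr)
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

Changing `m` to 2 or 4, or raising the off-diagonal cue correlations, gives a rough feel for how the number of alternatives and redundancy shift the realized hit rates.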

To simplify reading the tables, note that realizations are underlined (e.g., 6), and that the largest realization for each population (i.e., per column) is presented in bold (e.g., 65). In addition, when MR is the largest, we also denote the second largest in bold. We further show mean realizations for each column and row in the tables. The column means thus represent the average realizations of all models within specific populations whereas the row means characterize average model performance across populations.

We first note that, with the exception of multiple regression (MR), the match between model predictions and realizations is quite close in all cases. MR makes large errors for cases C and D and particularly when the differences between maximum and minimum cue validities are smallest (sub-cases 1 and 2). Although we used adjusted R² in making predictions, the adjustment was insufficient in these situations.¹³

¹³ One can also argue with some justification that the ratio of observations to predictor variables is too small to use multiple regression (particularly for cases C and D). However, we are particularly interested in observing how well the different models work in environments where there are not many observations.

-----------------------------------------------------------
Insert Tables 6, 7, 8 and Figure about here
-----------------------------------------------------------

Qualitatively, the relative effectiveness of the models is quite similar whether one looks at the results for best of two, three, or four. What changes, of course, is the general level of performance which diminishes as the number of alternatives increases. This can be seen by comparing the columns of mean realizations of Tables 6, 7, and 8, i.e., at the extreme right hand sides of the tables. Within each table, an initial, overall impression is the lack of large differences between the performances of the different models. However, there are systematic effects.

First, consider whether predictor variables are binary or continuous. The use of binary as opposed to continuous variables implies a loss of information. As such, we expected that models based on continuous variables would predict better than their binary counterparts. Indeed, this is always the case in three direct comparisons: SVc vs. SVb, EWc vs. EWb, and MRc vs. MRb. Specifically, note that SVb and EWb are both handicapped relative to SVc and EWc in that they necessarily predict many ties that are resolved at random. Thus, so long as the knowledge in SV and EW implies better than random predictions, models based on continuous variables are favored. On the other hand, the performance of DRb dominates DRc for cases A, C, and D with performance being quite similar for case B (for best of two, three, and four). It would appear that DRb exploits more cases of apparent dominance than DRc which (as a consequence) decides more choices at random. Thus, to the extent that the additional dominance cases detected by DRb relative to DRc have more than a random chance of being correct, DRb outpredicts DRc. Whereas the DR models represent naïve baseline strategies, we believe this finding is important because it demonstrates how a simple strategy can exploit the structure of the environment such that more information (in the form of continuous as opposed to binary variables) does not improve performance (a so-called less is more effect, Goldstein & Gigerenzer, 2002; Hertwig & Todd, 2003).

Second, models that resolve ties perform better than their counterparts that are unable to do so (e.g., DEBA vs. SVb, and EWb/DEBA and EWb/SVb vs. EWb). However, in the presence of redundancy, these differences are quite small (i.e., for cases B and D). More interesting is the comparison between SVc and DEBA. The former uses a single, continuous variable. The latter relies heavily on one binary variable but can also use others depending on circumstances. It is thus not clear which strategy actually uses more information. However, once again, characteristics of the environment determine which strategy is more successful. SVc dominates DEBA in case B as well as for much of cases A and D. On the other hand, DEBA dominates SVc in case C.

In Figure, we have chosen to illustrate the performance of five models across all the environments for the choice of best of three (i.e., data from Table 7). Two of these models, DR and MR are depicted because they represent, respectively, naïve and sophisticated benchmarks. The other models, SV, DEBA, and EW are quite different types of heuristics (see Table ). Both SV and DEBA require prior knowledge of what is important (DEBA more so than SV). However, they use little information and neither involves any computation. (DEBA, it should be recalled, also operates on binary data.) EW, on the other hand, does not require knowledge of differential importance of variables but does use all information available and needs some computational ability.

In interpreting Figure, it is instructive to recall that cases A and C (on the left) represent environments with low redundancy whereas cases B and D (on the right) have higher levels of redundancy. Also within each case, the amount of noise in the environment decreases as one moves from sub-case 1 (on the left) to sub-case 5 (on the right). As expected, in the noisier environments (sub-cases 1), the performances of all models are degraded such that differences are small. However, as error decreases (i.e., moving right toward sub-cases 5), model performances vary by environmental conditions. With low redundancy (cases A and C), there appear to be large differences in model performance. However, in the presence of redundancy (cases B and D), there are two distinct classes of models: SV and MR have similar performance levels and are superior to the others. We further note that SV is most effective in case B and also does well as environmental predictability increases in cases A and D. DEBA is never the best model but performs quite adequately in case C where EW has the best performance. Of the benchmark models, DR generally lags behind the other models (as would be expected). Finally, although MR is typically one of the better models, it does not dominate in all environments.

To highlight regions of rationality, Table 9 represents the data from Figure in another manner (think of Table 9 as a map!). Specifically, the performances of SV, DEBA, EW, and DR are compared to the normative benchmark of MR (by deducting the performance of each of the former from the latter). Thus positive (negative) entries in Table 9 indicate the amount by which the performance of MR exceeds (falls short of) those of other models. Three cases are indicated: in the shaded areas, MR exceeds other models by at least 5; in the unshaded areas, the standard font (e.g., 4) indicates that the MR advantage is small (< 5) but positive; and bold, underlined font (e.g., -) denotes that MR has no advantage over the other model. The table depicts relative model performance by characteristics of regions, e.g., redundancy and noise (that is, inter-cue correlation and error, respectively).

-----------------------------------------
Insert Table 9 about here
-----------------------------------------

For example, recall that cases A and C involve low redundancy whereas this is not true of cases B and D. Also, for all cases, as one moves from sub-cases 1 through 5, environments involve less noise. With this in mind, one can attribute the smaller differences between models in sub-cases 1 and 2 as being mainly due to noise. Interestingly, the relative success of SV in cases B and D (compared to A and C) seems to be the effect of inter-correlation between the other attributes, i.e., redundancy. Table 9 is only presented as an illustration. The data from Tables 6, 7, and 8 can clearly be used to create maps that highlight different aspects of the decision making terrain.

A further way of summarizing factors that affect the performance of the different models is to consider the regression of model performance on statistics that describe the characteristics of the 20 simulated environments (Table 5). In other words, consider regressions for each of our eleven models of the form

$$P_i = Z\delta_i + \tau_i \qquad (8)$$

where $P_i$ is the performance realization of model i (i = 1, ..., 11); Z is the (20 × 3) × s matrix of independent variables (statistics characterizing the datasets where there are three choice situations, i.e., best of two, three, or four alternatives); $\delta_i$ is the s × 1 vector of regression coefficients; and $\tau_i$ is a normally distributed error term with constant variance, independent of Z.

To characterize the environments or datasets (the Z matrices), we chose the following variables: variability of cue validities (max less min), the validity of the most important cue ($r_{yx_1}$), the validity of the average of the cues ($r_{y\bar{x}}$), average intercorrelation of the cues, average validity of the cues, number of cues, and R² for MRc and MRb.¹⁴ We also used dummy variables to model the effects of choosing between different numbers of alternatives. Dummy 1 captures the effect of choosing from three as opposed to two alternatives, and Dummy 2 the additional effect of choosing from four alternatives. Results of the regression analyses are summarized in Table 10.

¹⁴ We only used R² as an independent variable for MRc and MRb because we thought it would be appropriate for these models. For the other models, however, it was deemed more illuminating to characterize performance by the other measures (R² and these other measures are correlated in different ways). It should also be noted that we used the same statistics (based on continuous variables) to characterize the environments for models using both continuous and binary variables on the grounds that the underlying environments were based on continuous variables.
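As a rough illustration of a regression of the form in equation (8), the sketch below builds a design matrix with one row per combination of environment and number of alternatives (20 × 3 = 60 rows) and fits it by ordinary least squares. Several things here are assumptions rather than the authors' procedure: the paper uses a step-wise selection procedure whereas this sketch fits all columns at once; only one environmental statistic (the validity of the most important cue) is included; the dummies are coded incrementally, which is one reading of "additional effect"; and the performance values and coefficients are invented solely so that the code runs. The helper names `environment_design` and `fit_performance` are hypothetical.

```python
import numpy as np


def environment_design(r_first, m_values=(2, 3, 4)):
    """Build Z with one row per (environment, number of alternatives).

    Columns: constant, validity of the most important cue,
    Dummy 1 (three or more alternatives), Dummy 2 (four alternatives).
    """
    rows = []
    for r in r_first:
        for m in m_values:
            rows.append([1.0, r, float(m >= 3), float(m == 4)])
    return np.array(rows)


def fit_performance(Z, P):
    """Ordinary least squares estimate of delta in P = Z delta + tau."""
    delta, *_ = np.linalg.lstsq(Z, P, rcond=None)
    resid = P - Z @ delta
    centered = P - P.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    return delta, r2


# Invented inputs for illustration only; none of these numbers come from
# the paper's tables.
rng = np.random.default_rng(1)
r_first = rng.uniform(0.2, 0.8, size=20)             # 20 hypothetical environments
Z = environment_design(r_first)                      # shape (60, 4)
true_delta = np.array([50.0, 40.0, -13.0, -7.0])     # arbitrary coefficients
P = Z @ true_delta + rng.normal(0.0, 1.0, size=len(Z))

delta_hat, r2 = fit_performance(Z, P)
print("estimated coefficients:", np.round(delta_hat, 2))
print("R squared:", round(float(r2), 3))
```

With the incremental coding used here, the coefficient on the first dummy is the change from two to three alternatives and the coefficient on the second is the further change from three to four.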

---------------------------------------------------
Insert Table 10 about here
---------------------------------------------------

We used a step-wise procedure with entry (exit) thresholds for the variables of <.05 (>.0) for the probability of the F statistic. All coefficients for the models shown in Table 10 are statistically significant (p <.00) and all regressions fit the data well (see R² and estimated standard errors at the foot of Table 10). The constant term is fairly high across all models and measures the level of performance that would be expected of the models in binary choice absent information about the environment (approximately 50, i.e., from 4 to 65). Dummy 1 indicates how much such performance would fall when choosing between three alternatives (between and 4), and Dummy 2 shows the additional drop experienced when choosing among four alternatives (between 6 and 9).

For SVc and SVb, only one other variable is significant, the correlation between the single variable and the criterion. This makes intuitive sense as does the fact that the regression coefficient is larger with continuous as opposed to binary variables (50 vs. 9). The DEBA and EW models are all heavily influenced by the correlation between the criterion and $\bar{x}$. Recall, however, that this correlation is itself an increasing function of average cue validity and the number of cues but decreasing in the inter-correlation between cues (see the formula in footnote to Table ). Thus, ceteris paribus, increasing inter-correlation between the cues reduces the absolute performance levels of these models. DEBA differs from the EW models in that the correlation of the most valid cue is a significant predictor. This matches expectations in that DEBA relies heavily on the validity of the most important cue whereas EW weights all cues equally. (We also note that the SV models weight the most valid cue more heavily than DEBA.) As to the DomRan models, the interpretation of the signs of all coefficients is not obvious. Finally, for MRc it comes as little surprise that R² should be so important although this variable is less salient for MRb.

A possible surprise is that variability in cue validities (maximum less minimum) was not a significant factor for most models. One might have thought, a priori, that such dispersion would have been important for DEBA (cf., Payne et al., 1993). However, this is not the case and is consistent with theoretical analyses of DEBA that show that its accuracy is relatively robust to different weighting functions (Hogarth & Karelaia, in press; Baucells, Carrasco, & Hogarth, 2005).

Finally, whereas the regression statistics paint an interesting picture of model performance in the particular environments observed, we caution against overgeneralization. We only observed restricted ranges of the environmental statistics (i.e., characteristics) and thus cannot comment on what might happen beyond these ranges. Our approach, however, does suggest a way to illuminate model × environment interactions.

To summarize, across all 20 environments that, inter alia, are subject to different levels of error, the relative performances of the different models were not seen to vary greatly when faced with the same tasks (e.g., choose best of three alternatives). However, there were systematic differences due to interactions between characteristics of models and environments. Thus, whereas the additional information contained in continuous as opposed to binary variables benefits some models, e.g., SV and EW, it can be detrimental to others, e.g., DR. Second, models varied in the extent to which they were affected by specific environmental characteristics. SV models, for example, depend heavily on the validity of the most valid cue whereas this only affects EW models through its impact on average cue validity. Interestingly, the validity of the average of the cues was seen to have more impact on the performance of DEBA than the validity of the most valid cue. Average inter-correlation of predictors or redundancy tends to reduce performance of all models (except SV).

Overall, results do match some general trends noted in previous simulations (Payne et al., 1993; Fasolo et al., in press); however, patterns are not simple to describe. The value of our work, therefore, is that we now possess the means to make precise predictions for various simple, heuristic models in different environments. That is, given specified environments, we can predict a priori both levels of model performance and which models will be more or less effective.

An important environmental factor we did not vary was the impact of different distributions of specific kinds of alternatives. Instead, we simulated random drawings of alternatives given the population characteristics defined in Table 5. We did not, for example, skew the sampling process to include or exclude disproportionate numbers of, say, dominating or dominated alternatives (cf., Payne et al., 1993). As we have argued elsewhere (Hogarth & Karelaia, 2004), the distribution of alternatives can have important effects on both the general level of performance achieved by models as well as relative performance (some distributions, for example, are relatively friendly or unfriendly to specific models, Shanteau & Thomas, 2000). On the other hand, since our methodology can make specific predictions for each case encountered, it can easily handle the effects of sampling from different distributions of alternatives.

Empirical data. We used datasets from three different areas of activity. The first involved performance data of the 60 leading golfers in 2003 classified by the Professional Golf Association (PGA) in the USA.¹⁵

¹⁵ These data were obtained from the webpage http://www.pgtour.om/stts/leders/r/003/0. They are performance statistics of golfers in the main PGA Tour for 2003.

From these data (N = 60), we examined two dependent variables: all-round ranking and total earnings. The first