Understanding Reasoning Using Utility Proportional Beliefs

Understandng Reasonng Usng Utlty Proportonal Belefs Chrstan Nauerz EpCenter, Maastrcht Unversty c.nauerz@maastrchtunversty.nl Abstract. Tradtonally very lttle attenton has been pad to the reasonng process that underles a game theoretc soluton concept. When modelng bounded ratonalty n one-shot games, however, the reasonng process can be a great source of nsght. The reasonng process tself can provde testable assertons, whch provde more nsght than the ft to expermental data. Based on Bach and Perea s [1] concept of utlty proportonal belefs, we analyze the players reasonng process and fnd three testable mplcatons: (1) players form an ntal belef that s the bass for further reasonng; (2) players reason by alternatngly consderng ther own and ther opponent s ncentves; (3) players perform only several rounds of delberate reasonng. Keywords: Epstemc game theory, nteractve epstemology, soluton concepts, bounded ratonalty, utlty proportonal belefs, reasonng. 1 Introducton Most of the ongong research n game theory focuses on the predcton of players choces. In ths paper, we want to tae another perspectve and loo at the players reasonng process rather than ther decsons. Ths perspectve can be especally helpful to understand expermental data from one-shot games wthout opportuntes for learnng or coordnatng. Exstng bounded ratonalty concepts (e.g. Quantal Response Equlbrum (QRE) [5] or Cogntve Herarchy Models (CHM) [3]) focus manly on the predcton of emprcal frequences rather than accurately mmcng players reasonng process. Therefore, these concepts do not provde a clear ratonale for players selectng certan choces. The underlyng dea of a concept mght gve a hnt but no a clear nsght. Most often, these models are evaluated by comparng ther ft to the data. Ths method, however, tells us lttle about the valdty of certan features of the models. A deeper nsght nto the reasonng process could help to better understand a concept s characterstcs and therefore whch features wor well and whch do not. Therefore, a concept that maes clear assertons about the reasonng process can be tested much more rgorously. In fact, a good concept should present clear assertons about the reasonng process that can be tested ndvdually and n ther nteracton wth each other. In ths paper, we dscuss a soluton concept that s based on a general dea and provdes detaled assertons about the reasonng process that can be tested ndvdually. 2 Utlty Proportonal Belefs Bach and Perea [1], henceforth BP, suggest a concept for bounded ratonalty that bulds up on a smple dea: the dfferences n probabltes a player assgns to hs opponent s

choces should be equal to the dfferences n the opponent s utltes for these choces. BP, formalze the soluton concept usng the type-based approach to epstemc game theory. Here we wll only ntroduce the man defnton and focus on the two player case. For amoreformaltreatmentconsultbp.however,beforestatngthedefntonofutlty proportonal belefs we need to ntroduce some further notaton. By I = {1, 2} we denote the set of players, by C we denote player s fnte choce set, by T we denote player s set of types, and by U : C C! R we denote player s utlty functon. The best and the worst possble utltes of player are denoted as ū := c2c u (c) and u := mn c2c u (c). (b (t ))(c t ) gves the probablty that player s type t assgns to s choce c gven that s of type t, where t 2 T and t 2 T. Defnton 1. Let, 2 I be the two players, and 2 R such that 0. Atypet 2 T of player expresses -utlty-proportonal-belefs, f (b (t ))(c t ) (b (t ))(c 0 t )= for all t 2 T (t ), for all c, c 0 2 C. ū u (u (c,t ) u (c 0,t )) (2.1) The defnton drectly corresponds to the dea of utlty proportonal belefs: the dfference n probabltes player assgns to the opponents choces s equal to the dfference of the utltes tmes the proportonalty factor /(ū u ).BPgveanntutventerpretaton of as measure of the senstvty of a player s belefs to dfferences n the opponents utltes. Note that there exsts an upper bound for the called. It s the mum value of for whch equaton (2.1) yelds well-defned probablty measures. The lower lmt of s 0. The concept of common belef n -utlty-proportonal-belefs requres that both players entertan utlty proportonal belefs, that both players beleve ther opponent holds utlty proportonal belefs, that both players beleve ther opponents beleve that ther opponents do so, and so on. BP ntroduce an algorthm to fnd exactly those belefs that are possble under common belef n -utlty-proportonal-belefs. The algorthm teratvely deletes belefs so that only the belefs, whch are possble under common belef n -utlty-proportonal-belefs, survve. BP show n ther Theorem 2 that belefs are unque n the two player case. By usng ther Lemma 4 we fnd an explct expresson for the unque belefs under common belef n -utlty-proportonal-belefs nstead of usng ther algorthm. Ths expresson reveals clues about the reasonng process players mght go through to obtan utlty proportonal belefs. 3 Reasonng Process To ntroduce the formula for the player s belefs some more notaton needs to be fxed. We denote the number of choces of player by n = C and the number of choces for player by m = C.Moreover,letN = {1,...,n} and M = {1,...,m}. The n 1 vector n wth n =( 1 n,..., 1 n ).LetC {c 1,...,cn } and C = {c 1,...,cm } so that we can denote player s n m utlty matrx by 2 3 U norm u 1 u 6 4 U (c 1,c1 ) U (c 1,cm ).. U (c n,c1 ) U (c n,cm ) The m m matrx Z m has m 1 m on the dagonal and 1 m off the dagonal. Intutvely, the centerng matrx subtracts the mean from the columns of a matrx when left multpled. 7 5.

We defne the matrx G := Z m U norm snce t wll be useful to develop a more ntutve understandng. By left-multplyng the normalzed utlty matrx U norm wth the centerng matrx Z m, one obtans a matrx where for every element the average of ts column has been subtracted. Note that the rows of U norm correspond to s choces and the columns to s choces. The same holds for the matrx Z m U norm,onlythatnoweachelement represents the relatve goodness of a choce gven an opponent s choce. Therefore, the matrx G gves the goodness of a choce gven a belef about the opponent s choce, scaled by the senstvty to the opponents dfferences n utlty. Now we can state the formula for s belefs about s choces under common belef n -utlty-proportonal-belefs: 1X (G G ) ( m + G n ) (3.1) =0 =( m + G n )+G G ( m + G n ) +G G [G G ( m + G n )] +, where s a m 1 vector wth the probabltes that player assgns to player s choces. We see that the expresson ( m + G n ) s repeated several tmes. In the second term, ths expresson s then adusted by left multplyng the matrces G G. In the thrd term the second term s adusted by left multplyng G G, and so on. Therefore, we call ntal ( m + G n ) the ntal belef,. It shows how player constructs her belefs about player wthout tang nto account that player reasons about her. Player starts off by assgnng equal probablty to her opponent s choce combnatons. Then she adusts her belef by addng the term G n, whch represents the goodness of s choces when assgns equal probablty to all of s choces. To emphasze the reasonng process, we defne as the belef that player holds after the th reasonng step, 0 ntal := := ntal + G G ( 1 ), such that lm!1 holds. To obtan a more ntutve understandng we rewrte as follows m + G ( n + G ( 1 )). We see that frst player taes s perspectve, whch s reflected n the expresson n + G ( 1 ). Hereplayer forms a belef about player gven s belef about from the prevous reasonng step. Frst assgns equal probablty to all of s choces. Then she corrects these belefs by the goodness of s choces gven s belef about from the prevous reasonng step. The result s a new belef of about. Thenplayer taes her own perspectve and assgns equal probablty to all of s choces. These probabltes are then agan corrected by the goodness of s choces gven the new belef of about. The process then contnues n the same fashon for the subsequent reasonng steps. It s also mportant to note that later reasonng steps wll be less mportant for the fnal belef than earler ones. Defne wth 2 [0, 1) and note that G = Z m U norm, so that (3.1) can be wrtten as 1X ( =0 Z m U norm Z n U norm ) ntal. Snce, 2 [0, 1),latertermsn P 1 =0 ( Z m U Z n U ) wll be smaller than earler ones and therefore less mportant for the fnal belef. Ths has also an mportant

mplcaton for the meanng of the proportonalty factor : the lower the value of the fewer steps of reasonng a player wll undergo to approxmate the fnal belef wthn a reasonable bound. The same holds true for her opponent s proportonalty factor. 4 Connectons to Psychology These features correspond closely to fndngs n the psychology lterature. In hs boo Thnng, Fast and Slow Kahneman [4] advocates the dea of reasonng n two dstnct ways. He calls the two modes of thnng System 1 and System 2, accordng to Stanovch [6]. Note that the word system should not ndcate an actual system but only serves as label for dfferent modes of thnng. System 1 s an automatc and mostly unconscous way of thnng that demands lttle computatonal capacty. System 2 descrbes the dea of delberate reasonng. It comes nto play when controlled analytcal thnng s needed. Table 1 summarzes the propertes of the two systems accordng to [6]. System 1 System 2 assocatve holstc relatvely undemandng of cogntve capacty relatvely fast acquston by bology, exposure, and personal experence rule-based analytc demandng of cogntve capacty relatvely slow acquston by cultural and formal tuton Table 1. Propertes of System 1 & 2 The concept of System 1 descrbes the unconscous frst reacton to a stuaton, whch happens almost mmedately and wthout demandng a lot of cogntve resources. Moreover, Kahneman [4] argues that the belefs formed by System 1 are the bass for conscous reasonng wthn System 2. Ths s consstent wth our fndngs snce the ntal belef does not tae nto account any strategc nteracton. Ths belef can be seen as an automatc ntal reacton to the game. The delberate reasonng process descrbed above can be magned as beng executed by System 2 usng the fndngs of System 1, or n ths case the ntal belef. Tang another player s perspectve taes delberate reasonng and can hardly be done automatcally. Fnally, we showed that the fnal belef can be approxmated wth fntely many steps of reasonng. Ths feature s closely related to the problem of lmted worng memory. Baddeley [2] defnes worng memory as "... [A] bran system that provdes temporary storage and manpulaton of the nformaton necessary for such complex cogntve tass as language comprehenson, learnng, and reasonng." Snce worng memory s crtcal for reasonng, however bounded, human bengs can only perform a lmted number of reasonng wthout the support of tools. Therefore, a model resemblng human reasonng should not predct an nfnte amount of reasonng steps. References 1. Bach, C.W., Perea, A.: Utlty proportonal belefs. Internatonal Journal of Game Theory pp. 1 22 (2014)

2. Baddeley, A.: Worng memory. Scence 255(5044), 556 559 (1992) 3. Camerer, C.F., Ho, T.H., Chong, J.K.: A cogntve herarchy model of games. The Quarterly Journal of Economcs 119(3), 861 898 (2004) 4. Kahneman, D.: Thnng, fast and slow. Macmllan (2011) 5. McKelvey, R.D., Palfrey, T.R.: Quantal response equlbra for normal form games. Games and economc behavor 10(1), 6 38 (1995) 6. Stanovch, K.E., West, R.F., et al.: Indvdual dfferences n reasonng: Implcatons for the ratonalty debate? Behavoral and bran scences 23(5), 645 665 (2000)