A Behaviouristic Model of Signalling and Moral Sentiments
Johannes Zschache, University of Leipzig
Monte Verità, October 18, 2012
Introduction Model Parameter analysis Conclusion
Introduction
- the evolution of cooperative behaviour in one-time PD interactions
- pay-offs of the row player:

        C   D
    C   3   0
    D   4   1

- Robert Frank (1987): Homo Economicus might prefer a utility function with a conscience
- moral sentiments have evolved to counteract the temptation to cheat in one-time interactions
- in combination with signals that are contingent upon the sentiments, compliant behaviour is stable
Introduction
\[ \Pr(H \mid S_j) = \frac{h\, f_H(S_j)}{h\, f_H(S_j) + (1 - h)\, f_D(S_j)} \]
\[ E(\pi_H \mid S_j) = \pi(c, C)\, \Pr(H \mid S_j) + \pi(c, D)\, \bigl(1 - \Pr(H \mid S_j)\bigr) \]
- interaction with $j$ if $E(\pi_H \mid S_j) > \pi(e) = \pi(d, D)$
- stringent assumptions:
  - knowledge of the population structure
  - correct interpretation of signals
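The decision rule above can be traced with a few lines of Python. This is only a sketch: it reads $h$ as the population share of type $H$ and $f_H(S_j)$, $f_D(S_j)$ as the signal densities of the two types, which the slide does not spell out. The pay-offs 3, 0 and 1 are taken from the PD matrix of the previous slide, and the function names are hypothetical.

```python
def posterior_honest(h, f_h, f_d):
    """Pr(H | S_j) by Bayes' rule; f_h, f_d stand for f_H(S_j), f_D(S_j)."""
    return h * f_h / (h * f_h + (1 - h) * f_d)


def expected_payoff(p_honest, pay_cC=3, pay_cD=0):
    """E(pi_H | S_j): cooperate against a partner who is honest with p_honest."""
    return pay_cC * p_honest + pay_cD * (1 - p_honest)


def interacts(h, f_h, f_d, pay_cC=3, pay_cD=0, pay_exit=1):
    """Interact only if the expected pay-off exceeds pi(e) = pi(d, D)."""
    p_honest = posterior_honest(h, f_h, f_d)
    return expected_payoff(p_honest, pay_cC, pay_cD) > pay_exit


# Illustrative numbers (assumed, not from the slides).
print(posterior_honest(h=0.5, f_h=0.8, f_d=0.2))
print(interacts(h=0.5, f_h=0.8, f_d=0.2))
print(interacts(h=0.1, f_h=0.2, f_d=0.8))
```

With these pay-offs the rule reduces to interacting whenever the posterior exceeds 1/3, which shows how much the decision depends on knowing $h$ and the signal densities.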
Introduction
alternative idea in Frank (1988):
1. moral sentiments develop in stable relationships
2. the matching law (Herrnstein, 1997) is the behavioural model
3. impulsiveness: the immediate reward often outweighs long-term benefits (chocolate cake during a diet, smoking, ...)
4. moral sentiments make the actor prudent when
   a) choosing an action in the iterated PD
   b) choosing to interact with a partner
5. moral sentiments develop through an evolutionary process
6. the same moral sentiments affect interactions with strangers
Model
1. moral sentiments develop in stable relationships
- 100 agents on a two-dimensional grid
- Moore neighbourhoods $N$
- recurrent interactions with neighbours
[Figure: 10 × 10 grid of agents, each cell labelled with a value between 0 and 1]
Model
2. the matching law (Herrnstein, 1997) is the behavioural model
Definition. Let $A = \{a_1, \dots, a_m\}$ be the set of all possible actions, and let $T(a_i)$ denote the number of times action $a_i$ was chosen during a specified time period. Furthermore, let $U(a_i) = \sum_t u_t(a_i)$ be the sum of all reinforcements that were received after emitting action $a_i$ during this period. The matching law holds if and only if
\[ \frac{T(a_i)}{T(a_1) + T(a_2) + \dots + T(a_m)} = \frac{U(a_i)}{U(a_1) + U(a_2) + \dots + U(a_m)} \quad \text{for all } i \in \{1, \dots, m\}. \]
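As a small illustration of the definition, the following sketch checks whether observed choice counts and summed reinforcements satisfy the matching law. The function name, the tolerance and the example numbers are assumptions made for illustration.

```python
def matching_law_holds(times, reinforcements, tol=1e-9):
    """True if the share of choices equals the share of reinforcement for every action.

    times:          dict action -> T(a), number of times the action was chosen
    reinforcements: dict action -> U(a), summed reinforcement after the action
    """
    total_t = sum(times.values())
    total_u = sum(reinforcements.values())
    return all(
        abs(times[a] / total_t - reinforcements[a] / total_u) <= tol
        for a in times
    )


# Action a1 is chosen 3 times as often and also earns 3 times as much: matching.
print(matching_law_holds({"a1": 30, "a2": 10}, {"a1": 60.0, "a2": 20.0}))
# Same choice counts, but equal total reinforcement: no matching.
print(matching_law_holds({"a1": 30, "a2": 10}, {"a1": 20.0, "a2": 20.0}))
```

An equivalent reading is that the average reinforcement per choice, $U(a_i) / T(a_i)$, is the same for every action that is chosen at all.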
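Model
3. impulsiveness: if the reinforcement is delayed by $d_t(a_i)$:
\[ V(a_i) = \sum_t \frac{u_t(a_i)}{1 + I\, d_t(a_i)} \]
- exponential discounting: $\delta^{d(a_i)}\, U(a_i)$ with $\delta \in [0, 1]$; the preference order does not change when both delays are shifted:
\[ \forall x \in \mathbb{R}: \quad \delta^{d(a)}\, U(a) > \delta^{d(b)}\, U(b) \;\Rightarrow\; \delta^{d(a)+x}\, U(a) > \delta^{d(b)+x}\, U(b) \]
- hyperbolic discounting: the preference order can reverse; with $I = 1.0$ and $x = 100$:
\[ \frac{10}{1 + I \cdot 100} < \frac{20}{1 + I \cdot 102}, \quad \text{but} \quad \frac{10}{1 + I \cdot 0} > \frac{20}{1 + I \cdot 2} \]
- implication for the PD: the immediate value of the temptation pay-off overwhelms the player of an iterated prisoner's dilemma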
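The preference reversal can be checked numerically. The sketch below compares a small reward of 10 with a larger reward of 20 that arrives two steps later, once immediately and once with both delays shifted by $x = 100$. The value $\delta = 0.5$ for the exponential case is an assumed example, and the function names are hypothetical.

```python
def hyperbolic(u, delay, impulsiveness):
    """Hyperbolically discounted value u / (1 + I * d)."""
    return u / (1 + impulsiveness * delay)


def exponential(u, delay, delta):
    """Exponentially discounted value delta**d * u."""
    return delta ** delay * u


def prefers_small_soon(value, shift):
    """True if reward 10 at delay `shift` is valued above reward 20 at delay `shift + 2`."""
    return value(10, shift) > value(20, shift + 2)


def hyp(u, d):
    return hyperbolic(u, d, impulsiveness=1.0)


def exp(u, d):
    return exponential(u, d, delta=0.5)


# Hyperbolic: the small reward wins when it is imminent, but loses from a distance.
print(prefers_small_soon(hyp, 0), prefers_small_soon(hyp, 100))
# Exponential: the ranking is the same at both horizons.
print(prefers_small_soon(exp, 0), prefers_small_soon(exp, 100))
```

In the iterated PD the temptation pay-off plays the role of the small, immediate reward, and sustained mutual cooperation that of the larger, delayed one.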
Model
4. moral sentiments help to overcome impulsiveness
\[ V(a_i) = \frac{U(a_i)}{1 + I\, d(a_i)} \]
"Guilt is just such a feeling. [..] If it is felt strongly enough, it can negate the spurious attraction of the imminent material reward" (Frank, 1988, p. 82).
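Model
a) choosing an action in the iterated PD
- one memory entry of length $\lambda$ for each neighbour $n \in N$: $(\sigma(n), \pi(n))$ with $\sigma(n) \in \{C, D, E\}^{\lambda}$ and $\pi(n) \in \mathbb{R}^{\lambda}$
- bookkeeping $\beta$:
\[ V(n, a) = \sum_{j:\, \sigma(n)_j = a} \; \sum_{i=j}^{\min(j+\beta,\, \lambda)} \frac{\pi(n)_i}{1 + I\,(i - j)} \]
algorithm 1:
    for all $a \in \{C, D\}$ do
        calculate $v(a) = \frac{\sum_{n \in N} V(n, a)}{T(a)}$
    end for
    $\hat{a} \leftarrow$ select action with highest $v(a)$
    return $\hat{a}$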
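A minimal Python sketch of algorithm 1 follows. It rests on several assumptions that the slide does not state: memories are stored oldest-first with 0-based indices, $T(a)$ is the number of times $a$ appears in the memories, $v(a)$ is set to 0 for an action that was never chosen, and ties go to C. All names are hypothetical, and the author's linked source code may differ.

```python
def discounted_value(actions, payoffs, action, impulsiveness, beta):
    """V(n, a): for every time `action` was chosen, sum the following pay-offs
    (up to `beta` steps ahead), discounted hyperbolically by their delay."""
    total = 0.0
    last_index = len(actions) - 1
    for j, chosen in enumerate(actions):
        if chosen != action:
            continue
        for i in range(j, min(j + beta, last_index) + 1):
            total += payoffs[i] / (1 + impulsiveness * (i - j))
    return total


def select_action(memory, impulsiveness, beta):
    """Algorithm 1: return the action in {C, D} with the highest v(a).

    memory: dict neighbour -> (list of actions, list of pay-offs)
    """
    values = {}
    for action in ("C", "D"):
        total = sum(
            discounted_value(acts, pays, action, impulsiveness, beta)
            for acts, pays in memory.values()
        )
        count = sum(acts.count(action) for acts, _ in memory.values())  # T(a), assumed
        values[action] = total / count if count else 0.0
    return max(values, key=values.get)


# Assumed history with one neighbour: mutual cooperation, one defection (pay-off 4),
# then two rounds in which the agent's cooperation earns nothing.
memory = {"n1": (list("CCCDCC"), [3, 3, 3, 4, 0, 0])}
print(select_action(memory, impulsiveness=0.0, beta=2))
print(select_action(memory, impulsiveness=100.0, beta=2))
```

In this example a patient agent ($I = 0$) books the later losses against the defection and prefers C, while a highly impulsive agent ($I = 100$) sees little beyond the immediate pay-off of 4 and prefers D.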
Model
b) choosing to interact with a partner
- average value of a partner $n$:
\[ v(n) = \frac{\sum_{i=1}^{\lambda} \pi(n)_i}{T(n)} \]
algorithm 2:
    for all $n \in N$ do
        calculate $v(n)$
    end for
    $\hat{n} \leftarrow$ select neighbour with highest $v(n)$
    return $\hat{n}$
- algorithms 1 and 2 are called melioration learning (Herrnstein, 1997)
- melioration learning is a process that leads to the matching law
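Algorithm 2 can be sketched in the same spirit. Here $T(n)$ is read as the number of interactions with neighbour $n$ and is passed in explicitly; a neighbour without any interaction gets the value 0. Both choices are assumptions, as are the names and the example data.

```python
def partner_value(payoffs, interactions):
    """v(n): summed remembered pay-offs divided by the number of interactions T(n)."""
    return sum(payoffs) / interactions if interactions else 0.0


def select_partner(memory):
    """Algorithm 2: return the neighbour with the highest average value.

    memory: dict neighbour -> (list of pay-offs, number of interactions)
    """
    return max(memory, key=lambda n: partner_value(*memory[n]))


# Assumed memories for two neighbours.
memory = {
    "north": ([3, 3, 0], 3),  # mostly mutual cooperation
    "east": ([1, 1, 1], 3),   # mutual defection
}
print(select_partner(memory))
```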
Model
5. moral sentiments develop through an evolutionary process
- the impulsiveness $I$ represents the impact of an actor's moral sentiments
- the evolution of moral sentiments is modelled as the evolution of $I$
- an agent's fitness is the average of the reinforcements during one generation (1000 interactions)
- after one generation, new agents are bred
- a parent is chosen with a probability directly proportional to the parent's fitness
- a parent passes on its impulsiveness value to a new agent
- random noise: $p_{mut} = 0.1$
- probability of experimenting $\varepsilon$: choose a random neighbour as partner and a random action
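The breeding step can be sketched as fitness-proportional selection. The slide does not say how the random noise enters; the sketch assumes that, with probability $p_{mut}$, the inherited value is replaced by a uniform draw from $[0, 1)$. This mutation rule and the function name are assumptions.

```python
import random


def breed(impulsiveness, fitness, p_mut=0.1, rng=random):
    """Return the impulsiveness values of the next generation.

    impulsiveness: list of I values of the current agents
    fitness:       list of average reinforcements (non-negative, not all zero)
    """
    # Each new agent picks a parent with probability proportional to fitness.
    parents = rng.choices(impulsiveness, weights=fitness, k=len(impulsiveness))
    # Assumed mutation rule: replace the inherited value by a uniform draw.
    return [rng.random() if rng.random() < p_mut else i for i in parents]


rng = random.Random(42)
print(breed([0.1, 0.5, 0.9], fitness=[3.0, 2.0, 1.0], rng=rng))
```

Passing a seeded `random.Random` instance makes a run repeatable, which matches the practice of repeating simulations under different random seeds.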
Parameter analysis
experimenting & memory length
- one memory entry of length $\lambda$ for each neighbour $n \in N$: $(\sigma, \pi)$ with $\sigma \in \{C, D, E\}^{\lambda}$ and $\pi \in \mathbb{R}^{\lambda}$
[Figure: average impulsiveness, cooperation and interaction (0.0 to 1.0) over generations 0 to 1000, for experimenting $\varepsilon \in \{0.05, 0.10\}$ and memory length $\lambda \in \{25, 50, 100\}$; bookkeeping $\beta = 10$]
Parameter analysis
bookkeeping: accounting for the future
\[ V(n, a) = \sum_{j:\, \sigma(n)_j = a} \; \sum_{i=j}^{\min(j+\beta,\, \lambda)} \frac{\pi(n)_i}{1 + I\,(i - j)} \]
[Figure: average impulsiveness, cooperation and interaction (0.0 to 1.0) over generations 0 to 1000, for bookkeeping $\beta \in \{1, 5, 10\}$; experimenting $\varepsilon = 0.1$, memory length $\lambda = 100$]
Model: Interactions with Strangers
6. the same moral sentiments affect interactions with strangers
- a certain percentage, $\varphi$, of interactions takes place with strangers (= actors who are met only once)
- there is no memory of past interactions with a stranger
- but there might be a signal that is contingent on the existence of moral sentiments
- a signal $s$ is a number between 0 and 9 indicating the actor's impulsiveness
- one memory entry for each signal strength
- average value of a signal $s$:
\[ v(s) = \frac{\sum_{i=1}^{\lambda} \pi(s)_i}{T(s)} \]
- actors can choose not to interact with a stranger
Parameter analysis
Interactions with Strangers
[Figure: average impulsiveness (0.0 to 1.0) over generations 200 to 1000, without and with signals, for a share of strangers $\varphi \in \{0.2, 0.4, 0.6, 0.8\}$; experimenting $\varepsilon = 0.1$, memory length $\lambda = 100$, bookkeeping $\beta = 10$]
Parameter analysis
Interactions with Strangers
[Figure: three panels (impulsiveness, cooperation with partners, cooperation with strangers; each 0.0 to 1.0) plotted against the share of strangers $\varphi$ from 0.2 to 1.0, without and with signals; generation > 900, experimenting $\varepsilon = 0.1$, memory length $\lambda = 100$, bookkeeping $\beta = 10$]
Conclusion
- Frank (1987): formal model of signalling and moral sentiments that lead to the evolution of cooperation in the one-time PD; heavy assumptions
- Frank (1988): informal ideas about the development of moral sentiments
- behaviouristic view on human behaviour (matching law)
- sentiments emerge because they repress impulsiveness, which leads to cooperation among friends
- in case of one-time interactions, signals are needed to support the development of moral sentiments and cooperation among strangers
References
Frank, R. H. (1987). If Homo Economicus could choose his own utility function, would he want one with a conscience? The American Economic Review 77(4), 593–604.
Frank, R. H. (1988). Passions within Reason: The Strategic Role of the Emotions. W. W. Norton & Company.
Herrnstein, R. J. (1997). The Matching Law: Papers in Psychology and Economics. Harvard University Press.
source code: http://code.google.com/p/signalling-emotions
Supplementary
- each simulation was performed for a fixed set of parameter values, with several repetitions that differ in their random seeds
- since the simulations can be represented as time-homogeneous Markov chains, they tend toward a unique stationary state distribution
- we use the average level of $I$ and the rates of interaction and cooperation as summary statistics to describe this distribution
- statistical tests check whether the summary statistics describe the stationary state
- additionally: check for the most promising conditions that lead to the evolution of moral sentiments
Supplementary: Convergence Statistics
Gelman, A. and D. B. Rubin (1992). Inference from Iterative Simulation Using Multiple Sequences. Statistical Science 7(4), 457–472.
Brooks, S. P. and A. Gelman (1998). General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics 7(4), 434–455.
- we generate $m > 1$ chains of a simulation with $n$ time steps: $(x_{11}, x_{12}, \dots, x_{1n}), (x_{21}, x_{22}, \dots, x_{2n}), \dots, (x_{m1}, x_{m2}, \dots, x_{mn})$
\[ \hat{R}_I = \frac{\text{length of pooled-chains interval}}{\text{mean length of the within-chain intervals}} \]
\[ \hat{R}_s = \frac{\frac{1}{mn - 1} \sum_{j=1}^{m} \sum_{i=1}^{n} \lvert x_{ji} - \bar{x} \rvert^{s}}{\frac{1}{m(n - 1)} \sum_{j=1}^{m} \sum_{i=1}^{n} \lvert x_{ji} - \bar{x}_j \rvert^{s}} \]
where $\bar{x}$ is the mean over all chains and $\bar{x}_j$ the mean of chain $j$
- iterated graphical approach: sub-chains $(x_{j1}, \dots, x_{j(2kb)})$, with $b$ being a batch length and $k = 1, \dots, n/b$
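Both statistics are easy to compute for a handful of chains. The sketch below uses only the standard library. For $\hat{R}_I$ it assumes a central 80 % interval taken from a crude nearest-rank quantile; the coverage and the quantile rule are assumptions, and the function names are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def r_hat_s(chains, s=2):
    """Ratio of the pooled to the within-chain s-th absolute central moment."""
    m, n = len(chains), len(chains[0])
    pooled_mean = mean([x for chain in chains for x in chain])
    pooled = sum(abs(x - pooled_mean) ** s for chain in chains for x in chain)
    within = 0.0
    for chain in chains:
        chain_mean = mean(chain)
        within += sum(abs(x - chain_mean) ** s for x in chain)
    return (pooled / (m * n - 1)) / (within / (m * (n - 1)))


def interval_length(xs, alpha=0.2):
    """Length of the central (1 - alpha) interval, nearest-rank quantiles (assumed)."""
    ordered = sorted(xs)
    low = ordered[round(alpha / 2 * (len(ordered) - 1))]
    high = ordered[round((1 - alpha / 2) * (len(ordered) - 1))]
    return high - low


def r_hat_interval(chains, alpha=0.2):
    """Pooled interval length divided by the mean within-chain interval length."""
    pooled = interval_length([x for chain in chains for x in chain], alpha)
    within = mean([interval_length(chain, alpha) for chain in chains])
    return pooled / within


# Two chains that sit in different regions: both ratios are far above 1.
apart = [list(range(10)), list(range(100, 110))]
print(r_hat_s(apart), r_hat_interval(apart))
```

Values close to 1 indicate that the chains cover the same region. Values well above 1, as in this example, indicate that the chains have not yet mixed.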
Supplementary: Example
[Figure: average impulsiveness (0.0 to 1.0) over generations 200 to 1000, one panel per number of ticks per generation (100, 250, 500, 750, 1000)]
[Figure: statistics $\hat{R}_c$, $\hat{R}_l$ and $\hat{R}_3$ over iteration no. 10 to 50, one panel per number of ticks per generation]
values printed in the panels:

    ticks per generation   $\hat{R}_c$   $\hat{R}_l$   $\hat{R}_3$
    100                    1.91          2.12          1.59
    250                    1.18          1.35          1.14
    500                    1.73          1.73          1.42
    750                    1.34          1.36          1.21
    1000                   1.09          1.06          1.04
Supplementary: The Wald-Wolfowitz Test
Wald, A. and J. Wolfowitz (1940). On a test whether two samples are from the same population. Annals of Mathematical Statistics 11(2), 147–162.
Grazzini, J. (2012). Analysis of the emergent properties: Stationarity and ergodicity. Journal of Artificial Societies and Social Simulation 15(2), 7.
- check whether two samples $X$ and $Y$, with $n$ and $m$ observations respectively, are from the same population, i.e. whether the distributions of the two samples are identical
- the two samples are pooled and arranged in ascending order as $Z = (z_1, z_2, \dots, z_{n+m})$, where $z_1 < z_2 < \dots < z_{n+m}$
- sequence $V$ of zeros and ones: each element $z_i$ of $Z$ is replaced by 0 if $z_i$ is an element of $X$ and by 1 if $z_i$ is an element of $Y$
- the statistic $U(X, Y)$ of two samples $X$ and $Y$ is the number of runs in the corresponding sequence $V$
- example: given $X = \{5, 2.2, 4.5, 1\}$ and $Y = \{2, 4.3, 2.5, 1.4, 3\}$, $V = (0, 1, 1, 0, 1, 1, 1, 0, 0)$ and $U(X, Y) = 5$
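The runs statistic of the example can be reproduced with a short sketch. It covers only the construction of $V$ and the count of runs, not the p-value of the test. Ties between the samples are not handled specially, since the slide assumes a strict ordering; the function name is hypothetical.

```python
def runs_statistic(x, y):
    """Return the 0/1 sequence V and the number of runs U(X, Y)."""
    # Label every observation with its sample (0 for X, 1 for Y) and sort by value.
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    labels = [label for _, label in pooled]
    if not labels:
        return labels, 0
    # A new run starts wherever two neighbouring labels differ.
    runs = 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return labels, runs


v, u = runs_statistic([5, 2.2, 4.5, 1], [2, 4.3, 2.5, 1.4, 3])
print(v, u)
```

Few runs mean that the two samples occupy separate parts of the pooled ordering, which speaks against a common distribution.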
Supplementary: Example
- stationarity: check whether sections of the mean distribution belong to the same distribution
- ergodicity: check whether $m$ chains of a simulation belong to the same distribution

             b = 10                                b = 20
    tpg      u_s   p_s        u_e   p_e            u_s   p_s        u_e   p_e
    100      27    0.000000   33    0.004850       40    0.000000   57    0.000000
    250      31    0.000640   42    0.700180       57    0.000000   74    0.052510
    500      23    0.000000   37    0.101700       46    0.000000   65    0.000110
    750      34    0.011770   42    0.700180       66    0.000270   76    0.126710
    1000     39    0.281090   40    0.412330       67    0.000600   78    0.256780

tpg: ticks per generation; u_s, p_s: runs statistic and p-value of the stationarity test; u_e, p_e: runs statistic and p-value of the ergodicity test