A Behaviouristic Model of Signalling and Moral Sentiments

1 A Behaviouristic Model of Signalling and Moral Sentiments
Johannes Zschache, University of Leipzig
Monte Verità, October 18, 2012

2 Introduction Model Parameter analysis Conclusion

3 Introduction
- the evolution of cooperative behaviour in one-time PD interactions, with payoffs (rows: own action, columns: partner's action):

        C   D
    C   3   0
    D   4   1

- Robert Frank (1987): Homo Economicus might prefer a utility function with a conscience
- moral sentiments have evolved to counteract the temptation to cheat in one-time interactions
- in combination with signals that are contingent upon the sentiments, compliant behaviour is stable

4 Introduction
\[ \Pr(H \mid S_j) = \frac{h\, f_H(S_j)}{h\, f_H(S_j) + (1 - h)\, f_D(S_j)} \]
\[ E(\pi_H \mid S_j) = \pi(c, C)\, \Pr(H \mid S_j) + \pi(c, D)\, \bigl(1 - \Pr(H \mid S_j)\bigr) \]
- interaction with j if $E(\pi_H \mid S_j) > \pi(e) = \pi(d, D)$
- stringent assumptions:
  - knowledge of the population structure
  - correct interpretation of signals
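The decision rule on this slide can be sketched in a few lines of Python. This is only an illustration: it reads h as the population share of H types and f_H, f_D as the signal densities of the two types (the slide does not define them), takes the payoffs from the PD matrix on slide 3, and uses hypothetical function names.

```python
def posterior_honest(h, f_h, f_d):
    """Pr(H | S_j) by Bayes' rule.

    h   : assumed population share of H types
    f_h : density of the observed signal among H types
    f_d : density of the observed signal among D types
    """
    return h * f_h / (h * f_h + (1.0 - h) * f_d)


def expected_payoff(p_honest, payoff_cc=3.0, payoff_cd=0.0):
    """Expected payoff of cooperating with a partner who is H with probability p_honest."""
    return payoff_cc * p_honest + payoff_cd * (1.0 - p_honest)


def should_interact(h, f_h, f_d, payoff_exit=1.0):
    """Interact only if the expected payoff exceeds pi(e) = pi(d, D)."""
    return expected_payoff(posterior_honest(h, f_h, f_d)) > payoff_exit


if __name__ == "__main__":
    # Illustrative numbers only: a signal that is four times as likely among H types.
    print(posterior_honest(0.5, 0.8, 0.2), should_interact(0.5, 0.8, 0.2))
```

The sketch makes the two stringent assumptions visible: the caller has to supply h (population structure) and both densities (correct interpretation of signals).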

5 Introduction
alternative idea in Frank (1988):
1. moral sentiments develop in stable relationships
2. the matching law (Herrnstein, 1997) is the behavioural model
3. impulsiveness: the immediate reward often outweighs long-term benefits (chocolate cake during a diet, smoking, ...)
4. moral sentiments make the actor prudent when:
   a) choosing an action in the iterated PD
   b) choosing to interact with a partner
5. moral sentiments develop by an evolutionary process
6. the same moral sentiments affect interactions with strangers

6 Model
1. moral sentiments develop in stable relationships
- agents on a two-dimensional grid
- Moore neighbourhoods N
- recurrent interactions with neighbours

7 Model
2. the matching law (Herrnstein, 1997) is the behavioural model
Definition. Let $A = \{a_1, \dots, a_m\}$ be the set of all possible actions, and let $T(a_i)$ denote the number of times action $a_i$ was chosen during a specified time period. Furthermore, let $U(a_i) = \sum_t u_t(a_i)$ be the sum of all reinforcements that were received after emitting action $a_i$ during this period. The matching law holds if and only if
\[ \frac{T(a_i)}{T(a_1) + T(a_2) + \dots + T(a_m)} = \frac{U(a_i)}{U(a_1) + U(a_2) + \dots + U(a_m)} \quad \text{for all } i \in \{1, \dots, m\}. \]
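As a small illustration of the definition, the following Python sketch checks whether observed choice counts T and reinforcement sums U satisfy the matching law. The function name, the dictionary layout and the tolerance are assumptions made for this example, and both totals are assumed to be positive.

```python
def matching_law_holds(T, U, tol=1e-9):
    """Return True if choice shares equal reinforcement shares for every action.

    T : dict mapping action -> number of times the action was chosen
    U : dict mapping action -> sum of reinforcements received after the action
    """
    total_T = sum(T.values())
    total_U = sum(U.values())
    return all(
        abs(T[a] / total_T - U[a] / total_U) <= tol
        for a in T
    )


if __name__ == "__main__":
    # Action C chosen 30 times with total reinforcement 6, D chosen 10 times with 2.
    print(matching_law_holds({"C": 30, "D": 10}, {"C": 6.0, "D": 2.0}))
```

In the example both shares equal 0.75 for C and 0.25 for D, so the law holds; equal reinforcement sums with unequal choice counts would violate it.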

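8 Model
3. impulsiveness: if the reinforcement is delayed by $d_t(a_i)$:
\[ V(a_i) = \sum_t \frac{u_t(a_i)}{1 + I \cdot d_t(a_i)} \]
- exponential discounting: $\delta^{d(a_i)}\, U(a_i)$ with $\delta \in [0, 1]$; for all $x \in \mathbb{R}$:
\[ \delta^{d(a)}\, U(a) > \delta^{d(b)}\, U(b) \iff \delta^{d(a) + x}\, U(a) > \delta^{d(b) + x}\, U(b) \]
- hyperbolic discounting: with I = 1.0 and x = 100, the ranking of two rewards can reverse: the discounted value at delay 100 is smaller than the one at delay 102, but the value at delay 0 is larger than the one at delay 2
- implications for PD: the immediate value of the temptation pay-off overwhelms the player of an iterated prisoner's dilemma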
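The contrast between the two discounting schemes can be made concrete with a short Python sketch. The reward sizes 1 and 2 and the value δ = 0.5 are illustrative assumptions, not values taken from the slide; only I = 1.0 and the delays 0, 2, 100 and 102 come from the source.

```python
def hyperbolic(u, delay, impulsiveness):
    """Hyperbolically discounted value u / (1 + I * d)."""
    return u / (1.0 + impulsiveness * delay)


def exponential(u, delay, delta):
    """Exponentially discounted value delta**d * u."""
    return (delta ** delay) * u


if __name__ == "__main__":
    small, large = 1.0, 2.0   # assumed rewards: small-sooner versus large-later
    I, delta = 1.0, 0.5       # I from the slide, delta chosen for illustration

    for shift in (0, 100):
        prefers_small_hyp = hyperbolic(small, shift, I) > hyperbolic(large, shift + 2, I)
        prefers_small_exp = exponential(small, shift, delta) > exponential(large, shift + 2, delta)
        print(shift, prefers_small_hyp, prefers_small_exp)
```

With these numbers the hyperbolic ranking flips between shift 0 and shift 100, while the exponential ranking is the same at both shifts, which is the time consistency stated by the equivalence above.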

9 Model
4. moral sentiments help to overcome impulsiveness
\[ V(a_i) = \frac{U(a_i)}{1 + I \cdot d(a_i)} \]
"Guilt is just such a feeling. [..] If it is felt strongly enough, it can negate the spurious attraction of the imminent material reward" (Frank, 1988, p. 82).

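10 Model
a) choosing an action in the iterated PD
- one memory entry of length $\lambda$ for each neighbour $n \in N$: $(\sigma(n), \pi(n))$ with $\sigma(n) \in \{C, D, E\}^{\lambda}$ and $\pi(n) \in \mathbb{R}^{\lambda}$
- bookkeeping $\beta$:
\[ V(n, a) = \sum_{j:\, \sigma(n)_j = a} \; \sum_{i=j}^{\min(j + \beta,\, \lambda)} \frac{\pi(n)_i}{1 + I \cdot (i - j)} \]
- algorithm 1:
    for all $a \in \{C, D\}$ do
        calculate $v(a) = \frac{\sum_{n \in N} V(n, a)}{T(a)}$
    end for
    $\hat{a} \leftarrow$ select action with highest $v(a)$
    return $\hat{a}$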
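Algorithm 1 might be sketched in Python as follows. Several details are assumptions of this sketch and not stated on the slide: memory positions are 0-based and ordered from earlier to later, T(a) is the number of memory positions holding action a, and an action that never occurs in memory gets value 0. All names are hypothetical.

```python
def discounted_value(actions, payoffs, action, impulsiveness, bookkeeping):
    """V(n, a) for one neighbour: discounted payoffs following each use of `action`."""
    last = len(actions) - 1
    total = 0.0
    for j, chosen in enumerate(actions):
        if chosen != action:
            continue
        # Look ahead at most `bookkeeping` steps, but not past the end of memory.
        for i in range(j, min(j + bookkeeping, last) + 1):
            total += payoffs[i] / (1.0 + impulsiveness * (i - j))
    return total


def choose_action(memory, impulsiveness, bookkeeping):
    """Pick the action in {C, D} with the highest average discounted value.

    memory : dict mapping neighbour -> (list of actions, list of payoffs)
    """
    values = {}
    for action in ("C", "D"):
        count = sum(actions.count(action) for actions, _ in memory.values())
        total = sum(
            discounted_value(actions, payoffs, action, impulsiveness, bookkeeping)
            for actions, payoffs in memory.values()
        )
        values[action] = total / count if count else 0.0  # assumption for unseen actions
    return max(values, key=values.get), values


if __name__ == "__main__":
    memory = {"n1": (["C", "D"], [3.0, 4.0])}
    print(choose_action(memory, impulsiveness=1.0, bookkeeping=1))
```

A larger bookkeeping horizon lets later payoffs count towards an earlier action, while a larger impulsiveness value shrinks their weight.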

11 Model
b) choosing to interact with a partner
- average value of a partner $v(n)$:
\[ v(n) = \frac{\sum_{i=1}^{\lambda} \pi(n)_i}{T(n)} \]
- algorithm 2:
    for all $n \in N$ do
        calculate $v(n)$
    end for
    $\hat{n} \leftarrow$ select neighbour with highest $v(n)$
    return $\hat{n}$
- algorithms 1 and 2 are called melioration learning (Herrnstein, 1997)
- melioration learning is a process that leads to the matching law
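Algorithm 2 is a plain arg-max over average payoffs, which a short Python sketch can show. It assumes that T(n) is the number of interactions with neighbour n and that neighbours without any interaction are skipped; the names are hypothetical.

```python
def choose_partner(payoff_memory, interaction_counts):
    """Return the neighbour with the highest average payoff v(n), plus all values.

    payoff_memory      : dict mapping neighbour -> list of remembered payoffs
    interaction_counts : dict mapping neighbour -> T(n)
    """
    values = {
        n: sum(payoffs) / interaction_counts[n]
        for n, payoffs in payoff_memory.items()
        if interaction_counts.get(n, 0) > 0  # assumption: skip unseen neighbours
    }
    if not values:
        return None, values
    return max(values, key=values.get), values


if __name__ == "__main__":
    memory = {"a": [3.0, 3.0, 0.0], "b": [1.0, 1.0, 1.0]}
    counts = {"a": 3, "b": 3}
    print(choose_partner(memory, counts))
```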

12 Model
5. moral sentiment develops in an evolutionary process
- the impulsiveness I reflects the impact of an actor's moral sentiments
- the evolution of moral sentiment corresponds to the evolution of I
- an agent's fitness is the average of the reinforcements during one generation (1000 interactions)
- after one generation, new agents are bred
- a parent is chosen with a probability directly proportional to the parent's fitness
- a parent passes on his impulsiveness value to a new agent
- random noise: $p_{mut} = 0.1$
- probability of experimenting ε: choose a random neighbour as partner and a random action
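The breeding step could look roughly like the Python sketch below. Fitness-proportional parent choice and the mutation probability follow the slide; how a mutated value is drawn is not stated there, so redrawing I uniformly from [0, 1) is purely an assumption of this sketch, as are the function and parameter names.

```python
import random


def breed(impulsiveness, fitness, p_mut=0.1, rng=None):
    """Create the next generation of impulsiveness values.

    impulsiveness : list of I values, one per agent
    fitness       : list of non-negative fitness values, one per agent
    p_mut         : probability that a child's value is replaced by noise
    """
    rng = rng or random.Random()
    # Each child picks a parent with probability proportional to the parent's fitness.
    parents = rng.choices(impulsiveness, weights=fitness, k=len(impulsiveness))
    children = []
    for value in parents:
        if rng.random() < p_mut:
            value = rng.random()  # assumption: mutation redraws I uniformly from [0, 1)
        children.append(value)
    return children


if __name__ == "__main__":
    rng = random.Random(42)
    print(breed([0.1, 0.5, 0.9], [1.0, 2.0, 3.0], p_mut=0.1, rng=rng))
```

Passing a seeded `random.Random` instance keeps repetitions reproducible while still letting them differ in their seeds.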

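13 Parameter analysis: experimenting & memory length
- one memory entry of length $\lambda$ for each neighbour $n \in N$: $(\sigma, \pi)$ with $\sigma \in \{C, D, E\}^{\lambda}$ and $\pi \in \mathbb{R}^{\lambda}$
[Figure: average impulsiveness, cooperation and interaction over generations, for combinations of experimenting ε and memory length λ (bookkeeping β = 10); the first legend entry starts with 0.05.]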

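14 Parameter analysis: bookkeeping (accounting for the future)
\[ V(n, a) = \sum_{j:\, \sigma(n)_j = a} \; \sum_{i=j}^{\min(j + \beta,\, \lambda)} \frac{\pi(n)_i}{1 + I \cdot (i - j)} \]
[Figure: average impulsiveness, cooperation and interaction over generations, for different values of bookkeeping β (experimenting ε = 0.1, memory length λ = 100).]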

15 Model: Interactions with Strangers
6. same moral sentiments affect interactions with strangers
- a certain percentage, φ, of interactions take place with strangers (= actors who are met only once)
- there is no memory of past interactions with a stranger
- but there might be a signal that is contingent on the existence of moral sentiments
- a signal s is a number between 0 and 9 indicating the actor's impulsiveness
- one memory entry for each signal strength
- average value of a signal $v(s)$:
\[ v(s) = \frac{\sum_{i=1}^{\lambda} \pi(s)_i}{T(s)} \]
- actors can choose not to interact with a stranger

16 Parameter analysis: Interactions with Strangers
[Figure: average impulsiveness over generations, without and with signals, for several shares of strangers φ (no signals: 0.2, 0.4, 0.6, 0.8; signals: 0.2, 0.4, 0.6 and one further value); experimenting ε = 0.1, memory length λ = 100, bookkeeping β = 10.]

17 Parameter analysis: Interactions with Strangers
[Figure: three panels (impulsiveness, cooperation with partners, cooperation with strangers), each on a scale up to 1.0 and plotted against the share of strangers, comparing "no signal" and "signal"; generation > 900, experimenting ε = 0.1, memory length λ = 100, bookkeeping β = 10.]

18 Conclusion
- Frank (1987): formal model of signalling and moral sentiments that lead to the evolution of cooperation in the one-time PD; heavy assumptions
- Frank (1988): informal ideas about the development of moral sentiments
- behaviouristic view on human behaviour (matching law)
- sentiments emerge because they repress impulsiveness, which leads to cooperation among friends
- in case of one-time interactions, signals are needed to support the development of moral sentiments and cooperation among strangers

19 References
- Frank, R. H. (1987). If Homo Economicus could choose his own utility function, would he want one with a conscience? The American Economic Review 77(4).
- Frank, R. H. (1988). Passions within Reason. The Strategic Role of the Emotions. W. W. Norton & Company.
- Herrnstein, R. J. (1997). The Matching Law. Papers in Psychology and Economics. Harvard University Press.
source code:

20 Supplementary
- each simulation was performed for a fixed set of parameter values, with several repetitions that differ in their random seeds
- since the simulations can be represented as stochastic time-homogeneous Markov chains, they tend toward a unique and stationary state distribution
- we use the average level of I and the rates of interaction and cooperation as summary statistics to describe the unique distribution
- statistical tests check whether the summary statistics describe the stationary state
- additionally: check for the most promising conditions that lead to the evolution of moral sentiments

21 Supplementary: Convergence Statistics
- Gelman, A. and D. B. Rubin (1992). Inference from Iterative Simulation Using Multiple Sequences. Statistical Science 7(4).
- Brooks, S. P. and A. Gelman (1998). General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics 7(4).
- we generate $m > 1$ chains of a simulation with $n$ time steps: $(x_{11}, x_{12}, \dots, x_{1n}), (x_{21}, x_{22}, \dots, x_{2n}), \dots, (x_{m1}, x_{m2}, \dots, x_{mn})$
\[ \hat{R}_I = \frac{\text{length of pooled-chains interval}}{\text{mean length of the within-chain intervals}} \]
\[ \hat{R}_s = \frac{\frac{1}{mn - 1} \sum_{j=1}^{m} \sum_{i=1}^{n} \lvert x_{ji} - \bar{x} \rvert^{s}}{\frac{1}{m(n - 1)} \sum_{j=1}^{m} \sum_{i=1}^{n} \lvert x_{ji} - \bar{x}_j \rvert^{s}} \]
- iterated graphical approach: sub-chains $(x_{j1}, \dots, x_{j(2kb)})$, with $b$ being a batch length and $k = 1, \dots, n/b$
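The moment-based statistic can be computed directly from the chains. The pure-Python sketch below follows the ratio of pooled to within-chain absolute moments of order s; the function name and the input layout (a list of equally long lists) are assumptions of this example, and it does not cover the interval-based statistic or the batching scheme.

```python
def rhat_s(chains, s=2):
    """Ratio of the pooled s-th absolute central moment to the mean within-chain one.

    chains : list of m chains, each a list of n numbers (m >= 2, n >= 2)
    """
    m = len(chains)
    n = len(chains[0])

    pooled_mean = sum(sum(chain) for chain in chains) / (m * n)
    pooled = sum(
        abs(x - pooled_mean) ** s for chain in chains for x in chain
    ) / (m * n - 1)

    within = 0.0
    for chain in chains:
        chain_mean = sum(chain) / n
        within += sum(abs(x - chain_mean) ** s for x in chain)
    within /= m * (n - 1)

    return pooled / within


if __name__ == "__main__":
    # Two chains centred far apart give a ratio well above 1.
    print(rhat_s([[0, 1, 2], [10, 11, 12]]))
```

Values close to 1 indicate that the pooled spread is no larger than the spread inside single chains, which is what one expects once the chains have reached the same distribution.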

22 Supplementary: Example
[Figure: average impulsiveness over generations for five settings of ticks per generation, with convergence statistics per panel: R̂_c = 1.91, 1.18, 1.73, 1.34 and one further value; R̂_l = 2.12, 1.35, 1.73, 1.36, 1.06; R̂_3 with four further values and 1.04. A second plot shows the statistics R̂_c, R̂_l and R̂_3 against the iteration number.]

23 Supplementary: The Wald-Wolfowitz Test
- Wald, A. and J. Wolfowitz (1940). On a test whether two samples are from the same population. Annals of Mathematical Statistics 11(2).
- Grazzini, J. (2012). Analysis of the emergent properties: Stationarity and ergodicity. Journal of Artificial Societies and Social Simulation 15(2), 7.
- check whether two samples $X$ and $Y$, with $n$ and $m$ observations respectively, are from the same population, i.e. whether the distributions of the two samples are identical
- the two samples are pooled and arranged in ascending order as $Z = (z_1, z_2, \dots, z_{n+m})$, where $z_1 < z_2 < \dots < z_{n+m}$
- sequence $V$ of zeros and ones: replace each element $z_i$ of $Z$ by 0 if $z_i$ is an element of $X$ and by 1 if $z_i$ is an element of $Y$
- the statistic $U(X, Y)$ of two samples $X$ and $Y$ is the number of runs in the corresponding sequence $V$
- example: given $X = \{5, 2.2, 4.5, 1\}$ and $Y = \{2, 4.3, 2.5, 1.4, 3\}$, $V = (0, 1, 1, 0, 1, 1, 1, 0, 0)$ and $U(X, Y) = 5$
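The construction of V and the run count can be reproduced with a few lines of Python. The function name is hypothetical, and the sketch assumes that no value occurs in both samples, in line with the strict ordering of Z; it computes only the statistic, not its p-value.

```python
def runs_statistic(x, y):
    """Return the label sequence V and the number of runs U(X, Y).

    Elements of x are labelled 0 and elements of y are labelled 1.
    """
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    labels = [label for _, label in pooled]
    if not labels:
        return labels, 0
    # A new run starts wherever two neighbouring labels differ.
    runs = 1 + sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)
    return labels, runs


if __name__ == "__main__":
    X = [5, 2.2, 4.5, 1]
    Y = [2, 4.3, 2.5, 1.4, 3]
    print(runs_statistic(X, Y))  # ([0, 1, 1, 0, 1, 1, 1, 0, 0], 5)
```

Few runs mean that the two samples occupy separate stretches of the pooled ordering, which speaks against a common distribution.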

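24 Supplementary: Example
- stationarity: check whether sections of the mean distribution belong to the same distribution
- ergodicity: check whether m chains of a simulation belong to the same distribution
[Table: for batch lengths b = 10 and b = 20 and each ticks-per-generation (tpg) setting, the columns u_s, p_s (stationarity) and u_e, p_e (ergodicity); the cell values are not recoverable.]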
