Concurrency models and Modern Processors

Randall Stephens
5 years ago

Introduction
The classical model of concurrency is the interleaving model. It corresponds to a memory model called Sequential Consistency (SC). Modern processors do not implement SC but so-called relaxed memory models, such as Total Store Order (TSO).

Memory models
Sequential Consistency (SC): all memory accesses are immediately visible to all processors.
Total Store Order (TSO): all write operations go through a buffer. Each processor reads the most recent value for the address in its own buffer, if there is one; otherwise it reads the value held in memory.
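The TSO rule above can be made concrete with a toy machine. This is only a minimal sketch under simplifying assumptions: the class name `TSOMachine` and its methods are hypothetical, and commits are triggered by hand rather than by hardware.

```python
from collections import deque


class TSOMachine:
    """Toy TSO machine: one FIFO store buffer per processor, one shared memory."""

    def __init__(self, n_procs, init):
        self.memory = dict(init)
        self.buffers = [deque() for _ in range(n_procs)]

    def store(self, p, addr, value):
        # A store only enters the issuing processor's buffer.
        self.buffers[p].append((addr, value))

    def load(self, p, addr):
        # The newest buffered write to addr wins; otherwise read shared memory.
        for a, v in reversed(self.buffers[p]):
            if a == addr:
                return v
        return self.memory[addr]

    def commit(self, p):
        # Move the oldest buffered store of processor p to shared memory.
        addr, value = self.buffers[p].popleft()
        self.memory[addr] = value


m = TSOMachine(2, {"x": 0, "y": 0})
m.store(0, "x", 1)
own_view = m.load(0, "x")      # processor 0 sees its own buffered write
other_view = m.load(1, "x")    # processor 1 still sees the old memory value
m.commit(0)
after_commit = m.load(1, "x")  # visible to everyone once committed
```

Under SC the two views could never differ, since every store would go straight to memory.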

Memory models (continued)
x86-TSO
  Intended for the programmer
  Based on TSO
  Extended with the concept of a global lock
  Compatible with the tests found in processor documentation
  Does not aim at closely modeling the internal structure of processors

Memory models (continued 2)
Examples that can be fully executed under TSO (and x86-TSO), but not under SC
Initially: x = y = 0
  Processor 1                Processor 2
  store(p1, x, 1)   (s1)     store(p2, y, 1)   (s2)
  load(p1, y, 0)    (l1)     load(p2, x, 0)    (l2)
store(p, m, v): p performs [m] := v
load(p, m, v): p proceeds only if [m] == v
Possible interleavings: under SC, every interleaving blocks at a load, for example s1, l1, s2, after which l2 blocks, or s1, s2, after which l1 blocks. Under TSO, all four operations can execute, for example s1, s2, l1, l2 with neither store committed yet.
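The claim for this first example can be checked by brute force. The sketch below is an illustration, not part of the original slides: the function `outcomes` and the register names `r1`, `r2` are hypothetical, and loads record the value they read instead of blocking, so the slide's run completes exactly when both registers end up 0.

```python
def outcomes(program, use_buffers):
    """Explore every schedule; return the set of (r1, r2) values loaded."""
    results = set()
    n = len(program)

    def replace(tup, i, value):
        return tup[:i] + (value,) + tup[i + 1:]

    def step(pcs, bufs, mem, regs):
        finished = True
        for p in range(n):
            if bufs[p]:
                # Commit the oldest buffered store of processor p.
                finished = False
                addr, val = bufs[p][0]
                step(pcs, replace(bufs, p, bufs[p][1:]), {**mem, addr: val}, regs)
            if pcs[p] < len(program[p]):
                finished = False
                kind, addr, arg = program[p][pcs[p]]
                nxt = replace(pcs, p, pcs[p] + 1)
                if kind == "store" and use_buffers:
                    # TSO: the store goes to the local FIFO buffer.
                    step(nxt, replace(bufs, p, bufs[p] + ((addr, arg),)), mem, regs)
                elif kind == "store":
                    # SC: the store is immediately visible in memory.
                    step(nxt, bufs, {**mem, addr: arg}, regs)
                else:
                    # Load: newest local buffered value, else memory.
                    hits = [v for (a, v) in bufs[p] if a == addr]
                    step(nxt, bufs, mem, {**regs, arg: hits[-1] if hits else mem[addr]})
        if finished:
            results.add((regs["r1"], regs["r2"]))

    step((0,) * n, ((),) * n, {"x": 0, "y": 0}, {})
    return results


example = [
    [("store", "x", 1), ("load", "y", "r1")],  # Processor 1: s1, l1
    [("store", "y", 1), ("load", "x", "r2")],  # Processor 2: s2, l2
]
sc = outcomes(example, use_buffers=False)
tso = outcomes(example, use_buffers=True)
```

In this sketch `(0, 0)` appears in `tso` but not in `sc`, which is the behaviour the slide describes.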

Memory models (continued 3)
Examples that can be fully executed under TSO (and x86-TSO), but not under SC
Initially: x = y = 0
  Processor 1                Processor 2
  store(p1, x, 1)   (s1)     store(p2, y, 1)   (s2)
  load(p1, x, 1)    (l1)     load(p2, y, 1)    (l3)
  load(p1, y, 0)    (l2)     load(p2, x, 0)    (l4)
Possible interleavings: under SC, every interleaving blocks at l2 or at l4, because whichever store reaches memory first is visible to the other processor's final load. Under TSO, all six operations can execute, for example s1, l1, l2, s2, l3, l4 with neither store committed yet.

SC: Formal definition
Order relations:
  program order ($<_p$): partial, per-processor order
  memory order ($<_m$): global
To correspond to the SC model, an execution must satisfy the following condition:
  $\forall\, op_i, op_j:\ op_i <_p op_j \Rightarrow op_i <_m op_j$
where $op_x$ represents a store or load operation. The result of a load is the one compatible with the memory order.

SC (continued)
Operational definition
[Figure: processors P1, P2, ..., Pn each send their loads and stores through a switch to a single-port memory.]

SC (continued 2)
Memory orders, first example
Initially: x = y = 0
  Processor 1                Processor 2
  store(p1, x, 1)   (s1)     store(p2, y, 1)   (s2)
  load(p1, y, 0)    (l1)     load(p2, x, 0)    (l2)
where $s_1 <_p l_1$ and $s_2 <_p l_2$.
The possible memory orders are all interleavings satisfying the conditions $s_1 <_m l_1$ and $s_2 <_m l_2$.
Example: $s_1 <_m l_1 <_m s_2 <_m l_2$. In this order l2 reads x = 1, so load(p2, x, 0) cannot complete.
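The SC condition is small enough to enumerate directly. The following sketch (the helper names `respects` and `load_values` are made up for illustration) lists every memory order that contains program order and computes what each load would read.

```python
from itertools import permutations

# name -> (processor, kind, address); every store in this example writes 1.
ops = {
    "s1": ("P1", "store", "x"), "l1": ("P1", "load", "y"),
    "s2": ("P2", "store", "y"), "l2": ("P2", "load", "x"),
}
program_order = [("s1", "l1"), ("s2", "l2")]


def respects(order, pairs):
    """True if every program-order pair appears in the same order in `order`."""
    pos = {name: i for i, name in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in pairs)


def load_values(order):
    """Value read by each load when operations hit memory in `order`."""
    mem = {"x": 0, "y": 0}
    seen = {}
    for name in order:
        _, kind, addr = ops[name]
        if kind == "store":
            mem[addr] = 1
        else:
            seen[name] = mem[addr]
    return seen


sc_orders = [o for o in permutations(ops) if respects(o, program_order)]
results = {o: load_values(o) for o in sc_orders}
both_zero = [o for o, v in results.items() if v == {"l1": 0, "l2": 0}]
```

`both_zero` comes out empty: no SC memory order lets both loads read 0, so at least one of the slide's loads always blocks.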

TSO: Formal definition
To correspond to the TSO model, an execution must satisfy the following conditions:
  1. $\forall\, l_a, l_b:\ l_a <_p l_b \Rightarrow l_a <_m l_b$
  2. $\forall\, s_a, s_b:\ s_a <_p s_b \Rightarrow s_a <_m s_b$
  3. $\forall\, l, s:\ l <_p s \Rightarrow l <_m s$
  4. $val(l_a) = val(\max_{<_m} \{\, s_a \mid s_a <_m l_a \lor s_a <_p l_a \,\})$
In condition 4, $s_a$ ranges over the stores to the location read by $l_a$.
Note that stores can be delayed, but loads have access to the latest locally written value.
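The four conditions can be turned into a small checker for a candidate memory order. This is a sketch only: the function name `tso_ok` and the tuple layout `(name, kind, address, value)` are assumptions made for the illustration, and a load with no visible store is taken to read the initial value.

```python
def tso_ok(order, programs, init):
    """Check a memory order (list of op names) against the TSO conditions.

    programs: one list per processor of (name, kind, addr, value) in program order.
    """
    pos = {name: i for i, name in enumerate(order)}
    info = {n: (k, a, v) for prog in programs for (n, k, a, v) in prog}

    # Conditions 1-3: program order is kept, except for store -> load pairs.
    for prog in programs:
        for i, (n1, k1, _, _) in enumerate(prog):
            for (n2, k2, _, _) in prog[i + 1:]:
                relaxed = (k1 == "store" and k2 == "load")
                if not relaxed and pos[n1] > pos[n2]:
                    return False

    # Condition 4: a load returns the <_m-latest store to its address among
    # those before it in memory order or before it in program order.
    for prog in programs:
        for i, (n, k, a, v) in enumerate(prog):
            if k != "load":
                continue
            local = {m for (m, mk, ma, _) in prog[:i] if mk == "store" and ma == a}
            visible = [s for s, (sk, sa, _) in info.items()
                       if sk == "store" and sa == a and (pos[s] < pos[n] or s in local)]
            expected = info[max(visible, key=pos.get)][2] if visible else init[a]
            if v != expected:
                return False
    return True


first_example = [
    [("s1", "store", "x", 1), ("l1", "load", "y", 0)],
    [("s2", "store", "y", 1), ("l2", "load", "x", 0)],
]
init = {"x": 0, "y": 0}
delayed_stores = tso_ok(["l1", "l2", "s1", "s2"], first_example, init)
sc_like_order = tso_ok(["s1", "l1", "s2", "l2"], first_example, init)
```

The order with both stores delayed is accepted, while the SC-style order is rejected because l2 would have to read 1.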

TSO (continued)
Operational definition
[Figure: each processor P1, P2, ..., Pn sends its stores through its own FIFO store buffer; loads and buffered stores reach a single-port memory through a switch.]
Transferring a store from a buffer to main memory is called a commit.

TSO (continued 2)
Memory orders, first example
Initially: x = y = 0
  Processor 1                Processor 2
  store(p1, x, 1)   (s1)     store(p2, y, 1)   (s2)
  load(p1, y, 0)    (l1)     load(p2, x, 0)    (l2)
where $s_1 <_p l_1$ and $s_2 <_p l_2$.
The compatible memory orders are the interleavings of s1, l1, s2 and l2 in which both loads read 0, that is, those with $l_1 <_m s_2$ and $l_2 <_m s_1$.
Examples: $l_1 <_m s_2 <_m l_2 <_m s_1$ or $l_1 <_m l_2 <_m s_1 <_m s_2$

TSO (continued 3)
Memory orders, second example
Initially: x = y = 0
  Processor 1                Processor 2
  store(p1, x, 1)   (s1)     store(p2, y, 1)   (s2)
  load(p1, x, 1)    (l1)     load(p2, y, 1)    (l3)
  load(p1, y, 0)    (l2)     load(p2, x, 0)    (l4)
where $s_1 <_p l_1$, $l_1 <_p l_2$, $s_2 <_p l_3$ and $l_3 <_p l_4$.
The compatible memory orders are all interleavings of s1, l1, l2, s2, l3 and l4 satisfying $l_1 <_m l_2$ and $l_3 <_m l_4$ in which l2 and l4 read 0, that is, with $l_2 <_m s_2$ and $l_4 <_m s_1$.
Example: $l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1 <_m s_2$

TSO (continued 4)
Memory orders, second example - Details
  Sequence of operations       Memory order
  store(p1, x, 1)   (s1)       -
  load(p1, x, 1)    (l1)       $l_1$
  load(p1, y, 0)    (l2)       $l_1 <_m l_2$
  store(p2, y, 1)   (s2)       $l_1 <_m l_2$
  load(p2, y, 1)    (l3)       $l_1 <_m l_2 <_m l_3$
  load(p2, x, 0)    (l4)       $l_1 <_m l_2 <_m l_3 <_m l_4$
  (commit(s1))                 $l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1$
  (commit(s2))                 $l_1 <_m l_2 <_m l_3 <_m l_4 <_m s_1 <_m s_2$
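The trace above can be replayed step by step. In this sketch (the function names are hypothetical) a load enters the memory order when it executes and a store enters it when it is committed, which is the convention the table appears to follow; the `assert` inside `load` plays the role of the blocking load.

```python
from collections import deque

memory = {"x": 0, "y": 0}
buffers = {"p1": deque(), "p2": deque()}
memory_order = []


def store(p, name, addr, value):
    # Buffered only: the store is not yet part of the memory order.
    buffers[p].append((name, addr, value))


def load(p, name, addr, expected):
    # Newest local buffered value for addr, else the value in memory.
    hits = [v for (_, a, v) in buffers[p] if a == addr]
    value = hits[-1] if hits else memory[addr]
    assert value == expected, f"{name} blocks"
    memory_order.append(name)


def commit(p):
    # The oldest buffered store of p reaches memory and the memory order.
    name, addr, value = buffers[p].popleft()
    memory[addr] = value
    memory_order.append(name)


store("p1", "s1", "x", 1)
load("p1", "l1", "x", 1)   # served from p1's own buffer
load("p1", "l2", "y", 0)
store("p2", "s2", "y", 1)
load("p2", "l3", "y", 1)   # served from p2's own buffer
load("p2", "l4", "x", 0)   # s1 is still buffered, so memory holds 0
commit("p1")
commit("p2")
```

After the replay, `memory_order` matches the last row of the table.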

x86-TSO: Formal definition
The order constraints on the loads and stores are the same as those for TSO.
Extended operations:
  mfence(p): blocks processor p until its buffer is empty.
  lock(p): if the lock is not already held by another processor, p takes the lock and obtains exclusive access to the global memory; the other processors are not allowed to execute commit or load operations.
  unlock(p): p flushes its buffer to global memory and releases the lock.
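To illustrate the effect of mfence, here is a brute-force sketch over the first example with a fence placed between each store and the following load. It models only mfence, not lock or unlock; the function `reachable_outcomes` and the registers `r1`, `r2` are hypothetical names, and loads record the value read instead of blocking.

```python
def reachable_outcomes(program):
    """Explore every TSO schedule; return the set of (r1, r2) values loaded."""
    results = set()
    n = len(program)

    def run(pcs, bufs, mem, regs):
        progressed = False
        for p in range(n):
            if bufs[p]:
                # Commit the oldest buffered store of processor p.
                progressed = True
                addr, val = bufs[p][0]
                nb = bufs[:p] + (bufs[p][1:],) + bufs[p + 1:]
                run(pcs, nb, {**mem, addr: val}, regs)
            if pcs[p] == len(program[p]):
                continue
            op = program[p][pcs[p]]
            if op[0] == "mfence" and bufs[p]:
                continue  # blocked until p's buffer is empty
            progressed = True
            npcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
            if op[0] == "store":
                nb = bufs[:p] + (bufs[p] + ((op[1], op[2]),),) + bufs[p + 1:]
                run(npcs, nb, mem, regs)
            elif op[0] == "load":
                hits = [v for (a, v) in bufs[p] if a == op[1]]
                value = hits[-1] if hits else mem[op[1]]
                run(npcs, bufs, mem, {**regs, op[2]: value})
            else:
                run(npcs, bufs, mem, regs)  # mfence with an empty buffer
        if not progressed:
            results.add((regs["r1"], regs["r2"]))

    run((0,) * n, ((),) * n, {"x": 0, "y": 0}, {})
    return results


plain = [
    [("store", "x", 1), ("load", "y", "r1")],
    [("store", "y", 1), ("load", "x", "r2")],
]
fenced = [
    [("store", "x", 1), ("mfence",), ("load", "y", "r1")],
    [("store", "y", 1), ("mfence",), ("load", "x", "r2")],
]
without_fence = reachable_outcomes(plain)
with_fence = reachable_outcomes(fenced)
```

In this model the outcome `(0, 0)` is reachable without the fences and disappears once they are added, since each store must be committed before the load that follows it.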

x86-TSO: Operational view
[Figure: as for TSO, each processor P1, ..., Pn issues loads and sends its stores through a FIFO store buffer to a switch and a single-port memory; a global lock is added.]

Other memory models
  Partial Store Order (PSO)
  Relaxed Memory Order (RMO)
