aboutsummaryrefslogtreecommitdiffhomepage
path: root/notes/memory_access_ordering.txt
blob: 99decfa8b13ba771d1ca9e78b59d500af71bb8de (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129

Memory model
============
Require explicit ordering?

 func f()
 {
     threaded ref var int a = ...
     threaded ref var int b = ...
     a data = 10 label l1
     b data = 20 happens-after l1
 }

How about operations in loops?
I.e. should a line always happens-after itself?

 func f()
 {
     threaded ref var int a = ...
     while .true {
         a data = a + 1 happensafter self
     }
 }

How about function calls (also across modules)?
(decide a syntax for this)

 func g(threaded ref int a)
     threaded a call-ordered   # "synced" at function entry and exit

 func g(threaded ref int a)
     threaded a unordered


How about compound objects?

- TODO
- Some systems might handle int64 with two int32 accesses.
    => Two threads reading/writing int64 values could lead to
       "half-written" values being read/written.
        => threaded int64 can't have constraints.
        => threaded int64-backed enums cannot be closed enums.
           (perhaps it should be required that all int64-backed
            enums are not closed. This is tricky with private enums!)
        These restrictions apply to all platforms, since SLUL
        guarantees identical behavior on all platforms.
        (The other option would be trap instead.)
- Function pointers could be larger than the platform's word size.
    => Accessing a such pointer must cause an exception.
        - The exception can happen can happen on read, write
          or call. As usual with SLUL exceptions, it can be
          asynchronous/deferred.


Read and write operations:

     threaded ref var int a = ...
     threaded ref var int b = ...
     threaded ref var int c = ...
     threaded ref var int d = ...
     a unordered = (b ordered) + (c unordered label c1) + (d happensafter c1)
     # the read of "b" is ordered in respect to all other (read and/or write) operations to any variable
     # the read of "c" is not ordered in respect of any operation
     # the read of "d" is ordered after the read of c
     # the write to "a" is unordered

happensbefore vs happensafter?
- happensafter is more intuitive in my opinion
- language specs generally use "happens-before". is there a reason for this?


Forbid circular/impossible orderings:
- TODO Are there other impossible orderings than circular ones?

     threaded ref var int a = ...
     threaded ref readonly int b = ...
     threaded ref readonly int c = ...
     a unordered = (b happensafter c1 label b1) + (c happensafter b1 label c1)


Atomics:
= that the data is read/written in one go

     threaded ref var int a = ...
     a += 1 atomic
     # How to specify this for more complex operations
     # such as compare and swap?
     # perhaps something like this: (assign n to a if equal to c, and return old value)
     #o = (a ?= n if c)
     o = compare_and_swap(a, c, n)
     # other possibly useful operations:
     # - compare_and_set (without using the result)
     # - exchange
     # TODO C++ has weak and strong compare-and-swap. what are those?

RCU stuff?


Per-(arena,arena) memory ordering
---------------------------------
It would be cool if memory ordering does not have to be a "global" property
of all accessible memory.

The following operations are the minimal needed:

    # The barrier can only be initiated via the arenas created by the
    # ArenaBuilder, and any sub-arenas.
    #
    # The local side of the barrier operation is always the initiating thread,
    # for two reasons:
    #   - Otherwise, there could be race conditions
    #   - Also, I don't think any system supports "remote" barriers.
    #
    # If other types of arenas than plain memory arenas are added
    # (e.g. some kind of mmap'ed arenas), then they may or may not be
    # supported by this function.
    func ArenaBuilder.prepare_barrier(
            FenceType type,     # Type of barrier
            ?ref List<arena> local, # List of local sides. none = all (memory) arenas reachable in the current call chain.
            ?ref List<threaded arena> remote, # List of remote sides. none = all (memory) arenas.
            ) -> ref MemBarrier
            lifetime remote >= return

    # The barrier must be initiated with a matching arena.
    func ref MemBarrier.barrier(arena)

But the API copuld be greatly simplified:

    # Synchronizes memory in the given arena with other threads.
    func cpu_membarrier_XYZ(threaded arena)