Loopcall
========

    s = get_a
    for i in l
        do_stuff s i
    end

becomes (in pseudocode):

    s = get_a
    do_stuff__loop LOOP_ON_PARAM_2 s l
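
As a rough illustration of the transformation above, here is a minimal C
sketch. The types `state_t` and the helper names `sum_with_plain_loop` and
`sum_with_loopcall` are assumptions made for this example only, and the
`LOOP_ON_PARAM_2` selector is left out because the looped parameter is
hard-coded in the C signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state type, for illustration only. */
typedef struct { long total; } state_t;

/* Normal version: handles a single item per call. */
static void do_stuff(state_t *s, int i)
{
    s->total += i;
}

/* Loop version: the caller hands over the whole list and the callee
   runs the loop itself, so per-call setup can be hoisted out. */
static void do_stuff__loop(state_t *s, const int *l, size_t len)
{
    assert(s != NULL && (l != NULL || len == 0));
    long total = s->total;          /* loaded once instead of once per item */
    for (size_t k = 0; k < len; k++) {
        total += l[k];
    }
    s->total = total;
}

/* What the programmer wrote: one call per item. */
static long sum_with_plain_loop(const int *l, size_t len)
{
    state_t s = { 0 };
    for (size_t k = 0; k < len; k++) {
        do_stuff(&s, l[k]);
    }
    return s.total;
}

/* What the compiler could emit instead: a single loopcall. */
static long sum_with_loopcall(const int *l, size_t len)
{
    state_t s = { 0 };
    do_stuff__loop(&s, l, len);
    return s.total;
}
```

Both helpers should produce the same result; only the number of calls differs.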


Avoiding duplication
--------------------

Having both a normal and an __loop version of every function is
inefficient, since the code is duplicated.
Also, it would mean that you have to decide beforehand whether
a function should be loopable or not.

Maybe it can be detected dynamically?

    s = get_a
    for i in l
        __mark_as_loopcall 2 l
        do_stuff s i
        if __has_looped break
    end

Another way:

    s = get_a
    for i in l
        __mark_as_loopcall 2 l __loopend
        do_stuff s i
    end
    __loopend:

A third way:

    s = get_a
    __prepare_loopcall 2 l __loopend
    for i in l
        __set_loopcall_flag  # sets some register or flag.
        do_stuff s i
    end
    __loopend:

Which register/flag to set?

* Maybe some special-purpose register can be set to an unusual value?
    - But calls must still work if the callee is not aware (or
      does not benefit from being aware) of this calling convention.
* Maybe a non-nullable parameter (e.g. the `this`-parameter) can be
  set to an unusual value?
    - In this case, utility (class-less) functions could have an additional
      parameter. Maybe passed the same way as the `this`-parameter for
      methods (functions on classes).
    - The real `this`-value can be passed in another register
      (e.g. the last parameter)
    - Unfortunately, this method requires that all functions (callees)
      are aware of this, and handle it properly.
* Setting some flag in the TLS area could also work.
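
The TLS variant, combined with the first `__mark_as_loopcall` /
`__has_looped` idea, might look roughly like the C sketch below. Everything
here is an assumption for illustration: the `loopcall_state_t` layout, the
function names, and the use of C11 `_Thread_local` as the "TLS area". The
looped parameter number is omitted because it is fixed in the C signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread loopcall state (the "flag in the TLS area"). */
typedef struct {
    const int *list;       /* pending list, or NULL if no loopcall is marked */
    size_t     len;
    bool       has_looped; /* set by an aware callee that ran the whole loop */
} loopcall_state_t;

static _Thread_local loopcall_state_t lc_state;

/* Caller side: announce that the next call is part of a loop over `list`. */
static void mark_as_loopcall(const int *list, size_t len)
{
    assert(list != NULL);
    lc_state.list = list;
    lc_state.len = len;
    lc_state.has_looped = false;
}

/* Aware callee: if a loopcall is marked, process all items at once. */
static void do_stuff_aware(long *s, int i)
{
    if (lc_state.list != NULL) {
        for (size_t k = 0; k < lc_state.len; k++) {
            *s += lc_state.list[k];
        }
        lc_state.list = NULL;       /* consume the mark */
        lc_state.has_looped = true;
        return;
    }
    *s += i;                        /* ordinary single-item call */
}

/* Unaware callee: never looks at the TLS state. */
static void do_stuff_unaware(long *s, int i)
{
    *s += i;
}

typedef void (*do_stuff_fn)(long *s, int i);

/* The loop as the compiler might emit it; works with either callee. */
static long run_loop(do_stuff_fn f, const int *l, size_t len)
{
    long s = 0;
    for (size_t k = 0; k < len; k++) {
        mark_as_loopcall(l, len);
        f(&s, l[k]);
        if (lc_state.has_looped) break;
    }
    lc_state.list = NULL;           /* do not leak the mark to later calls */
    lc_state.has_looped = false;
    return s;
}
```

Note that this sketch does not address a mark leaking from an unaware callee
into an aware sub-callee; the unaware callee here simply calls nothing.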

If the callee does NOT support the calling convention, then any sub-callees
(which might support it) must be able to detect this.


Is there a portable and safe solution?

* Return address / link register might be either protected (for increased
  memory safety) or optimized (i.e. fast only as long as it is not "touched").
    - Maybe some "signature" can be stored at the return instruction,
      some kind of no-op instruction that otherwise shouldn't appear in
      normal code.
* Carry flag might be set by arithmetic operations.
  (Also, not all CPU architectures have a carry flag, e.g. RISC-V doesn't.)
* TLS area might be slow to read from. And it's generally a memory access
  and not a register access.
* General-purpose registers can have any value set by unaware callers.
* Can't safely use "magic" value of general purpose registers, because that
  might introduce a side-channel if the magic value appears in cryptographic
  keys or similar.
    - Unless callers can be expected to zero out any sensitive things from
      registers.
    - Maybe a pair of "distant" registers could be used, that are unlikely
      to both be used for sensitive information.
* Cannot use registers or flags that by ABI must have a specific value/state.
  E.g. cannot use the direction flag (DF) on x86.

IF it is possible to tell SLUL functions apart from other functions, then
it would be possible to jump to N machine-words before the normal address
when the caller is loopcall-aware.

Non-portable ways:

* There are some x86 (non-portable) flags to investigate:
    - PF "parity flag"
    - AF "auxiliary carry flag"
    - TF "trap flag"
    - SF "sign flag"

Most realistic solution: Additional function with additional parameters
-----------------------------------------------------------------------

On CPU architectures with many registers:

    s = get_a
    for i in l
        do_stuff__loop s i 2 l __loopend
    end
    __loopend:

On CPU architectures with few registers (or maybe always?):

    s = get_a
    __lc = __prepare_loopcall 2 l __loopend
    for i in l
        do_stuff__loop s i __lc
    end
    __loopend:
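
A minimal C sketch of the `__lc` variant follows. The `loopcall_t` layout and
the helper `run` are hypothetical. C cannot pass a jump target such as
`__loopend` to a callee, so the sketch assumes a `done` field that the caller
checks after each call instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loopcall descriptor, built once before the loop. */
typedef struct {
    int        iter_param;  /* which parameter is the loop variable (1-based) */
    const int *list;        /* the list being iterated */
    size_t     len;
    int        done;        /* set by the callee once it ran the whole loop */
} loopcall_t;

static loopcall_t prepare_loopcall(int iter_param, const int *list, size_t len)
{
    assert(iter_param >= 1);
    loopcall_t lc = { iter_param, list, len, 0 };
    return lc;
}

/* Loop-aware function; `lc` may be NULL for ordinary single-item calls. */
static void do_stuff__loop(long *s, int i, loopcall_t *lc)
{
    if (lc != NULL && lc->iter_param == 2) {
        for (size_t k = 0; k < lc->len; k++) {
            *s += lc->list[k];
        }
        lc->done = 1;
        return;
    }
    *s += i;
}

/* The loop as it might be emitted; use_loopcall == 0 forces plain calls. */
static long run(const int *l, size_t len, int use_loopcall)
{
    long s = 0;
    loopcall_t lc = prepare_loopcall(2, l, len);
    for (size_t k = 0; k < len; k++) {
        do_stuff__loop(&s, l[k], use_loopcall ? &lc : NULL);
        if (lc.done) break;         /* stands in for the jump to __loopend */
    }
    return s;
}
```

Passing a single descriptor pointer costs one extra argument register, which
is why this form may suit architectures with few registers.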

Better yet, for supporting complex loops also:

    s = get_a
    for i in l
        do_stuff__loop  s  i  l
        # this one would need to store a fat pointer somewhere
        # so the holders of `item` can get a full chunk.
        item = get_stuff__loop  s  i  l
        store_stuff__loop item  item  l
    end

Alternative solution: Do something in the iterators?
----------------------------------------------------

    s = get_a
    __loop with
        iterator = l
        func     = do_stuff__loop
        num_args = 2
        iter_arg = 2
        static_args = [s]
    end

But this will not work with non-trivial loop bodies.
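
For a trivial loop body, the descriptor above could be modelled roughly as in
the following C sketch. The `loop_desc_t` layout, the `void *` argument
passing, `MAX_ARGS`, and the body function `do_stuff_body` are all assumptions
made for this example, not a proposed ABI.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 4  /* arbitrary limit for this sketch */

/* A loop body that receives all its arguments as an array of pointers. */
typedef void (*loop_fn)(void **args);

/* Hypothetical descriptor mirroring the fields of the `__loop with` block. */
typedef struct {
    void  **items;                      /* iterator = l */
    size_t  len;
    loop_fn func;                       /* func */
    size_t  num_args;                   /* num_args */
    size_t  iter_arg;                   /* iter_arg (1-based) */
    void   *static_args[MAX_ARGS - 1];  /* static_args, in parameter order */
} loop_desc_t;

/* Generic driver: fills in the static arguments once, then calls `func`
   once per item with the iterated argument replaced. */
static void run_loop(const loop_desc_t *d)
{
    void *args[MAX_ARGS];
    assert(d->num_args >= 1 && d->num_args <= MAX_ARGS);
    assert(d->iter_arg >= 1 && d->iter_arg <= d->num_args);

    size_t next_static = 0;
    for (size_t a = 0; a < d->num_args; a++) {
        if (a != d->iter_arg - 1) {
            args[a] = d->static_args[next_static++];
        }
    }
    for (size_t k = 0; k < d->len; k++) {
        args[d->iter_arg - 1] = d->items[k];
        d->func(args);
    }
}

/* Example body: args[0] is the state `s`, args[1] is the current item. */
static void do_stuff_body(void **args)
{
    long *s = args[0];
    const int *i = args[1];
    *s += *i;
}

static long demo(void)
{
    int a = 1, b = 2, c = 3;
    void *l[] = { &a, &b, &c };
    long s = 0;
    loop_desc_t d = { l, 3, do_stuff_body, 2, 2, { &s } };
    run_loop(&d);
    return s;
}
```

The sketch also shows the limitation: the driver can only call one function
per item, so anything beyond a single call in the loop body does not fit.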