SLUL2
=====

Making SLUL:
* easier to use
* easier to implement
* more future-proof / more portable

Desirable changes:

* Revise ref
    - forbidding refs to non-struct/non-array types might enable some optimizations
    - *removing* explicit refs would definitely have impacts on usability.
      not sure if good or bad. it makes the proglang more implicit/"magic",
      which can be a bad thing.
* Implicit arenas?
    - and for mutating methods, use the 4 "allocation variants": (inspired by Vale)
        1. placement new        (uses arena only for "indirect"/reference fields)
        2. arena new            (allocate in given arena)
        3. modify self          (in-place modification. Can keep referenced data)
        4. discarding self-mod  (in-place modification. Cannot keep referenced data)
        some of these are applicable to constructors also
* Garbage-collected arenas?
    - I.e. local GC
    - It would remove "anxiety" around memory allocation
    - Downside 1: It requires meta-data, which isn't needed with plain
      arena allocation.
    - Downsides 2..N: The usual downsides of GC
* Revise expr integer types
  What are the use-cases for "non-plain ints"?
    - length type / ssize/usize
    - byte/int16 arrays
    - small/bitsliced fields in structs
    - fixed-size wrapping uints (e.g. for hash functions).
* Move some stuff from hard-coded syntax to code? e.g. like Scheme, REBOL, Nim.
    - might actually work with statements:
        - they can only appear inside function bodies
        - so the toplevels are available, and their types are known.
    - But it would definitely need inlining to work with reasonable speed.
    - Perhaps a bad idea after all?
* Misc syntax stuff:
    - Use tabs instead of spaces?
      But this gets tricky with alignment of e.g. parameters.
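
The "fixed-size wrapping uints" use-case above can be illustrated with a hash function. This is only a sketch in Python (not SLUL): Python integers never overflow, so the wrapping has to be emulated with a mask. The helper name wrapping_mul_u32 is made up for this note; FNV-1a is used just as a well-known example of a hash that depends on wrapping multiplication.

```python
MASK32 = 0xFFFFFFFF  # keeps results inside the unsigned 32-bit range


def wrapping_mul_u32(a: int, b: int) -> int:
    """Multiply and discard the overflow, like a wrapping 32-bit uint would."""
    return (a * b) & MASK32


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash; correct only if the multiplication wraps."""
    h = 0x811C9DC5  # FNV-1a 32-bit offset basis
    for byte in data:
        h ^= byte
        h = wrapping_mul_u32(h, 0x01000193)  # FNV 32-bit prime
    return h


print(hex(fnv1a_32(b"a")))
```

With a plain (non-wrapping, range-checked) integer type, the multiplication would be an overflow error on almost every step, which is why this use-case probably needs its own type or operators.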


Control statements defined in library module headers
----------------------------------------------------

These need some kind of analysis of the IR to check varstates (liveness, etc.).
They also need to handle nested if-elseif-elseif... chains.

So perhaps this is a bad idea?

Example:

    statement "if" Expr cond Statements true_block "else" Statements false_block
    {
        cond
        CONDJUMP FALSE false_block
        true_block
        JUMP end
        false_block
        end
    }
    statement
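
To get a feeling for what the expansion of such a statement definition would produce, here is a rough Python sketch of the lowering. Everything in it is an assumption made for illustration: the instructions are plain strings, the explicit LABEL pseudo-instruction is not part of the example above, and label names are made unique with a counter so that nested if-elseif chains do not clash.

```python
from itertools import count


def lower_if_else(cond, true_block, false_block, labels):
    """Flatten one if/else into a linear list of IR instructions.

    cond, true_block and false_block are lists of already-lowered
    instructions. labels is an iterator of unique numbers.
    """
    n = next(labels)
    false_label = f"false_{n}"
    end_label = f"end_{n}"
    return (
        list(cond)
        + [f"CONDJUMP FALSE {false_label}"]
        + list(true_block)
        + [f"JUMP {end_label}", f"LABEL {false_label}"]
        + list(false_block)
        + [f"LABEL {end_label}"]
    )


# "if a { x } else if b { y } else { z }" is an if/else whose
# false block is itself a lowered if/else.
labels = count()
inner = lower_if_else(["b"], ["y"], ["z"], labels)
outer = lower_if_else(["a"], ["x"], inner, labels)
for instruction in outer:
    print(instruction)
```

The varstate analysis would then have to run on the flattened output, which is exactly the part that makes this approach questionable.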

Super-simple proglang
---------------------

Only two/three kinds of typedefs. Not allowed as anonymous types (or? it is useful for e.g. return values)

    record SomeStruct {
        int field1
        OtherType field2    # <--- compiler chooses whether to put in ref or not
                            # this makes FFI trickier. But non-closed types are always refs.
                            # this also makes lifetimes and aliasing trickier.
                            # maybe "var" should not be allowed to alias?
    }

    enum SomeEnum {
        ...
    }

    # Maybe some kind of sum/union/variant type
    record ExprNode {
        ExprType type
        int line
        int column
        switch type {
        case .unary
        case .binary
            Expr operand_a
            if type == .binary {
                Expr operand_b
            }
        case .call
            Expr func_expr
            int num_args
            # have a built-in list type?
            # and choose the best possible representation?
            # (in this case it's runtime-determined frozen-length, so it could be a pointer to an array. or a full-blown list type)
            int[num_args] args
        }
    }

    # Maybe some kind of constraints
    func process_op(ExprNode<.type in (.unary, .binary)> expr)
    func process_op(ExprNode expr [.unary, .binary])
    func process_op(ExprNode(.unary .binary) expr)
    func process_op(ExprNode<.unary .binary> expr)
    func ExprNode<.unary .binary>.process_op()
    func ExprNode.process_op()
        for (.unary, .binary)
    func ExprNode.process_op()
        with (.unary, .binary)
    func ExprNode.process_op()
        this in (.unary, .binary)
    func ExprNode.process_op()
        given type == .unary or type == .binary
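
Whatever syntax is chosen, the meaning of the constraint is "this function may only see some of the variants". A runtime emulation in Python (not SLUL) could look like the sketch below. In SLUL the check would presumably happen at compile time. The helper require_variant and the return value of process_op are invented for this sketch, and the .call fields are left out to keep it short.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ExprType(Enum):
    UNARY = auto()
    BINARY = auto()
    CALL = auto()


@dataclass
class ExprNode:
    type: ExprType
    line: int
    column: int
    operand_a: Optional["ExprNode"] = None  # used by .unary and .binary
    operand_b: Optional["ExprNode"] = None  # used by .binary only


def require_variant(node: ExprNode, allowed: set) -> None:
    """Runtime stand-in for the proposed compile-time constraint."""
    if node.type not in allowed:
        raise TypeError(f"variant {node.type.name} is not accepted here")


def process_op(expr: ExprNode) -> int:
    """Accepts only .unary/.binary nodes; returns the number of operands."""
    require_variant(expr, {ExprType.UNARY, ExprType.BINARY})
    return 2 if expr.type is ExprType.BINARY else 1


print(process_op(ExprNode(ExprType.BINARY, line=1, column=1)))
```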

Qualifiers for records and enums:

    record Point closed {  # (require a newline here?)
        int x
        int y
        # no more fields can be added. allows some optimizations, such as call-by-value / embedding into structs
    }

    enum SubPixel closed {
        .red
        .green
        .blue
    }

Enums can have a base type and/or integral values also
(this is mainly useful for FFI)

    enum StatusByte closed byte {
        .ready = 10
        .running = 20
        .failure = 90
    }
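
For comparison, roughly the same thing in Python (not SLUL) is shown below. The sketch assumes that "closed" means that unknown values are rejected when converting from the raw FFI value, and that the base type "byte" means 0..255. The function names are made up.

```python
from enum import IntEnum


class StatusByte(IntEnum):
    READY = 10
    RUNNING = 20
    FAILURE = 90


def fits_in_byte(enum_cls) -> bool:
    """Check that every value is representable in the base type (assumed 0..255)."""
    return all(0 <= member.value <= 255 for member in enum_cls)


def status_from_raw(raw: int) -> StatusByte:
    """Convert a raw value from the FFI side; unknown values raise ValueError."""
    return StatusByte(raw)


print(fits_in_byte(StatusByte), status_from_raw(20).name)
```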

Integer / elementary types:

* Perhaps even use variable-size integers?
  The downside is that += 1 etc. might require allocation.

Methods:

* Skip "this". But disallow shadowing.

Type identifiers:

* For consistency, always include the "." in typeidentifiers, even in
  e.g. enum definitions.
* Constructors are maybe not that intuitive (can they be improved?):

    func .new(int a, int b) -> Thing


Avoiding punctuation:

* Can the . in typeidentifiers be skipped?
* Can the () in function calls be skipped?
    - if the function call fits on one line
    - and the parameters are terms (unless a comma is required between them)
    - and the function call is not nested inside
      a function call, field or index expression.
    - related: tuples. but that would be ambiguous if used as function arguments
* Can the () in function declarations be skipped?

    func example
        int a
        int b
        return bool
    {
        if a == b {
            otherfunc a, 123
            return true
        }
        return false
    }

Can refs be avoided?

    # objects:
    # These are always passed by reference.
    # References can be compared with "ref_is" or "is" or a similar operator.
    # The "==" and "!=" operators are not allowed (maybe it should be allowed to implement them? e.g. with a method called "equals"?)
    type Box = object {
        # These are references:
        Item a
        Item b
        # Perhaps allow syntax like this:
        Item a1, b1
        Item a1, Item b1
        # Regarding tuples:
        # I think that maybe they CAN be references if they are too large to pass by value :)
        # - We can require that if the object is mutable, it must also be passed by arena-ref.
        # - Tuples up to some certain size could be embedded / passed by value
        #   (Check the optimal limit. It's at least the size of two pointers, but it could be larger)
        # - Tuples allocated in the *same* arena can just be referenced directly!
        #   (this should be fairly simple and fast to check).
        #   - If each thread uses a contiguous virtual-memory block,
        #     then this would be a trivial range check.
        # - Tuples allocated by the same thread and in SLUL code, can
        #   (as an optional optimization) be referenced if
        #   1) the lifetime allows it (how to check this at runtime?), or
        #   2) the runtime uses garbage collection, and can perform GC in
        #   this case.
        # - Tuples allocated in SLUL code from other threads may or may not
        #   be possible to reference depending on whether the runtime
        #   supports cross-thread GC. For consistency across implementations,
        #   it might be better to just re-allocate/copy in this case.
        # - Other tuples would require a copy. (This is really a requirement
        #   for tuples allocated from C code, unless it uses SLUL's arena
        #   allocation functions in slulrt.)
        #   
        LargeValue large
    }
    # opaque objects:
    # - Like objects, but fields (and layout/size) are inaccessible
    # - Lacks {} and has the layout defined in the impl, just like a function can have its body in the impl
    # - Perhaps it should be forbidden to have non-opaque objects in interfaces? It's generally an anti-pattern.
    type Item = object
    # tuples:
    # - The ABI decides when to pass these by ref or value
    # - Reference comparison operators are not allowed.
    # - The contents can be compared with the "==" and "!=" operators.
    # - Tuples can't be opaque/private.
    type Point = (int x, int y)
    type Point = (int x, y) # perhaps allow this syntax as well (...and multi-line syntax without comma also)
    type LargeValue = ([10000]byte buffer)
    # For type-scoped functions that return an object ("constructors"):
    # - They implicitly take an arena parameter
    # - The returned reference is an arena reference
    func .new() -> Box
    constructor Box.new()   # maybe type-scoped functions should have this syntax?
    # Return values in methods have the same lifetime as the object itself
    # - Should the this parameter be "var"? 
    # - Should the this parameter be "arena"? 
    func Box.get_contents() -> Item
    # Parameters do not implicitly transfer ownership. Inside the callee, the lifetime of "other" ends when the function returns.
    # - Should the parameter be "arena"? 
    func Box.equals(Box other) -> bool
    # Parameters can be marked with "keep" to allow shared ownership
    func Box.set_contents(keep Item contents)
    func Box.set_contents!(keep Item contents)    # perhaps there should be a ! for functions that modify the object?
    func var Box.set_contents(keep Item contents) # or a qualifier like this.
    # Parameters can be passed as "var"
    # - if passed as "keep", we need exclusive access (or the item can be marked as aliased)
    func Box.squeeze_item(keep var Item item)
    func Box.squeeze_item(keep aliased var Item item)
    # To have multiple outputs from a function, use a tuple as the return value:
    # - The ABI decides when to pass these by ref (implicit parameter) or value
    # - Because tuples can't be opaque, the return value could be returned by value.
    # - Because tuples can't be opaque, the *caller* allocates (on stack) if it's not possible to pass by value.
    func Box.get_both_items() -> (Item a, Item b)

    # How should function references work?
    # - What keyword to use when there are no refs?
    # - Most of the time, you want a context-parameter
    # - For non-ref (or slot) types, you may want a (reference, length) to process multiple items at once.
    func Box.process_contents(delegate(Item item) handler)
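
The tuple-passing rules from the Box comments (embed small tuples, reference tuples in the same arena, copy the rest) can be sketched as a small decision function. This is Python, not SLUL, and several things are assumptions: addresses are plain integers, an arena is a single contiguous block, pointers are 8 bytes, and the by-value limit is set to the "at least two pointers" lower bound mentioned in the comments. The GC-based cases are left out.

```python
from dataclasses import dataclass

POINTER_SIZE = 8                    # assumption: 64-bit target
BY_VALUE_LIMIT = 2 * POINTER_SIZE   # assumption: the real optimum may be larger


@dataclass(frozen=True)
class Arena:
    base: int   # first address of the contiguous block
    size: int   # length of the block in bytes

    def contains(self, address: int, length: int) -> bool:
        """The 'trivial range check' for a contiguous block."""
        return self.base <= address and address + length <= self.base + self.size


def passing_strategy(arena: Arena, address: int, size: int) -> str:
    """Decide how a tuple of 'size' bytes stored at 'address' is passed."""
    if size <= BY_VALUE_LIMIT:
        return "by value"
    if arena.contains(address, size):
        return "by reference"
    return "copy"


arena = Arena(base=0x10000, size=0x10000)
print(passing_strategy(arena, 0x10100, 8))       # small tuple, like Point
print(passing_strategy(arena, 0x10100, 10000))   # LargeValue in the same arena
print(passing_strategy(arena, 0x90000, 10000))   # LargeValue from somewhere else
```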

Should the builtin types use TitleCase names also?

    Probably yes, for consistency.

    String
    Byte
    Int16

    Java developers might confuse these with reference types, though :(
    And worse, as a Java developer, you might start using e.g. Byte
    where it should be byte in your Java code. That will usually silently
    compile without any warnings, but can be broken (with == != operators)
    or slow.

    Solution:

    Use different names:
    - byte -> UInt8 or U8
    - int  -> Integer (or even skip this type, and have only fixed-sized Int*/UInt*)

    Regarding the extra finger strain to hold shift and stretching out the
    finger to push the letter button: That could be solved by having the IDE
    auto-capitalize the type if it exists and a type is expected at the given
    location.


Should/can there be an arbitrary-sized integer type?

    E.g. allow integers -16384..16383 to be stored directly, and use a
    reference for larger integers.
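
One possible representation is a 16-bit slot with a tag bit, which gives exactly the range -16384..16383 for the inline case. The Python sketch below is only meant to show the idea. The layout (tag in the lowest bit) is an assumption, and the "reference" is faked with an index into a list instead of a real pointer.

```python
INLINE_MIN, INLINE_MAX = -16384, 16383   # what fits in 15 signed bits


class NumHeap:
    """Stand-in for the memory that large integers would be allocated in."""

    def __init__(self):
        self._big = []

    def encode(self, value: int) -> int:
        """Return a slot: tag bit 0 = inline value, tag bit 1 = reference."""
        if INLINE_MIN <= value <= INLINE_MAX:
            return (value & 0x7FFF) << 1
        self._big.append(value)
        return ((len(self._big) - 1) << 1) | 1

    def decode(self, slot: int) -> int:
        payload = slot >> 1
        if slot & 1:
            return self._big[payload]       # follow the "reference"
        if payload & 0x4000:                # sign bit of the 15-bit value
            return payload - 0x8000
        return payload


heap = NumHeap()
for value in (-16384, -1, 0, 16383, 16384, 10**30):
    slot = heap.encode(value)
    print(value, "inline" if (slot & 1) == 0 else "reference")
```

This also shows the downside mentioned under "Integer / elementary types": incrementing 16383 by 1 turns an inline value into an allocation.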

    What should it be called?

        num
        int
        intn
        integer
        BigNum
        BigInt
        Num
        Int
        IntN
        Integer

    The compiler could optimize it to a more efficient type if the range
    is known!

        num i = get_number()  # since it is immutable, we can infer the type from the return value of get_number()

    Maybe it should be possible to specify a range? What syntax to use?

        var num<0..=10> i = 0
        var num<0 upto 10> i = 0
        var num i [[0 <= value <= 10]] = 0
        var num<0-10> i = 0     # but "-" is also the minus operator :(
        var num<0~10> i = 0
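
If a range is known (declared or inferred), choosing the storage type is simple. Here is a sketch in Python of what the compiler could do. The type names and the order of preference (smallest first, unsigned before signed) are assumptions, not something decided in these notes.

```python
# Candidate storage types, smallest first: (name, min, max).
STORAGE_TYPES = [
    ("uint8", 0, 2**8 - 1),
    ("int8", -(2**7), 2**7 - 1),
    ("uint16", 0, 2**16 - 1),
    ("int16", -(2**15), 2**15 - 1),
    ("uint32", 0, 2**32 - 1),
    ("int32", -(2**31), 2**31 - 1),
    ("uint64", 0, 2**64 - 1),
    ("int64", -(2**63), 2**63 - 1),
]


def pick_storage(lo: int, hi: int) -> str:
    """Pick the first storage type that can hold every value in lo..hi."""
    if lo > hi:
        raise ValueError("empty range")
    for name, type_min, type_max in STORAGE_TYPES:
        if type_min <= lo and hi <= type_max:
            return name
    return "bignum"  # fall back to the arbitrary-sized representation


print(pick_storage(0, 10))       # the range from the examples above
print(pick_storage(-200, 200))
```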

Print function:

    How simple can it be, without creating confusion/problems or hard-coding things?

        out.print("number: {}", .[123])     # array constructor
        out.print("number: {}", 123)        # safe variant-type var-arg
        out.print "number: {}", 123         # allowing () to be skipped (in some cases)
        out "number: {}", 123               # allowing a default function on objects
        out("number: {}", 123)              # allowing a default function on objects, but without allowing () to be skipped
        # error handling?
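
For the "safe variant-type var-arg" alternative, the formatting itself is small. A Python sketch (not SLUL) is shown below. The function name format_message is made up, and raising an error on a placeholder/argument mismatch is just one possible answer to the error handling question (in SLUL this could perhaps be a compile-time check when the format string is a literal).

```python
def format_message(template: str, *args) -> str:
    """Replace each "{}" in template with the next argument, in order."""
    parts = template.split("{}")
    if len(parts) - 1 != len(args):
        raise ValueError(
            f"{len(parts) - 1} placeholder(s) but {len(args)} argument(s)"
        )
    pieces = [parts[0]]
    for arg, tail in zip(args, parts[1:]):
        pieces.append(str(arg))   # every argument type needs a string form
        pieces.append(tail)
    return "".join(pieces)


print(format_message("number: {}", 123))
```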

    Input streams could also have a default function.
    But it would be limited to only reading e.g. a line.
    (That's probably what iterators should do as well.)
    What should it do on error?

        string s = in()     # reads a line
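
A "default function" on an object corresponds to a callable object in Python. The sketch below (Python, not SLUL) shows the input stream case together with the iterator idea. The class name LineInput is made up, and returning None at end of input is only one possible answer to the "what should it do on error?" question.

```python
import io


class LineInput:
    """Input stream whose default function reads one line."""

    def __init__(self, stream):
        self._stream = stream

    def __call__(self):
        line = self._stream.readline()
        if line == "":
            return None            # assumption: end of input gives None
        return line.rstrip("\n")

    def __iter__(self):
        # Iterating does the same thing as calling repeatedly.
        while (line := self()) is not None:
            yield line


inp = LineInput(io.StringIO("first\nsecond\nthird\n"))
s = inp()                # reads a line
print(s)
for rest in inp:
    print(rest)
```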