notes/slul2.txt




Module header syntax
--------------------


Function syntax
---------------

Constructor/OO syntax
---------------------

No constructor should mean that there is a default constructor where all
uninitialized fields are assigned from a corresponding parameter.

    int x
    int y = 123

    constructor
        int x
    code
        this.x = x
    end
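
As a rough illustration of the default-constructor rule, this is how such a
class might look if it were lowered to C. The names `Point` and `point_init`
are made up for this sketch, and it assumes that fields with an initializer
(`y = 123`) keep it, while uninitialized fields (`x`) become parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lowering of a class with fields "int x" and "int y = 123". */
struct Point {
    int x;  /* no initializer: assigned from a constructor parameter */
    int y;  /* has an initializer: not a parameter */
};

/* The implicit default constructor: one parameter per uninitialized field. */
static void point_init(struct Point *self, int x)
{
    assert(self != NULL);
    self->x = x;    /* corresponds to "this.x = x" */
    self->y = 123;  /* field initializer */
}
```

With this rule, adding an initializer to a field later would remove a
constructor parameter, which is an API change to keep in mind.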

How to denote exported vs module-private vs class-private fields?
How to denote exported vs module-private vs class-private functions?
Should there be "friend classes"? Or "friend namespaces"?
    - Should those allow access from other modules? Probably not?

For record-style classes, it would be nice if it was enough to specify
only the fields, and then have a usable struct.


Type syntax
-----------

Integer types:
- Should they be bit-sized? E.g. int32
- Should there be unsigned integers? E.g. uint/uint32
- Should they be range-limited? E.g. int from 0 to 9, int<0..9>, number from 0 to 9

Maybe "neutral" types like int and string should be avoided,
and more specific types should be defined? At the definition, it could be
possible to specify a range, maxlength or allowed characters.
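
A minimal sketch of what a range-limited type ("int from 0 to 9") could
compile down to, written in C. The names `Digit` and `digit_from_int` are
invented for the example, and it assumes that the range check happens at
conversion time rather than on every use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "int from 0 to 9" type. Wrapping the value in a struct
   prevents it from being mixed up with a plain int by accident. */
typedef struct { int value; } Digit;

enum { DIGIT_MIN = 0, DIGIT_MAX = 9 };

/* Checked conversion: returns false (and leaves *out untouched)
   if n is outside the declared range. */
static bool digit_from_int(int n, Digit *out)
{
    assert(out != NULL);
    if (n < DIGIT_MIN || n > DIGIT_MAX) return false;
    out->value = n;
    return true;
}
```

Once a value has the specific type, code that receives it does not need to
re-check the range.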

Data syntax
-----------

Should the type specification syntax be:

* `int i`
* `i: int`
* `i` i.e. inferred from name
* `const i` i.e. inferred from name, but with explicit definition
    - or e.g. `def i`, `let i`
* `num i`, `string s`, `ref o`, i.e. distinguishing between "major" types
  but inferring more precise types.
* Allow type inference in `given` lines.

As field in arena / At ...?
- Should this be possible?
- Where to put the definitions? A specific place per module?
- Or should there be some kind of dynamic mechanism?
  (but then there could be a risk of name conflicts, unless
   each module has its own namespace)

As field in object / At file-level:

    # Final (only assignable in constructors, must be assigned)
    int x
    # function-specific instance variable
    var int y, only in (f,g)
    # Function-specific, but must be assigned in constructors
    int x2, only in f

As local variable in function / At function-level:

    func f
    code
        # name only
        i = 123
        # type+name
        int i = 123

        # XXX this is probably a bad idea. it hides state!
        #instancevar state 
    end


Statement syntax
----------------


Expression syntax
-----------------

Function calls. Only allowed at the beginning of statements, or at the
beginning of grouping parentheses:

    f
    f arg1
    f arg1, arg2

Should there be methods?
Or should there be per-type identifier scopes?
- That could mean that two places would have to be searched!
- Unless "functions" and "procedures" are different things
  But that would mean different function naming:
    add_number (from_thing t)
  Instead of:
    add_number (get_number t)
  I think there are pros and cons. In some cases one syntax is more readable
  than the other.

    obj.f arg1
    obj.f arg1

The advantage of this is that getters can be called in subexpressions
without using parentheses:

    obj.f  obj2.get_x, obj2.get_y

How should variables/fields(/properties?) be assigned?

    obj.value = 123
    obj.value! = 123
    obj!.value = 123
    set obj.value, 123
    set obj.value!, 123
    set obj!.value, 123
    # with getters/setters only:
    obj.set_value  123
    obj!.set_value  123

There are actually only two types of expressions that can appear at the
statement level:

- function calls
- assignment expressions

Of course, the left-hand side is a subexpression and could be some deep
multi-level expression, e.g. obj.arr[obj.index][2].field

Instance variable access and member function calls
--------------------------------------------------

How to distinguish between instance-variables/locals/parameters/constants?

In some languages that might be:

    this.instancevar
    localvar
    parameter
    CONSTANT

Is there a way to make this/local use different syntax but still avoid the
"this." thing?

- "with" statement like in Pascal:
  (but it also leads to ambiguities)
    with this
        instancevar = 123
    end
- With sigils/prefixes
    .instancevar = 123
    %instancevar = 123
    m_instancevar = 123
    .memberfunc 123
    %memberfunc 123
    memberfunc 123  # maybe all calls without an object should be this-calls?
- With implicitly defined setters
    this.set_instancevar 123
    set_instancevar 123    # with implicit this-calls when there's no object
- With some keyword:
    our instancevar = 123
    # related: typescopes
    our bgcolor = fromtarget green
    our bgcolor = fromtype green

Main block
----------

Make simple programs easy to create:
(the main file could inherit from SlulApp, and that could have functions
such as `writeln`)

    main
        writeln "Hello"
    end

(Disallow defining a constructor in this case?)

File structure
--------------

One-class-per-file can lead to lots of tiny files.
Also, what identifier to use for files in subdirectories?

But it is still really nice to be able to see which file a class is defined in
by just looking at its name.

Maybe an implicit namespace is a good idea?
And there could be some keyword for nested classes.

If files have implicit namespaces, how to define instance fields and 
constructors?

    # should there be interfaces?
    implements Printable

    # must come before any functions or it gives a warning?
    int number
    bool flag

    # Should overloading be allowed?
    # Should constructors have a name or not?
    constructor new
        int number
        bool flag
    code
        ...
    end

    func do_stuff
    ...

Extensibility / API / Object Orientation
----------------------------------------

In most languages, interface types (for dynamic function calls) are
implemented using some pointer (e.g. to some list of vtables).

To make it possible to turn an "interface-less" class into a class
with interfaces, without breaking ABI, it would be necessary to put
the vtable separately.

That could be done by passing the vtable in a "fat pointer" when passing
around a reference to the object. I think that is how Go does it?
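
The fat-pointer idea could be sketched in C roughly as follows. All names
here (`PrintableRef`, `Counter`, `print_any` and so on) are hypothetical.
The point is that `struct Counter` has no vtable field, so its layout stays
the same whether or not it implements an interface.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical interface with a single method. */
struct PrintableVtable {
    int (*print)(const void *self, char *buf, size_t buflen);
};

/* Fat pointer: the vtable travels with the reference, not the object. */
struct PrintableRef {
    const void *obj;
    const struct PrintableVtable *vtable;
};

/* A class that started out "interface-less". Its layout is unchanged. */
struct Counter {
    int number;
};

static int counter_print(const void *self, char *buf, size_t buflen)
{
    const struct Counter *c = self;
    return snprintf(buf, buflen, "Counter(%d)", c->number);
}

static const struct PrintableVtable counter_printable = { counter_print };

/* The vtable is attached at the point where the reference is converted
   to the interface type. */
static struct PrintableRef counter_as_printable(const struct Counter *c)
{
    struct PrintableRef ref = { c, &counter_printable };
    return ref;
}

/* Code that only knows about the interface. */
static int print_any(struct PrintableRef ref, char *buf, size_t buflen)
{
    assert(ref.vtable != NULL);
    return ref.vtable->print(ref.obj, buf, buflen);
}
```

The cost is that interface references are two words instead of one, but
plain references to `Counter` stay single pointers.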

Another (probably worse?) method could be to store the vtable pointer
in an out-of-band structure, such as a ptr_address-to-vtable map in some
thread-local area?
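
For comparison, an out-of-band map might look something like this in C.
This is only a sketch: the `vtmap_*` names are made up, the table has a
fixed size, entries cannot be removed, and it assumes C11 `_Thread_local`
for the thread-local area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VTMAP_SLOTS 64  /* power of two; fixed size to keep the sketch short */

struct VtMapEntry {
    const void *obj;     /* key: object address, NULL means empty slot */
    const void *vtable;  /* value */
};

/* One table per thread (C11 thread-local storage). */
static _Thread_local struct VtMapEntry vtmap[VTMAP_SLOTS];

static size_t vtmap_hash(const void *obj)
{
    /* Drop the low bits, which are usually zero due to alignment. */
    return (size_t)(((uintptr_t)obj >> 4) & (VTMAP_SLOTS - 1));
}

/* Registers a vtable for obj. Returns 0 on success, -1 if the table is full. */
static int vtmap_set(const void *obj, const void *vtable)
{
    size_t i, slot = vtmap_hash(obj);
    assert(obj != NULL);
    for (i = 0; i < VTMAP_SLOTS; i++) {
        struct VtMapEntry *e = &vtmap[(slot + i) & (VTMAP_SLOTS - 1)];
        if (e->obj == NULL || e->obj == obj) {
            e->obj = obj;
            e->vtable = vtable;
            return 0;
        }
    }
    return -1;
}

/* Returns the vtable, or NULL for objects without interfaces. */
static const void *vtmap_get(const void *obj)
{
    size_t i, slot = vtmap_hash(obj);
    for (i = 0; i < VTMAP_SLOTS; i++) {
        const struct VtMapEntry *e = &vtmap[(slot + i) & (VTMAP_SLOTS - 1)];
        if (e->obj == obj) return e->vtable;
        if (e->obj == NULL) return NULL;  /* end of probe sequence */
    }
    return NULL;
}
```

Every dynamic call needs a hash lookup here, and objects passed between
threads would not be found, which supports the "probably worse" guess.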

Or, if arena-chunks are used (it might be better to use thread-local storage
for arena allocation), then the arena chunk header could have such a tree.
Or, it could be a list where each entry has a range (or bitmask) of
addresses that it applies to (this would make objects allocated in a loop
use only one vtable). A downside with this is that the vtables remain
allocated even when not in use.
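
A sketch of the range-list variant in C. The chunk layout and the names
(`ArenaChunk`, `chunk_alloc`, `chunk_vtable_of`) are assumptions for the
example. The chunk is passed explicitly here; a real implementation would
probably find the chunk header from the object address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry covers all objects in [start, end) within the chunk. */
struct VtRange {
    uintptr_t start, end;
    const void *vtable;
};

#define CHUNK_MAX_RANGES 8
#define CHUNK_DATA_SIZE 256

/* Hypothetical arena chunk: header with ranges, followed by object data. */
struct ArenaChunk {
    size_t num_ranges;
    struct VtRange ranges[CHUNK_MAX_RANGES];
    size_t used;
    _Alignas(max_align_t) unsigned char data[CHUNK_DATA_SIZE];
};

/* Allocates an object and records its vtable (NULL = no interfaces).
   Consecutive allocations with the same vtable extend the previous
   range, so objects allocated in a loop share one entry. */
static void *chunk_alloc(struct ArenaChunk *ch, size_t size, const void *vtable)
{
    size_t align = _Alignof(max_align_t);
    size_t rounded;
    unsigned char *p;

    assert(ch != NULL);
    if (size > CHUNK_DATA_SIZE) return NULL;
    rounded = (size + align - 1) / align * align;
    if (rounded > CHUNK_DATA_SIZE - ch->used) return NULL;
    p = ch->data + ch->used;

    if (vtable != NULL) {
        struct VtRange *last =
            ch->num_ranges ? &ch->ranges[ch->num_ranges - 1] : NULL;
        if (last && last->vtable == vtable && last->end == (uintptr_t)p) {
            last->end += rounded;
        } else {
            if (ch->num_ranges == CHUNK_MAX_RANGES) return NULL;
            last = &ch->ranges[ch->num_ranges++];
            last->start = (uintptr_t)p;
            last->end = last->start + rounded;
            last->vtable = vtable;
        }
    }
    ch->used += rounded;
    return p;
}

/* Linear search over the ranges; returns NULL if obj has no vtable. */
static const void *chunk_vtable_of(const struct ArenaChunk *ch, const void *obj)
{
    uintptr_t addr = (uintptr_t)obj;
    size_t i;
    for (i = 0; i < ch->num_ranges; i++) {
        if (addr >= ch->ranges[i].start && addr < ch->ranges[i].end)
            return ch->ranges[i].vtable;
    }
    return NULL;
}
```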

A completely different approach to OO-interfaces
------------------------------------------------

Have functions return a "state" when the OO-interface should be called.
But this would have to be a deep return (like from a co-routine) if
the OO-interface is nested inside an object in a parameter.
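
One way to picture the non-nested case, as a C sketch. The function
`sum_step` wants to ask a "filter" interface about each item, but instead
of calling it, it returns a state and lets the caller do the work and
resume. All names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

enum StepResult { STEP_DONE = 0, STEP_NEED_FILTER = 1 };

/* All state of the operation lives here, so that it can be resumed. */
struct SumState {
    const int *items;
    size_t count, pos;
    int candidate;  /* out: the value the caller should test */
    int accepted;   /* in: the caller's answer for the last candidate */
    int asked;      /* nonzero while a question is outstanding */
    int sum;
};

/* Runs until the "interface" is needed or the work is done. */
static enum StepResult sum_step(struct SumState *s)
{
    assert(s != NULL);
    if (s->asked) {
        /* Resume: consume the caller's answer. */
        if (s->accepted) s->sum += s->candidate;
        s->asked = 0;
        s->pos++;
    }
    if (s->pos == s->count) return STEP_DONE;
    s->candidate = s->items[s->pos];
    s->asked = 1;
    return STEP_NEED_FILTER;
}

/* Caller side: the loop body takes the place of the interface method. */
static int sum_even(const int *items, size_t count)
{
    struct SumState s = { 0 };
    s.items = items;
    s.count = count;
    while (sum_step(&s) == STEP_NEED_FILTER) {
        s.accepted = (s.candidate % 2 == 0);
    }
    return s.sum;
}
```

This only works because `sum_step` is called directly. If it were reached
through several levels of calls, each level would need the same kind of
resumable state, which is the "deep return" problem.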

Can the need for "OO-interfaces" be completely eliminated?

* For "global" interfaces, such as for logging, the function pointer
  could go into either the arena or in thread-local-storage.

* For builtin types such as lists, there could be some sentinel value in
  the "header" of the data structure that indicates whether to use
  indirection via a vtable (e.g. for custom lists/maps).
  But it still needs to be compatible with objects that implement
  both a builtin interface as well as a non-builtin interface (or
  multiple builtin interfaces).
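
The sentinel idea for builtin lists might look like this in C. This is a
sketch under assumptions: the header layout, the choice of the length
field as the sentinel carrier and all names are made up, and it does not
solve the case of objects that implement several interfaces.

```c
#include <assert.h>
#include <stddef.h>

struct ListVtable {
    size_t (*length)(const void *self);
    int (*get)(const void *self, size_t index);
};

/* Sentinel stored in the length field of custom lists. */
#define LIST_CUSTOM ((size_t)-1)

/* Hypothetical header shared by builtin and custom lists. */
struct ListHeader {
    size_t length;  /* LIST_CUSTOM means: go through the vtable */
};

struct BuiltinList {
    struct ListHeader hdr;
    const int *items;
};

struct CustomList {
    struct ListHeader hdr;  /* hdr.length == LIST_CUSTOM */
    const struct ListVtable *vtable;
};

static size_t list_length(const struct ListHeader *l)
{
    if (l->length != LIST_CUSTOM) return l->length;  /* fast path */
    return ((const struct CustomList *)l)->vtable->length(l);
}

static int list_get(const struct ListHeader *l, size_t index)
{
    if (l->length != LIST_CUSTOM) {
        assert(index < l->length);
        return ((const struct BuiltinList *)l)->items[index];
    }
    return ((const struct CustomList *)l)->vtable->get(l, index);
}

/* Example custom list: first, first+1, ... computed on demand. */
struct RangeList {
    struct CustomList base;
    int first;
    size_t count;
};

static size_t range_length(const void *self)
{
    return ((const struct RangeList *)self)->count;
}

static int range_get(const void *self, size_t index)
{
    return ((const struct RangeList *)self)->first + (int)index;
}

static const struct ListVtable range_vtable = { range_length, range_get };
```

Builtin lists pay only for one comparison per operation, and only custom
lists take the indirect call.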

Custom statements
-----------------

It would be really nice to be able to create custom control structures etc.
Problems:

* How to distinguish them in the tokenizer?
    - it would actually have been "trivial" (except for the "else" case)
      if functions had ().
* How to handle following blocks, e.g. "else", "except", "finally", etc.?
* How to implement it internally in an efficient way?
    1. lambdas
    2. an internal loop with switch-case of a return value, e.g.
        int rv;
        CustomCtl state;
        /* loop until custom_ctl_stmt returns 0 (done) */
        while ((rv = custom_ctl_stmt(&state)) != 0) {
            switch (rv) {
            case 1:
                ...
                break;
            case 2:
                ...
                break;
            }
        }
    3. an internal loop, where the return value is a series of states?
       (which has the potential to decrease the number of function calls, but
        on the other hand increases complexity and might increase the
        number of memory accesses,